Use Content Intelligence Services (CIS) to analyze the textual content of documents and find out what the documents are about without having to read them.
By default, CIS analyzes the content of the documents, including the values of the file properties. You can change the default behavior and have CIS analyze the values of the Documentum object attributes in addition to, or instead of, the content of the documents.
CIS performs several types of analysis:
Categorization: By detecting predefined keywords in the document content, categorization identifies the category to which a document belongs. Categories for a subject area are organized in a structure called a taxonomy. Categorization enables you to organize content in a logical and consistent way.
Entity detection: Entity detection relies on Natural Language Processing (NLP). Named entities are detected by performing a semantic analysis of their context. If there is too little context, or if the context is unclear, some entities can go undetected and the results can seem incomplete. You can use entity detection to find named entities, such as the names of people or companies, in documents.
Pattern detection: Some pieces of information always have the same form. Use pattern detection to retrieve this information when it is scattered throughout the text. For example, because email addresses comply with a standard format, they can be extracted using pattern detection. Pattern detection retrieves all pieces of information that match the pattern.
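The following is a minimal sketch of the ideas behind keyword-based categorization and pattern detection. It is an illustration in plain Python and does not use the CIS API. The taxonomy, the function names `categorize` and `find_emails`, and the simplified email pattern are all assumptions made for this example.

```python
import re

# Hypothetical taxonomy: category name -> keywords that signal it.
# A real taxonomy is defined in CIS; this one is illustrative only.
TAXONOMY = {
    "Finance": {"invoice", "budget", "audit"},
    "Legal": {"contract", "clause", "liability"},
}

# Simplified email pattern (an assumption); the full address syntax
# allowed by the standard is broader than this.
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"
)


def categorize(text):
    """Return the categories whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        name for name, keywords in TAXONOMY.items() if words & keywords
    )


def find_emails(text):
    """Return every substring of the text that matches the email pattern."""
    return EMAIL_PATTERN.findall(text)


sample = (
    "The audit of the contract is done. "
    "Contact jane.doe@example.com or bob@example.org."
)
print(categorize(sample))
print(find_emails(sample))
```

Entity detection is not sketched here because it depends on semantic analysis of the surrounding context rather than on fixed keywords or patterns.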