Documents are submitted to the CIS server in batches called document sets. A document set is a collection of documents that are sent to the CIS server together and that the CIS server processes in the same way. Each document set has a definition that identifies the documents belonging to the set and a schedule that determines when or how often those documents are processed.
A document set can also have an associated configuration that defines the analysis to perform (categorization, entity detection, and/or pattern detection). In this case, the analysis results are stored as annotations.
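The parts of a document set described above (definition, schedule, optional analysis configuration) can be pictured with a small data model. This is only an illustrative sketch: the class names, fields, and the `applicable` and `is_scheduled` helpers are hypothetical and are not part of the CIS or Documentum API.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable, Optional


@dataclass
class AnalysisConfig:
    """Hypothetical stand-in for the optional analysis configuration."""
    categorization: bool = False
    entity_detection: bool = False
    pattern_detection: bool = False


@dataclass
class DocumentSet:
    """Hypothetical model of a document set, not a real CIS type."""
    name: str
    # Definition: a predicate that decides whether a document belongs to the set.
    definition: Callable[[dict], bool]
    # Schedule: how often the set runs; None models an inactive schedule.
    interval: Optional[timedelta] = None
    # Optional analysis configuration; None means no analysis is configured.
    config: Optional[AnalysisConfig] = None

    def applicable(self, documents):
        """Return the documents that match the set's definition."""
        return [doc for doc in documents if self.definition(doc)]

    def is_scheduled(self):
        """Return True when the set has an active schedule."""
        return self.interval is not None
```

For example, `DocumentSet(name="contracts", definition=lambda d: d.get("type") == "contract")` models a set with an inactive schedule and no analysis configuration; passing `interval=timedelta(hours=1)` would model an hourly schedule.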
In an xCP environment, document sets are created automatically when you associate discovered metadata with a content model. Each document set is specific to one content model: analyzing a parent content model does not imply the analysis of the content models that inherit from it. The CIS server uses the document set to identify and process all instances of the model. If the content model is modified, the document set is updated during redeployment. If all discovered metadata attributes are removed from the content model, the document set is deleted from the repository. All document sets defined for xCP are automatically scheduled to process content objects at regular intervals. This interval introduces a latency between a change to a content object and the update of its discovered metadata.
For classic categorization, create the document sets manually in Documentum Administrator. By default, the schedule of a document set is inactive; define a schedule to run the document set automatically. Each time the document set runs, it submits only new or revised documents to the CIS server.
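The incremental behavior described above, where each run submits only new or revised documents, can be sketched as a filter on a modification timestamp. How the CIS server actually tracks changes is not specified here, so the `modified` field, the `last_run` argument, and the `select_for_submission` function are assumptions made purely for illustration.

```python
from datetime import datetime


def select_for_submission(documents, last_run):
    """Return the documents that are new or revised since the previous run.

    `documents` is a list of dicts with a hypothetical "modified" datetime.
    `last_run` is the time of the previous run, or None for a first run.
    """
    if last_run is None:
        # First run: nothing has been submitted yet, so submit everything.
        return list(documents)
    # Later runs: keep only documents changed after the previous run.
    return [doc for doc in documents if doc["modified"] > last_run]
```

With two documents modified on 1 January and 3 January 2024, a first run (`last_run=None`) selects both, while a run with `last_run=datetime(2024, 1, 2)` selects only the second.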