Last Updated: May 27, 2021
In Document Tables, use the Field Type Singleline or Multiline for fields extracted from the document, so that any special characters in those fields are handled by the data type.
Ensure that validations for the required fields are designed in the Post Processing task.
The status of the record is updated to MANUAL_INTERVENTION_VALIDATION_FAILED only if the designed validations are not met; the document can then be re-familiarized.
When a document is processed for an existing category and no validations are designed for the extracted fields, the status of the processed document is updated to EXTRACTED_SUCCESSFULLY even if a field is blank or does not meet the required validation rules (e.g. PO Number should be numerical only). In that case, you will not be able to re-familiarize the document.
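The effect of designing (or omitting) a validation can be illustrated with a small sketch. This is not the product's Post Processing API; the function names `validate_po_number` and `record_status` are hypothetical, and only the two status values and the "PO Number should be numerical only" rule come from the text above.

```python
import re

# Status values quoted from the text; everything else here is an illustrative assumption.
VALIDATION_FAILED = "MANUAL_INTERVENTION_VALIDATION_FAILED"
EXTRACTED = "EXTRACTED_SUCCESSFULLY"


def validate_po_number(value):
    """Hypothetical rule: the PO Number must be non-blank and digits only."""
    return bool(value) and re.fullmatch(r"[0-9]+", value.strip()) is not None


def record_status(fields, validators):
    """Return the failed status if any designed validation is not met.

    With an empty `validators` mapping nothing is checked, so even a blank
    field ends up as EXTRACTED_SUCCESSFULLY, mirroring the behaviour described above.
    """
    for name, check in validators.items():
        if not check(fields.get(name, "")):
            return VALIDATION_FAILED
    return EXTRACTED


blank_record = {"PO Number": ""}
print(record_status(blank_record, {"PO Number": validate_po_number}))
print(record_status(blank_record, {}))
```

With the validation in place, the blank PO Number is flagged for manual intervention; without it, the same record passes as successfully extracted.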
Add all possible labels for the extracted field that may occur in that category of the document as Pseudonyms so that the field is extracted for any of those labels.
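As a conceptual illustration of why a complete Pseudonyms list matters, the sketch below maps label variants to a single field. It does not describe how the product matches labels internally; `normalize`, `resolve_field`, and the sample labels are all hypothetical.

```python
def normalize(label):
    """Lower-case the label and drop colons, periods, and extra spaces."""
    cleaned = label.lower().replace(":", " ").replace(".", " ")
    return " ".join(cleaned.split())


# Hypothetical pseudonym list for one field in one document category.
PSEUDONYMS = {
    "PO Number": ["PO Number", "PO No.", "Purchase Order Number", "Purchase Order #"],
}


def resolve_field(label, pseudonyms):
    """Return the field whose pseudonym list contains the label, else None."""
    wanted = normalize(label)
    for field, labels in pseudonyms.items():
        if wanted in {normalize(candidate) for candidate in labels}:
            return field
    return None


print(resolve_field("PO No.:", PSEUDONYMS))
print(resolve_field("Order Ref", PSEUDONYMS))
```

A label that is missing from the list ("Order Ref" here) resolves to nothing, which is the situation the tip above is meant to prevent.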
When the document received uses the wrong template, choose the Manual Review option and skip training the ML model, so that the document is not considered in category identification.
When familiarizing the table in the document ensure that you
If the document being processed contains any handwritten text, use the Hand Written Text Extraction Node with the required mode to handle it as per the document processing requirements.