
Last updated: October 10, 2025
5 minutes
The world of OCR (Optical Character Recognition) and IDP (Intelligent Document Processing) is changing rapidly. For many, this technical vocabulary may seem complex, even though it is at the heart of modern document automation. This glossary presents 25 key definitions, ranging from the basics of OCR to advanced artificial intelligence building blocks, to help you better navigate the world of intelligent document management.
Understand the essentials of OCR and document automation through clear definitions, comparisons, and best practices that help you speed up your workflows and make your processes more reliable.
OCR is the technology that makes it possible to convert text found in an image or PDF into usable digital data.
For example, it can automatically extract the number from an invoice or the expiration date of an identity card. OCR is the fundamental building block of document automation, because it makes information “readable” by a computer.
Handwritten character recognition (HCR) is a technology dedicated to recognizing isolated handwritten characters. It is found, for example, in administrative or banking forms where you are asked to write in capital letters, one letter per box. This approach is reliable in highly structured environments, but it is limited when it comes to cursive writing or whole sentences.
ICR (Intelligent Character Recognition) is a more advanced evolution of HCR. It uses machine learning algorithms to recognize more complex types of writing, whether cursive or free-form handwriting. Unlike HCR, it can learn and improve through human corrections. For example, it is used to read handwritten notes, medical prescriptions, or annotations on invoices.
OMR (optical mark recognition) is a technology that detects the presence of visual marks on a document, such as checked boxes or filled circles. It is used in multiple-choice questionnaires, paper surveys, and some attendance sheets.
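As a minimal sketch of the idea, mark detection can be reduced to measuring how much of a box's interior is dark. The function names, the 0/1 pixel grids, and the 0.3 threshold below are illustrative assumptions, not the behavior of any particular OMR engine.

```python
def fill_ratio(region):
    """Fraction of dark pixels in a region given as rows of 0 (light) or 1 (dark)."""
    total = sum(len(row) for row in region)
    if total == 0:
        raise ValueError("region must contain at least one pixel")
    dark = sum(sum(row) for row in region)
    return dark / total


def is_marked(region, threshold=0.3):
    """Treat the box as checked when enough of its interior is dark.

    The 0.3 threshold is a hypothetical value chosen for this example.
    """
    return fill_ratio(region) >= threshold


# Interior of a box containing a cross (8 dark pixels out of 16).
ticked_box = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]

# Interior of an empty box with one speck of scanner noise.
empty_box = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

print(is_marked(ticked_box), is_marked(empty_box))
```

Real scans would first need the box located and binarized; the threshold is what keeps stray noise from being read as a mark.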
Computer vision is a field of artificial intelligence that allows machines to understand and analyze images and videos. It is the basis of many OCR applications, since it makes it possible to identify the structure of a document, to identify text areas or to differentiate text, tables and images.
DPI (dots per inch) measures the resolution of a scanned image. The higher the value, the more detail the image contains, which improves OCR accuracy.
In practice, a 300 DPI scan is often recommended for invoices or identity documents in order to obtain reliable extractions.
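To make the resolution figure concrete, the short sketch below converts a physical page size into pixel dimensions. The function name and the A4 example are assumptions added for illustration.

```python
MM_PER_INCH = 25.4


def pixels_at_dpi(width_mm: float, height_mm: float, dpi: int) -> tuple[int, int]:
    """Return the (width, height) in pixels of a page scanned at the given DPI."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


# An A4 page measures 210 mm x 297 mm.
a4_300 = pixels_at_dpi(210, 297, 300)
a4_150 = pixels_at_dpi(210, 297, 150)
print(a4_300, a4_150)
```

Halving the DPI halves each dimension, so the scan holds only a quarter of the pixels, which is why small print degrades quickly at low resolutions.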
When a document is scanned crooked, the text lines are slanted, which reduces the quality of the extraction. Deskewing consists of automatically straightening the document so that OCR can work on properly aligned text. This pre-processing step is essential to avoid reading errors.
CER (character error rate) is an indicator that measures the rate of recognition errors at the character level. For example, if an OCR engine regularly mistakes the uppercase “O” for the digit “0”, the CER increases. The lower this indicator, the better the performance of the system.
WER (word error rate) works like CER, but at the level of whole words. It is often used to assess the quality of the transcription of a document or audio file. In professional use, a low WER is essential to guarantee reliable and usable extractions.
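Both metrics are commonly computed as an edit distance divided by the length of the reference text. The sketch below assumes that definition (Levenshtein distance over characters for CER, over whitespace-separated words for WER); the function names and sample strings are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalized by the number of reference units."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def cer(reference: str, hypothesis: str) -> float:
    return error_rate(list(reference), list(hypothesis))


def wer(reference: str, hypothesis: str) -> float:
    return error_rate(reference.split(), hypothesis.split())


reference = "total amount due 1200 EUR"
hypothesis = "total amount due 12O0 EUR"  # the OCR read a letter O instead of a zero

print(cer(reference, hypothesis), wer(reference, hypothesis))
```

Here a single wrong character out of 25 gives a CER of 4%, but it spoils one word out of five, so the WER is 20%. This is why the two metrics are usually read together.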
The confidence score is a value assigned by an OCR engine to estimate the reliability of the recognition of a character, word, or field. For example, if a “Total incl. tax” field is extracted with 98% confidence, it is most likely correct.
The confidence threshold is the minimum value at which extracted data is considered acceptable. Below this threshold, the system may request a manual check. This makes it possible to combine automation and quality control.
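A minimal sketch of this routing logic might look as follows. The `ExtractedField` class, the field names, and the 0.90 threshold are hypothetical and would depend on the engine and the business context.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # between 0.0 and 1.0, as reported by the OCR engine


def route_fields(fields, threshold=0.90):
    """Split fields into auto-accepted ones and ones that need a manual check."""
    accepted, to_review = [], []
    for field in fields:
        if field.confidence >= threshold:
            accepted.append(field)
        else:
            to_review.append(field)
    return accepted, to_review


fields = [
    ExtractedField("invoice_number", "F-2025-0042", 0.98),
    ExtractedField("total_incl_tax", "1200.00", 0.98),
    ExtractedField("due_date", "2025-11-3O", 0.61),  # low score: likely misread
]

accepted, to_review = route_fields(fields)
print([f.name for f in accepted], [f.name for f in to_review])
```

Raising the threshold sends more fields to manual review and lowers the risk of silent errors; lowering it does the opposite.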
Parsing is the process of analyzing a text in order to structure it and extract usable elements. In the context of OCR, this can mean identifying an amount in an invoice or a date in a contract, even if the format of the document varies.
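As a simple illustration, the sketch below parses a date and a total from raw OCR text using regular expressions. The patterns, the function names, and the assumption that slash-separated dates are day-first are choices made for this example; production parsers typically handle many more formats.

```python
import re
from datetime import date

# Each pattern is paired with the order in which its groups appear.
DATE_PATTERNS = [
    (re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b"), ("y", "m", "d")),  # 2025-10-03
    (re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b"), ("d", "m", "y")),  # 03/10/2025
]

# Matches "Total: 1250,50" or "TOTAL 980.00" (no thousands separators).
TOTAL_RE = re.compile(r"total\s*:?\s*(\d+(?:[.,]\d{1,2})?)", re.IGNORECASE)


def parse_date(text):
    """Return the first date found in the text, or None."""
    for pattern, order in DATE_PATTERNS:
        match = pattern.search(text)
        if match:
            parts = dict(zip(order, map(int, match.groups())))
            return date(parts["y"], parts["m"], parts["d"])
    return None


def parse_total(text):
    """Return the total as a float, accepting a comma or a dot as decimal mark."""
    match = TOTAL_RE.search(text)
    if not match:
        return None
    return float(match.group(1).replace(",", "."))


print(parse_date("Invoice date: 03/10/2025"), parse_total("TOTAL: 1250,50 EUR"))
```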
Fuzzy matching allows you to compare two character strings even if they do not match exactly. For example, “Société Générale” and “Societe Generale” will be recognized as equivalent despite differences in accents or letter case. This approach is widely used for bank data reconciliation or KYC.
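One possible way to implement this, sketched below with Python's standard library, is to normalize accents and case first and then compute a similarity ratio. The function names and the 0.85 threshold are assumptions for illustration; dedicated libraries offer other scoring methods.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Strip accents, ignore case, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.casefold().split())


def similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """The 0.85 threshold is a hypothetical value to tune per use case."""
    return similarity(a, b) >= threshold


print(is_match("Société Générale", "SOCIETE GENERALE"))   # accents and case only
print(is_match("Société Générale", "Societe Generalle"))  # small typo
print(is_match("Société Générale", "BNP Paribas"))        # different name
```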
Tokens are the basic units of text, obtained by breaking it down into words, subwords, or characters. Tokenization is a preliminary step in NLP that allows language to be processed in a more structured form.
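The sketch below shows word-level and character-level tokenization with a simple regular expression. It is a simplified assumption of how a tokenizer might work; subword tokenizers used by modern models rely on a learned vocabulary and are not reproduced here.

```python
import re


def word_tokens(text: str) -> list[str]:
    """Split into runs of word characters and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def char_tokens(text: str) -> list[str]:
    """Split into individual characters."""
    return list(text)


print(word_tokens("Total due: 1200 EUR."))
print(char_tokens("OCR"))
```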
Lemmatization consists of reducing a word to its base form (the lemma). For example, “ran” and “will run” both reduce to “run.” This allows AI systems to better understand the general meaning of a text without being thrown off by grammatical variations.
Word embedding is a technique that turns words into numerical vectors. These representations allow machines to capture relationships between words, such as the proximity between “bill” and “payment.” Embeddings are used in modern NLP models to improve contextual understanding.
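Proximity between embeddings is often measured with cosine similarity. In the sketch below, the three-dimensional vectors are made-up toy values chosen only to illustrate the calculation; real embeddings are learned by a model and have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, not taken from any trained model.
toy_vectors = {
    "bill": [0.9, 0.8, 0.1],
    "payment": [0.8, 0.9, 0.2],
    "giraffe": [0.1, 0.0, 0.9],
}

related = cosine_similarity(toy_vectors["bill"], toy_vectors["payment"])
unrelated = cosine_similarity(toy_vectors["bill"], toy_vectors["giraffe"])
print(round(related, 2), round(unrelated, 2))
```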
IDP is an approach that combines OCR, AI, and NLP to extract, classify, and validate data from complex documents. Unlike OCR alone, it integrates business logic (for example, verifying that an invoice contains a valid VAT number) and allows large volumes of documents to be processed automatically.
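As an example of such a business rule, the sketch below checks the structure and check digits of a French VAT number. It assumes the common numeric-key form (`FR`, a two-digit key, then the nine-digit SIREN) and only covers that case; other countries use different rules, and a format check does not confirm that the number is actually registered.

```python
import re

FR_VAT_RE = re.compile(r"^FR(\d{2})(\d{9})$")


def is_valid_fr_vat(vat_number: str) -> bool:
    """Check format and key of a French VAT number with a numeric key."""
    compact = re.sub(r"[\s.]", "", vat_number).upper()
    match = FR_VAT_RE.match(compact)
    if not match:
        return False
    key = int(match.group(1))
    siren = int(match.group(2))
    # The key is derived from the SIREN: (12 + 3 * (SIREN mod 97)) mod 97.
    return key == (12 + 3 * (siren % 97)) % 97


print(is_valid_fr_vat("FR 40 303 265 045"))  # well-formed, key matches
print(is_valid_fr_vat("FR41303265045"))      # wrong key, e.g. an OCR misread
```

A failed check like the second one is typically a signal to route the document to manual review instead of rejecting it outright.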
The human-in-the-loop (HITL) approach involves including human intervention in an automated process to correct or validate certain data. It is particularly useful when OCR encounters poor-quality or atypical documents.
STP (straight-through processing) refers to fully automated processing, without any human intervention. It is highly sought after in financial processes (for example, automatic validation of a correctly formatted supplier invoice).
RPA (robotic process automation) allows you to automate repetitive tasks using software robots. Combined with OCR and IDP, it can automate entire workflows: receipt of invoices, extraction, entry into the ERP, then automatic archiving.
Machine learning is a branch of AI that allows a system to learn from data and improve its performance over time. In OCR, it is used to improve character recognition or to adapt extraction to new document formats.
Deep learning is a subset of machine learning based on deep neural networks. It is particularly effective for complex tasks such as image recognition, the reading of handwritten texts or the contextual understanding of documents.
To better understand the differences between these two approaches, check out our article on Machine Learning vs Deep Learning.
NLP includes techniques that allow machines to understand and analyze human language. Combined with OCR, it makes it possible to extract meaning from unstructured documents such as contracts or emails.
Named entity recognition (NER) is an NLP technique that identifies specific elements in a text: names of people, dates, amounts, account numbers, etc. It is a key feature for automating KYC verification and regulatory compliance.
LLMs (large language models) are AI models trained on huge volumes of text.
They are able to understand, summarize, or generate natural language. In IDP, they provide an additional layer of intelligence, for example by making it possible to contextualize an extraction or to check the consistency of a document.