
Advanced OCR/OCV Algorithms: Tackling Diverse Fonts and Handwriting

Advanced OCR (Optical Character Recognition) and OCV (Optical Character Verification) algorithms must be robust enough to handle various font types. A much more challenging task is recognizing human handwriting, which, unlike machine-printed text, is not standardized.

OCR is a general term for methods that convert raster images of machine-printed or handwritten text into electronic information. It is a rapidly developing branch of optical image analysis, and new libraries and ready-made software packages keep appearing. Current OCR software reads scans of printed text and camera photos reliably, and it also handles characters unique to particular languages. The biggest remaining challenge, although it is increasingly being overcome, is handwriting recognition. Machine-printed text uses standardized fonts, whereas every person’s handwriting is different, so the algorithms must cope with far more variation.

OCR and OCV technologies differ by just one word. The latter stands for “Optical Character Verification.” While OCR software is used to read a text string, even from distorted or damaged images, OCV checks the correctness of information, as well as the quality and readability of the text. This method is most commonly used to confirm whether printed codes, dates, and serial numbers are sufficiently clear and legible for the user.
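The difference between reading and verifying can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CharRead` record, the confidence scale from 0 to 1, and the 0.8 threshold are hypothetical and do not come from any particular OCR product.

```python
from dataclasses import dataclass


@dataclass
class CharRead:
    char: str          # character returned by the OCR step
    confidence: float  # hypothetical recognition score in [0, 1]


def verify_print(reads, expected, min_confidence=0.8):
    """OCV-style check: is the content right AND is every character legible?"""
    reasons = []
    text = "".join(r.char for r in reads)
    if text != expected:
        reasons.append(f"content mismatch: read {text!r}, expected {expected!r}")
    for index, r in enumerate(reads):
        if r.confidence < min_confidence:
            reasons.append(f"character {index} ({r.char!r}) below quality threshold")
    return (not reasons, reasons)


# The string is correct, but the last character is printed too faintly.
faint = [CharRead("L", 0.95), CharRead("0", 0.90), CharRead("T", 0.60)]
ok, reasons = verify_print(faint, "L0T")
```

Here OCR alone would accept the code, because the text was read correctly, while the verification step rejects it because one character falls below the assumed legibility threshold.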

Even recognizing scanned documents or PDF files is not an easy task, yet OCR systems are very often used on images acquired by cameras. This requires additional processing that makes the algorithms robust to defects introduced during acquisition. Camera images often suffer from uneven illumination (e.g., darker corners), distortion, or noise caused by inadequate lighting. The software must therefore first compensate for these acquisition defects and only then recognize the characters.
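One common way to compensate for uneven illumination is flat-field correction. The sketch below assumes that a reference image of a blank, uniform surface (`flat`) was captured with the same camera and lighting; that assumption, the function name, and the 8-bit value range are illustrative choices, not something prescribed by the article.

```python
def flat_field_correct(image, flat):
    """Divide out the illumination profile recorded in `flat`.

    `image` and `flat` are equally sized lists of rows with 0-255 gray values.
    """
    total = sum(sum(row) for row in flat)
    count = sum(len(row) for row in flat)
    mean_flat = total / count
    corrected = []
    for img_row, flat_row in zip(image, flat):
        corrected.append([
            # max(f, 1) avoids division by zero in completely dark pixels
            min(255, round(p * mean_flat / max(f, 1)))
            for p, f in zip(img_row, flat_row)
        ])
    return corrected


# A dark corner (reference 100) and a bright center (reference 200).
flat = [[100, 200]]
image = [[50, 100]]  # the same ink, seen under different illumination
corrected = flat_field_correct(image, flat)
```

After the correction both pixels have the same gray value, so a single threshold can separate ink from background across the whole image.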


Industrial OCR Systems

Industrial character recognition systems are primarily used for interpreting and locating various serial numbers on packaging, checking the legibility of printed dates, comparing read character strings with expected ones, and calculating statistics. With minor modifications, OCR vision systems can also be used for tasks like reading vehicle license plates. Although many methods are employed for character recognition, not all of them can be used in industry. This is because this field uniquely demands reliability and the ability to distinguish characters from other objects, which cannot be guaranteed by all available solutions.


Diagram of a vision system performing optical character recognition.

Industrial character recognition can be divided into 5 main stages:

  • Image Acquisition: This is a crucial stage in visual text processing, because the quality of the captured image determines how much work the algorithms must do later. Any distortions and defects that arise at this stage have to be compensated for by the software; otherwise, in extreme cases, the text may be read with errors or not at all. Images are most often acquired with a monochrome camera, because color rarely carries useful information and only some algorithms can process colored text. The image must also provide enough pixels per character. A very high resolution, on the other hand, can increase computation time, because the image then requires additional scaling.
  • Pre-processing: This involves all kinds of operations aimed at sharpening, filtering, and normalizing the image. Underexposed or noisy images can be dealt with at the pre-processing stage. Image binarization is also performed here. After these operations, the expected output is clear text on a uniform, white background.
  • Character Segmentation: Algorithms recognize each character individually, so to make this possible, the input image must be divided into segments, each containing a single element. First, the image is divided into text, tables, graphics, etc., then the text is divided into paragraphs and character strings, and finally into individual characters. It is important that each image fragment is assigned information about its location within the overall structure. This will enable later reassembly of the text.
  • Character Recognition: This is the most important stage of the entire process. The algorithms used vary between software manufacturers, but two main approaches can be distinguished:
    • Pattern Recognition: The algorithm compares each individual character to patterns stored in memory and makes a decision based on similarity.
    • Feature Recognition: Characteristic structures within characters are searched for, and depending on the algorithm, characters can be broken down into individual elements, with recognition performed based on their analysis.
  • Post-processing: The reassembly of individual characters into a complete text. Here, algorithms also check if the work has been performed correctly. Features used to determine correctness might include spelling, characters at the beginning and end, or whether all characters have been recognized. If any irregularities occur, the algorithm reports an error. The most undesirable situation is when incorrect data is treated as correct.
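
The stages after acquisition can be sketched end to end in plain Python. This is a deliberately minimal illustration, not an industrial implementation: it assumes a single text line, characters separated by at least one empty column, a simple midpoint threshold for binarization, and pattern recognition by pixel-wise comparison with same-sized templates. All function names, the tiny `TEMPLATES` set, and the 0.9 score limit are hypothetical.

```python
def binarize(gray):
    """Pre-processing: global threshold halfway between darkest and brightest pixel (1 = ink)."""
    values = [p for row in gray for p in row]
    threshold = (min(values) + max(values)) / 2
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def segment_columns(binary):
    """Segmentation: split one text line into (start, end) column ranges that contain ink."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def match_template(glyph, templates):
    """Recognition: return (label, score), where score is the fraction of agreeing pixels."""
    best_label, best_score = None, 0.0
    for label, tmpl in templates.items():
        if len(tmpl) != len(glyph) or len(tmpl[0]) != len(glyph[0]):
            continue  # this sketch does not rescale glyphs
        agree = sum(g == t for g_row, t_row in zip(glyph, tmpl)
                    for g, t in zip(g_row, t_row))
        score = agree / (len(tmpl) * len(tmpl[0]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


def read_line(gray, templates, min_score=0.9):
    """Run all stages; report an error rather than return doubtful text."""
    binary = binarize(gray)
    located = []
    for start, end in segment_columns(binary):
        glyph = [row[start:end] for row in binary]
        label, score = match_template(glyph, templates)
        if label is None or score < min_score:
            raise ValueError(f"unrecognized character at columns {start}-{end}")
        located.append((start, label))  # keep the position for reassembly
    # Post-processing: reassemble characters in their original order.
    return "".join(label for _, label in sorted(located))


TEMPLATES = {
    "L": [[1, 0], [1, 0], [1, 1]],
    "I": [[1], [1], [1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
INK, BG = 20, 230  # example gray levels
pattern = [
    [1, 0, 0, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1, 0],
]
gray = [[INK if p else BG for p in row] for row in pattern]
text = read_line(gray, TEMPLATES)
```

Raising an error for a low-scoring character reflects the point made in the post-processing stage: a rejected read is preferable to incorrect data that is treated as correct.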

Neural Networks in OCR

Algorithms utilizing neural networks have burst into the world of technology. Their main advantage is dynamic learning without the need for programmer intervention, which greatly simplifies the entire operation. This approach has already made many solutions more flexible and, above all, more reliable. A large part of OCR/OCV algorithms base their operation on this mechanism. This allows the software to constantly “learn” from observed images and thereby increase its effectiveness.
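
The idea of learning from examples instead of hand-written rules can be illustrated with a single-layer perceptron, which is far simpler than the networks used in real OCR products. The 3×3 glyphs, the labels, and the training settings below are assumptions chosen only to keep the sketch small.

```python
def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Learn weights from (pixels, target) pairs; target is 0 or 1."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, target in samples:
            activation = bias + sum(w * p for w, p in zip(weights, pixels))
            predicted = 1 if activation > 0 else 0
            error = target - predicted
            if error:
                # Shift the weights toward the correct answer for this example.
                weights = [w + learning_rate * error * p
                           for w, p in zip(weights, pixels)]
                bias += learning_rate * error
    return weights, bias


def predict(model, pixels):
    weights, bias = model
    return 1 if bias + sum(w * p for w, p in zip(weights, pixels)) > 0 else 0


# Flattened 3x3 glyphs: label 1 = "T", label 0 = "I".
GLYPH_T = [1, 1, 1, 0, 1, 0, 0, 1, 0]
GLYPH_I = [0, 1, 0, 0, 1, 0, 0, 1, 0]
model = train_perceptron([(GLYPH_T, 1), (GLYPH_I, 0)])

# A "T" with one missing pixel at the bottom is still classified as "T".
damaged_t = [1, 1, 1, 0, 1, 0, 0, 0, 0]
label = predict(model, damaged_t)
```

Nobody told the program which pixels distinguish the two characters; the weights were derived from the labeled examples, which is the property the paragraph above describes.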

Applications

OCR and OCV are very common tasks for vision systems, both in industry and in other fields. The use best known to the public is software that converts PDFs and scanned documents into editable text. In such applications, the OCR algorithm has to recognize whatever appears in the input image, so the software must be ready for any situation: lowercase and uppercase letters, different fonts, tables, etc. Industrial applications are different: the vision system is usually configured for one specific operation. Examples include reading dates from packaging, checking the correctness of printed inscriptions, or digitizing shipping addresses; such systems are used successfully in the pharmaceutical, automotive, electronics, and food industries, among many others. Restricting the system to a single, well-defined operation allows for better optimization and higher reliability.
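
Because an industrial system knows in advance what kind of text it expects, the result can be checked against that expectation. The following sketch does this for a date read from packaging; the `DD.MM.YYYY` format, the function name, and the rule that the expiry date must lie after the production day are assumptions made for illustration.

```python
from datetime import date, datetime


def check_expiry(text, production_day, fmt="%d.%m.%Y"):
    """Classify an OCR result for a printed expiry date."""
    try:
        expiry = datetime.strptime(text, fmt).date()
    except ValueError:
        return "unreadable"   # wrong characters or an impossible calendar date
    if expiry <= production_day:
        return "implausible"  # parses, but cannot be a valid expiry date
    return "ok"


today = date(2025, 1, 10)
results = [
    check_expiry("31.12.2026", today),
    check_expiry("3I.12.2026", today),  # letter "I" read instead of digit "1"
    check_expiry("31.02.2026", today),  # February has no 31st day
    check_expiry("01.01.2020", today),
]
```

A general-purpose document converter cannot apply such a check, because it has no prior knowledge of what the text should contain.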

Machine vision with OCR is also used for digitizing library collections and for entering business documents and business cards into computer systems. Another interesting application is automatic license plate recognition (ALPR), which can be used to process images from speed cameras. The camera takes a picture of the vehicle and sends it to a central unit, where the OCR algorithm reads the license plate, which is then used to identify the car owner. This is a good example of a distributed machine vision system.


OCR/OCV technologies are yet another example of the many uses of machine vision. This confirms the thesis about the infinite possibilities of utilizing what might seem like a simple setup, consisting of a camera, a computer, and software. Avicon, as an integrator and distributor of vision systems, undertakes tasks in the field of optical character recognition and optical verification. Our machine vision solutions are found in many factories and laboratories in Poland and abroad.