Sunday, September 22, 2024

Revolutionizing Text Extraction and Analysis with TestComplete’s OCR Feature

Optical Character Recognition:

Optical Character Recognition (OCR) is the process that converts an image of text into a machine-readable text format. For example, if you scan a form or a receipt, your computer saves the scan as an image file. You cannot use a text editor to edit, search, or count the words in the image file. However, you can use OCR to convert the image into a text document with its contents stored as text data.

Methods of OCR:

Image acquisition:

A scanner reads documents and converts them to binary data. The OCR software analyses the scanned image and classifies the light areas as background and the dark areas as text.
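To make the light/dark classification concrete, here is a minimal sketch in plain Python. It assumes the scan is already available as rows of grayscale values (0 is black, 255 is white) and uses a fixed threshold of 128; both the function name `binarize` and the threshold are illustrative choices, not something a particular OCR product prescribes.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Mark dark pixels as text (1) and light pixels as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# A tiny 3x4 grayscale patch: low values are dark ink, high values are paper.
patch = [
    [250, 30, 240, 245],
    [248, 25, 20, 251],
    [252, 28, 249, 247],
]

mask = binarize(patch)
for row in mask:
    print(row)
```

Real engines usually pick the threshold adaptively instead of hard-coding it, since lighting and paper colour vary from scan to scan.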

Preprocessing:

The OCR software first cleans the image and removes errors to prepare it for reading. These are some of its cleaning techniques:

  • Deskewing, or tilting the scanned document slightly to fix alignment issues introduced during the scan.
  • Despeckling, or removing digital image spots and smoothing the edges of text images.
  • Cleaning up boxes and lines in the image.
  • Script recognition for multi-language OCR technology.
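As one example of these cleaning steps, despeckling can be sketched as follows. This is a simplified illustration that assumes a binary mask (1 for dark pixels, 0 for background) and treats any dark pixel with no dark neighbour as noise; the function name `despeckle` is hypothetical, and production OCR software uses more robust filters.

```python
from typing import List


def despeckle(mask: List[List[int]]) -> List[List[int]]:
    """Clear dark pixels that have no dark pixel among their 8 neighbours."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    cleaned = [row[:] for row in mask]  # work on a copy, keep the input intact
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1:
                continue
            has_dark_neighbour = any(
                mask[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_dark_neighbour:
                cleaned[y][x] = 0
    return cleaned


# The pixel in the top-left corner is an isolated speck; the vertical
# two-pixel stroke is kept because its pixels touch each other.
noisy = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
clean = despeckle(noisy)
```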

Text recognition:

OCR software uses two main types of algorithms for text recognition: pattern matching and feature extraction.

Pattern matching:

Pattern matching works by isolating a character image, called a glyph, and comparing it with a stored glyph. Pattern matching works only if the stored glyph has a similar font and scale to the input glyph. This method works well with scanned images of documents that have been typed in a known font.
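The idea can be sketched in a few lines of Python. This is a toy illustration that assumes glyphs are tiny 3x3 binary bitmaps of identical size and scores them by the fraction of matching pixels; the templates and the names `match_score` and `best_template` are made up for the example.

```python
from typing import Dict, List

Glyph = List[List[int]]


def match_score(a: Glyph, b: Glyph) -> float:
    """Fraction of pixels on which two equally sized glyph bitmaps agree."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("glyphs must have the same dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        raise ValueError("glyphs must not be empty")
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def best_template(glyph: Glyph, templates: Dict[str, Glyph]) -> str:
    """Return the name of the stored glyph that agrees with the input most."""
    return max(templates, key=lambda name: match_score(glyph, templates[name]))


# Hypothetical stored glyphs for three letters in one font and scale.
TEMPLATES: Dict[str, Glyph] = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# A "T" with one stray dark pixel in the bottom-right corner.
noisy_t = [[1, 1, 1], [0, 1, 0], [0, 1, 1]]
print(best_template(noisy_t, TEMPLATES))
```

Because the comparison is pixel by pixel, the sketch also shows why the method depends on matching font and scale: a glyph of a different size cannot even be compared without resampling it first.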

Feature extraction:

Feature extraction breaks down or decomposes the glyphs into features such as lines, closed loops, line direction, and line intersections. It then uses these features to find the best match or the nearest neighbor among its various stored glyphs.
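A nearest-neighbor lookup over such features might look like the sketch below. The feature vectors are hand-picked, illustrative values (closed loops, line endings, intersections) and not measurements from a real engine, which would compute many more features from the glyph image itself; `nearest_glyph` is a hypothetical helper name.

```python
import math
from typing import Dict, Tuple

# (closed loops, line endings, intersections)
Features = Tuple[int, int, int]

# Illustrative feature vectors for a handful of stored glyphs.
STORED_GLYPHS: Dict[str, Features] = {
    "O": (1, 0, 0),
    "T": (0, 3, 1),
    "X": (0, 4, 1),
    "A": (1, 2, 2),
    "8": (2, 0, 1),
}


def nearest_glyph(features: Features,
                  stored: Dict[str, Features] = STORED_GLYPHS) -> str:
    """Return the stored glyph whose features are closest (Euclidean)."""
    return min(stored, key=lambda name: math.dist(features, stored[name]))


print(nearest_glyph((0, 3, 1)))  # features identical to the stored "T"
print(nearest_glyph((2, 1, 1)))  # two loops plus a stray line ending
```

Unlike pixel-level pattern matching, this comparison does not depend on the exact size of the glyph, which is why feature extraction copes better with unfamiliar fonts.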

Postprocessing:

After analysis, the system converts the extracted text data into a computerized file. Some OCR systems can create annotated PDF files that include both the before and after versions of the scanned document.

OCR in TestComplete:

Installation and Setup:

TestComplete comes with built-in OCR capabilities, requiring no separate installation. Once TestComplete is installed, the OCR feature is readily available without any additional setup.

Operation of TestComplete:

Typically, TestComplete identifies windows and controls by attributes such as class names, captions, and IDs. However, it cannot access the properties of some controls, especially when graphical elements such as bitmaps or charts are rendered directly on the screen. TestComplete reaches those elements through OCR. TestComplete's support for optical character recognition is implemented with the Google Vision API.

Vision API:

The Vision API can detect and extract text from images. It has two features that support OCR.

1. Text detection: detects and extracts text from any image. For example, a photograph might contain a street sign or a traffic sign. The JSON response includes the entire extracted string, as well as the individual words and their bounding boxes.

2. Document text detection: also extracts text from an image, but the response is optimized for dense text and documents. The JSON response includes page, block, paragraph, word, and break information.
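For readers curious how the two features are selected, the sketch below builds the JSON body of a Vision API `images:annotate` request, where the feature type is either `TEXT_DETECTION` or `DOCUMENT_TEXT_DETECTION`. It only constructs the request and does not send it; authentication is left out, and the helper name `build_request` is an assumption made for this example. TestComplete users do not need to write this themselves, since TestComplete talks to the service on their behalf.

```python
import base64
import json

# REST endpoint of the Vision API annotate method (shown for reference only;
# this sketch never calls it).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_request(image_bytes: bytes, dense_text: bool = False) -> str:
    """Build the JSON body for a text or document text detection request."""
    feature = "DOCUMENT_TEXT_DETECTION" if dense_text else "TEXT_DETECTION"
    body = {
        "requests": [
            {
                # Image content is sent as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature}],
            }
        ]
    }
    return json.dumps(body)


# Placeholder bytes stand in for a real PNG or JPEG file.
print(build_request(b"abc", dense_text=True))
```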

OCR Process in TestComplete:

TestComplete can recognize the text of UI elements selected on screen, as well as the text in images you capture from the screen or load from files. TestComplete sends the data to be recognized to ocr.api.dev.smartbear.com, a web service run by SmartBear. This web service forwards incoming requests to the Google Vision API and passes the recognition results back to TestComplete.

We can access the entire recognized text or individual text blocks or tabular data. If the recognized text belongs to a UI element, TestComplete locates that element on screen by its text and simulates various actions on it, for instance, clicks or touches.
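The "locate by text, then act on it" idea can be illustrated without TestComplete at all. The sketch below uses a plain Python stand-in, `TextBlock`, for a recognized text block with its bounding box; it is not TestComplete's scripting API, and all names in it are hypothetical. It finds the block whose text matches and computes the point where a click would land.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TextBlock:
    """Stand-in for one recognized text block and its bounding box."""
    text: str
    left: int
    top: int
    width: int
    height: int

    def center(self) -> Tuple[int, int]:
        """Point in the middle of the block, a natural target for a click."""
        return (self.left + self.width // 2, self.top + self.height // 2)


def find_block(blocks: List[TextBlock], wanted: str) -> Optional[TextBlock]:
    """Return the first block whose text equals `wanted`, ignoring case."""
    wanted = wanted.strip().lower()
    for block in blocks:
        if block.text.strip().lower() == wanted:
            return block
    return None


# Pretend these blocks came back from recognizing a menu bar.
blocks = [TextBlock("File", 10, 5, 40, 20), TextBlock("Edit", 60, 5, 40, 20)]
target = find_block(blocks, "edit")
if target is not None:
    print("click at", target.center())
```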

Steps To Extract Text In TestComplete:

1. Drag the OCR Action operation from the Operations tab, and use the "Drag the target to point to the object" or "Point and fix" option to point to the tested application.

2. Specify the target in the tested application so that TestComplete can extract its text.

3. Select the text you need. It appears on the substring to work with tab, where you set the preferences for that text.

4. Select the action to simulate on that text, and click the Finish button.

5. When the test runs, TestComplete performs the selected action in the tested application.

OCR Checkpoint:

TestComplete’s OCR checkpoint verifies the recognized text content against the expected pattern. This is useful for testing applications where TestComplete cannot directly access internal properties, such as graphical objects rendered on the screen.
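Conceptually, such a check boils down to comparing the recognized text with a pattern. The sketch below is a plain Python approximation that assumes shell-style wildcards (`*` and `?`) and collapses whitespace before comparing, because OCR output often contains extra spaces or line breaks. The function name `ocr_checkpoint` is made up, and TestComplete's own checkpoint may differ in its matching rules.

```python
from fnmatch import fnmatchcase


def ocr_checkpoint(recognized: str, expected_pattern: str) -> bool:
    """Return True if the recognized text matches the wildcard pattern."""
    # Collapse runs of spaces and newlines so layout noise does not matter.
    normalized = " ".join(recognized.split())
    return fnmatchcase(normalized, expected_pattern)


# The order number changes between runs, so the pattern uses a wildcard.
print(ocr_checkpoint("Order  #1042\nconfirmed", "Order #* confirmed"))
print(ocr_checkpoint("Order cancelled", "Order #* confirmed"))
```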

Pros and Cons of OCR:

Pros:

1. The OCR Action operation provides flexibility to select the desired text fragment among multiple recognized fragments.

2. Iterating through all recognized text blocks enables detailed analysis.

3. Checking properties of each block allows validation of text content, boundaries, and more.

Cons:

1. OCR is not supported in mobile tests running in remote device clouds (Appium-based tests).

2. Recognition results may sometimes differ or fail, requiring additional validation and error handling.
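One common way to cope with unstable results is to validate the recognized text and retry a few times before failing. The sketch below is a generic Python pattern, not a TestComplete feature; `recognize` stands for any callable that returns recognized text, and all names are hypothetical.

```python
import time
from typing import Callable


def recognize_with_retry(recognize: Callable[[], str],
                         is_valid: Callable[[str], bool],
                         attempts: int = 3,
                         delay_s: float = 0.5) -> str:
    """Call `recognize` until `is_valid` accepts the text or attempts run out."""
    last = ""
    for attempt in range(attempts):
        last = recognize()
        if is_valid(last):
            return last
        if attempt < attempts - 1:
            time.sleep(delay_s)  # give the screen time to settle before retrying
    raise RuntimeError(
        f"OCR result failed validation after {attempts} attempts: {last!r}"
    )


# Simulated recognizer: the first result is misread, the second is correct.
results = iter(["T0tal: 4O", "Total: 40"])
text = recognize_with_retry(lambda: next(results),
                            lambda t: t.startswith("Total:"),
                            attempts=3, delay_s=0)
print(text)
```

Raising an error after the last attempt, instead of returning the bad text, keeps a misread value from silently flowing into later test steps.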

Conclusion:

In conclusion, TestComplete's OCR feature makes text that is rendered only as graphics available to automated tests. With its flexibility, detailed block-level analysis, and validation capabilities, it opens avenues for automated analysis, indexing, and extraction of textual information.
