Sunday, March 8, 2015

Recognize Only White Listed Characters & New API for Viewing Filtered Images

What’s new in this release?

We are pleased to announce the release of Aspose.OCR for Java 2.2.0. The API now exposes the Whitelist property on the OcrConfig class to support character whitelisting, so developers can supply a list of characters to recognize. When the Whitelist property is not null or empty, the OcrEngine recognizes only the specified characters and skips everything else in the input image.

The API also exposes the PreprocessedImages class, which lets users see how the original input image is transformed during the OCR pre-processing steps. Its properties retrieve the image at any particular stage of pre-processing.

Aspose.OCR for Java can extract text either as a whole or in parts, where each part has its own style, font, text size and location in the image. All of this information can be extracted using the IRecognizedPartInfo and IRecognizedTextPartInfo classes. The API also allows retrieving the hierarchical level of each recognized part in the image: TextBlock, Line, Word or Character. The level is represented by the TextPartLevel class, which has been exposed in the public API with this release. The IRecognizedTextPartInfo class has two useful properties, Level and Children, which provide access to the part's hierarchical level and to its lower-level text parts respectively.

This release includes plenty of new and enhanced features and fixes, as listed below:
  • Support text parts hierarchy
  • Character Whitelisting
  • Create API for viewing filtered images
  • Embed resources file into OCR jar
  • Support for French and Spanish languages
  • Text blocks detection
  • Improve working with languages through public API
  • Incorrect number recognized
  • Improve transparent images processing
  • Support of Spanish language
  • Hangs when performing OCR on an animated GIF
  • ClassNotFoundException: com.aspose.omr.ChoiceBoxElement when Aspose.OMR used in Applet and rendered via AppletViewer
  • Aspose.OCR license does not work with Aspose.OMR
  • Latest version cannot correctly perform OCR on the sample provided with Aspose Examples Dashboard
  • Incorrect recognition of numbers
  • IRecognizedTextPartInfo does not return the found part type
  • Improve text and picture blocks processing algorithm
  • Incorrect results returned by OCR
  • Improve time taken to extract the text from an image
  • Unable to perform OCR on Arial 32pt Text
  • OCR-33820 Engine.Text.PartsInfo[].CharactersBox isn't set
  • OcrException: "Error occurred during recognition." "Index was out of range. Must be non-negative and less than the size of the collection."
  • OutOfMemoryException
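
The whitelist rule and the text parts hierarchy described in the announcement can be illustrated with a small, self-contained sketch. It does not call the Aspose.OCR API: the Part and Level types, and the fromText, collect and keepWhitelisted helpers, are hypothetical stand-ins invented for this example. The whitelist is modelled here as a filter over already recognized text, which is only an approximation, since the real engine applies the whitelist during recognition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the concepts in the release notes; not the Aspose.OCR API.
final class TextPartHierarchySketch {

    // Stand-in for the four hierarchy levels named in the announcement.
    enum Level { TEXT_BLOCK, LINE, WORD, CHARACTER }

    // Stand-in for a recognized text part that exposes a level and its children.
    static final class Part {
        final Level level;
        final String text;
        final List<Part> children = new ArrayList<>();

        Part(Level level, String text) {
            this.level = level;
            this.text = text;
        }
    }

    // Builds a block -> line -> word -> character tree from plain text.
    static Part fromText(String text) {
        Part block = new Part(Level.TEXT_BLOCK, text);
        for (String lineText : text.split("\n")) {
            Part line = new Part(Level.LINE, lineText);
            block.children.add(line);
            for (String wordText : lineText.trim().split("\\s+")) {
                if (wordText.isEmpty()) {
                    continue; // skip the empty token produced by a blank line
                }
                Part word = new Part(Level.WORD, wordText);
                line.children.add(word);
                for (char c : wordText.toCharArray()) {
                    word.children.add(new Part(Level.CHARACTER, String.valueOf(c)));
                }
            }
        }
        return block;
    }

    // Depth-first walk that collects the text of every part at the wanted level.
    static List<String> collect(Part root, Level wanted) {
        List<String> out = new ArrayList<>();
        walk(root, wanted, out);
        return out;
    }

    private static void walk(Part part, Level wanted, List<String> out) {
        if (part.level == wanted) {
            out.add(part.text);
        }
        for (Part child : part.children) {
            walk(child, wanted, out);
        }
    }

    // Models the stated rule: a null or empty whitelist means no restriction;
    // otherwise only the listed characters are kept.
    static String keepWhitelisted(String text, char[] whitelist) {
        if (whitelist == null || whitelist.length == 0) {
            return text;
        }
        String allowed = new String(whitelist);
        StringBuilder kept = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (allowed.indexOf(c) >= 0) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        Part block = fromText("Total 42\nPaid 17");
        System.out.println(collect(block, Level.LINE));
        System.out.println(collect(block, Level.WORD));
        System.out.println(keepWhitelisted("Total 42", "0123456789".toCharArray()));
    }
}
```

Walking the tree by level mirrors how Level and Children are described above: each part reports where it sits in the hierarchy, and its children are the parts one level down.
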
Overview: Aspose.OCR for Java

Aspose.OCR for Java is a character recognition component that allows developers to add OCR functionality to their Java web applications, web services and Windows applications. It provides a simple set of classes for controlling character recognition tasks and for working with image files from within Java applications. Developers can extract text from images and read font and style information quickly, saving the time and effort involved in developing an OCR solution from scratch.
