Archive for the ‘ocr’ Category

One more step in OCR with OCRFeeder 0.7

Friday, July 30th, 2010

I have been hacking on some cool new features for OCRFeeder for a while, and now it is time to show them to the world in a new release.

These features fall mainly into two areas: improving the accessibility (a11y) of the UI and improving the recognition of documents.

A11y Improvement

The a11y work includes the typical UI changes, such as adding mnemonics and missing labels and relations, but also changes that have more to do with UX, like using a progress dialog to inform users that time-consuming operations are being carried out. This means that PDF import and OCR no longer block the UI.
Other changes in this category are keyboard navigation through the content boxes (before, these could only be selected by clicking on them), the selection of all boxes and the deletion of selected boxes.

The following screenshot shows the box editor area of OCRFeeder with its mnemonics highlighted:

Box edition area

Recognition Improvements

Sometimes, text columns are so close to each other that they end up being recognized as a single paragraph, so I added a post-detection method to solve this issue. This feature is optional and can be toggled from the Preferences dialog.

Here’s an example of the difference it makes:

Before columns' detection improvements

After columns' detection improvements
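
For the curious, here is a minimal sketch of how this kind of column splitting can work. It is not the actual OCRFeeder code: the function name `split_columns`, the `min_gap` parameter and the representation of the region as rows of 0/1 pixels are all assumptions made for illustration. The idea is to look for runs of empty pixel columns that are wide enough to be a gutter between two text columns.

```python
def split_columns(rows, min_gap=3):
    """Split a binarized region into column ranges.

    rows is a list of equal-length lists where 1 means ink and 0 means
    background (an assumed representation). Returns (start, end) pairs
    of x coordinates, with end exclusive.
    """
    if not rows:
        return []
    width = len(rows[0])
    # Vertical projection: True where a pixel column holds any ink.
    has_ink = [any(row[x] for row in rows) for x in range(width)]

    columns = []
    start = None     # first x of the column being built
    last_ink = None  # last x where ink was seen
    for x, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The empty run is wide enough: close this column, open a new one.
            columns.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        columns.append((start, last_ink + 1))
    return columns


region = [
    [1, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
]
print(split_columns(region))             # [(0, 2), (6, 8)]
print(split_columns(region, min_gap=5))  # [(0, 8)]
```

A larger `min_gap` makes the split more conservative, which is one reason a feature like this is better left optional.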

Scanned document images are usually skewed and this makes it more difficult for the contents to be successfully detected and “OCRed”. I decided to implement an algorithm to deskew these images. The algorithm uses the Hough transform to try to find lines in the image and their angles and, while it is a bit slow, it works well:

Skewed image

Deskewed image

This action can be used on a loaded image but can also be configured to run automatically before images are added. The Unpaper tool can now also be set to clean images before adding them.
This makes it much easier to successfully recognize images obtained from a scanner device.
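
To give an idea of how a Hough-style skew estimate works, here is a small pure-Python sketch. It is not the code shipped in OCRFeeder: the function name `estimate_skew`, its parameters and the choice of scoring each angle by the sum of squared bin counts are assumptions made for illustration. For every candidate angle, each ink pixel votes for the line at that angle passing through it; the angle whose votes pile up in the fewest bins is taken as the skew.

```python
import math
from collections import Counter


def estimate_skew(points, max_angle=15.0, step=0.5):
    """Estimate the skew, in degrees, of near-horizontal lines.

    points is an iterable of (x, y) ink-pixel coordinates. The angle is
    measured in the coordinate system of the points, so with image
    coordinates (y growing downwards) the sign is flipped visually.
    """
    points = list(points)
    if not points:
        return 0.0
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        # Hough accumulator for this angle: all points lying on one line
        # of slope `angle` share the same signed distance rho.
        bins = Counter(round(y * cos_t - x * sin_t) for x, y in points)
        # Few, heavily filled bins mean the angle matches the text lines.
        score = sum(count * count for count in bins.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


# Three synthetic "text lines" tilted by 5 degrees.
slope = math.tan(math.radians(5.0))
tilted = [(x, x * slope + c) for c in (20, 60, 100) for x in range(200)]
print(estimate_skew(tilted))
```

Rotating the image by the opposite of the estimated angle then straightens it. Trying every angle against every pixel is also why this kind of approach tends to be slow on large scans.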

Some fine-tuning of the content boxes’ bounds was done by shortening their margins, that is, reducing the distance between the boxes and their actual contents.
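
Shortening the margins boils down to finding the tightest rectangle around the ink inside a box. Here is a minimal sketch of that idea, assuming a binarized image stored as rows of 0/1 values and boxes given as (left, top, right, bottom) with exclusive right and bottom edges; the function name `shrink_to_content` is made up for this example and is not OCRFeeder's API.

```python
def shrink_to_content(rows, box):
    """Return the tightest (left, top, right, bottom) box around the ink.

    rows is a list of lists where 1 means ink; right and bottom are
    exclusive. A box with no ink inside is returned unchanged.
    """
    left, top, right, bottom = box
    xs, ys = [], []
    for y in range(top, bottom):
        for x in range(left, right):
            if rows[y][x]:
                xs.append(x)
                ys.append(y)
    if not xs:
        return box
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


page = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(shrink_to_content(page, (0, 0, 6, 5)))  # (2, 1, 4, 3)
```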

The font size recognition was also tweaked to solve the problem of paragraphs with initials (you know, the huge starting characters), which were influencing the whole paragraph’s font size.
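
One simple way to keep a single huge initial from skewing the result is to use a robust statistic, such as the median, over the heights measured in a paragraph. This is only an illustration of the idea and not necessarily how OCRFeeder does it; the function name `paragraph_font_size` and the list of measured heights are assumptions.

```python
from statistics import median


def paragraph_font_size(line_heights):
    """Pick a font size that ignores outliers such as a large initial.

    line_heights holds one measured height per text line (assumed input).
    Returns None when there is nothing to measure.
    """
    if not line_heights:
        return None
    return median(line_heights)


# The first "line" is dominated by a 48-pixel initial.
print(paragraph_font_size([48, 12, 12, 11, 12]))  # 12
```

An average over the same heights would give 19, noticeably larger than the body text.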

To finish the recognition improvements, I have added an optional action to find and fix the text’s line breaks. Usually, OCR engines don’t consider “semantic line-breaks”, that is, they always insert a newline at the end of each line.
Using some regular expressions, I try to find these “fake” line-breaks and recover the original flow of the text. Like some of the features mentioned above, this one can also be turned on/off from the Preferences dialog.
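
As an illustration of the regular expression approach, here is a minimal sketch. The pattern and the function name `join_fake_line_breaks` are assumptions for this example rather than the exact expressions used in OCRFeeder. A newline is treated as “fake” when the previous line does not end a sentence and the next line is not blank.

```python
import re

# A single newline that is not preceded by sentence-ending punctuation
# (or another newline) and is followed by more text on the next line.
_FAKE_BREAK = re.compile(r"(?<![.!?:\n])\n(?=[^\n])")


def join_fake_line_breaks(text):
    """Replace line breaks that fall mid-sentence with a space."""
    return _FAKE_BREAK.sub(" ", text)


ocr_text = (
    "This is a line\n"
    "that continues here.\n"
    "New sentence starts\n"
    "and ends.\n"
    "\n"
    "Next paragraph."
)
print(join_fake_line_breaks(ocr_text))
```

Breaks after a full stop and blank lines between paragraphs are left alone, so the paragraph structure survives. A heuristic like this will sometimes guess wrong, for example on lists or headings without punctuation.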

Here’s what the Preferences dialog looks like now:

Preferences dialog

Preferences dialog (recognition)

To finish, images can now be dragged and dropped onto the pages’ area, and the mouse wheel can be used to scroll horizontally when combined with the Shift key, thanks to Stefan Löffler. And of course, several bugs were fixed and the code was improved.

As you can see, this is a “rich” new version of OCRFeeder, which remains the easiest way to use OCR on the desktop. You are welcome to file bugs in Bugzilla, to send patches and feature requests to the mailing list, or to approach me if you’re at GUADEC.

Download: OCRFeeder 0.7 tarball on GNOME FTP

OCRFeeder 0.6.6

Monday, April 5th, 2010

OCRFeeder version 0.6.6 has been released.

This version has no big improvements and exists mainly to fix a bug that prevented the automatic document recognition algorithm from being used.

The copyright was updated to include the proper copyright and license notices of ODFPy, which ships with OCRFeeder.
It also features some improvements to the Debian-related files (thanks to Alberto Garcia, who is creating the official deb package for Debian) and a few translation updates.

See the whole list of changes here.

Your usual links:
OCRFeeder’s git
OCRFeeder’s bugzilla
OCRFeeder’s Tarball from GNOME’s FTP
OCRFeeder 0.6.6 Debian package

OCRFeeder version 0.6.5

Wednesday, March 24th, 2010

I have just released OCRFeeder version 0.6.5!

Here are the main changes in this version:

* Importing PDF files is now faster
* The OCR engines manager dialog now lets you detect system-wide OCR engines and choose which ones to use (this detection also runs when the application is started with no engines configured)
* Multiple content areas in OCRFeeder’s canvas can now be selected using Shift+Click
* Introduces Ctrl+a shortcut to select all content areas in OCRFeeder’s canvas
* The Tools menu now has the new action “Recognize Selected Areas” which will perform the automatic recognition on selected content areas of OCRFeeder’s canvas

Also, a few bugs were fixed:

* Removed the PDF file extension from the images generated from PDF files
* Images are now sorted when they are added from a folder
* Selection areas are now selected after they are created
* Fixed a problem when quitting the application

(You can also read the full list of changes)

Recognize All Areas action

You can download the new tarball from GNOME’s FTP or a Debian package from here.

I would also like to thank the GNOME i18n Team for their work translating OCRFeeder.

OCRFeeder is now on GNOME Bugzilla

Monday, March 8th, 2010

OCRFeeder is now a product on GNOME Bugzilla, which should be used for filing new issues from now on. The bug tracker on OCRFeeder’s Google Project should no longer be used.

So if you have been using OCRFeeder and found some issues or think it’s missing a great feature, go to the following URL and file a new bug:
https://bugzilla.gnome.org/enter_bug.cgi?product=ocrfeeder

Thank you for helping GNOME’s OCR application.

OCRFeeder version 0.6.1 released

Sunday, March 7th, 2010

As has become usual every couple of weeks or so, I released a new version of OCRFeeder!

This is version 0.6.1 and the main changes this time are:

* Now you can increase or decrease the zoom using Ctrl+Mouse wheel. This kind of shortcut is well known in many GNOME applications and even I was missing it;
* Warning dialogs are now shown when something goes wrong while opening an image;
* Fixed an encoding problem when reading non-ASCII characters;
* Fixed an error when configuring a new engine;
* Improved the Debian package’s files (thanks to Alberto Garcia);
* Fixed zoom issues (sometimes the allowed zoom would not be consistent among tries).

It was a good week on OCRFeeder’s bug tracker, especially thanks to user Hank, who reported important problems.

I am really glad about how OCRFeeder is turning out and I expect to make it even better with the help of its users. By sending suggestions, reporting bugs or simply using it, you will be helping the project.

You can download the OCRFeeder 0.6.1 tarball from GNOME FTP or, if you prefer, download a Debian package directly.