Archive for the ‘ocrfeeder’ Category

OCRFeeder for GNOME’s Google Summer of Code

Wednesday, April 6th, 2011

I have added OCRFeeder to the list of GNOME ideas for this year’s Google Summer of Code.

I wrote some of the ideas that came to my mind at that moment but suggestions are welcome:
* Support for undo/redo
* Support for setting a document's language, which would also configure the OCR engine in use
* Implement reading order support
* Improve the generation of ODT files (currently the text is put as text boxes in the generated docs, maybe an option to set it as actual paragraphs would be better)
* Create a plugins system to process the results
* Improve the contents detection algorithm
* Create an assistant-like user mode (like 3 steps to scan and generate a document)

I don’t know if the apps for GNOME’s GSoC have to be official modules, which OCRFeeder is not yet, but I think it is an interesting project for GSoC nonetheless.
As for mentoring it, I’m taking some weeks off in June and August, but maybe I could still do it.

Let’s see how it turns out…

OCRFeeder 0.7.4 released

Sunday, March 20th, 2011

More than two months after the last OCRFeeder release (I’ve been busy with other projects in Igalia), I have just released version 0.7.4.

The highlights for this new version are:

Add plain text exportation

Sometimes users just want to get the plain text from a scanned document. This is especially useful for visually impaired users, who will be able to read the text files with Orca afterwards.
This feature was developed by Andrew McGrath, a student from New Hampshire, who is contributing to OCRFeeder as part of his involvement in the Software for Humanity project. It would be great to have more colleges involved in this kind of initiative.

Recognize the current page or the whole document

Now it is possible to automatically recognize the current page (as it did before) or the complete document. I’ve also added a confirmation dialog before the recognition is performed when there are changes in the project.

Thanks also to Joanmarie for all the great suggestions like this one and to Juanje Ojeda for the patches he sent me.

These were just the two main features I’ve picked from the list of changes; to view them all, check the NEWS file.

Stay tuned for more improvements in the future.

Source tarball
Git
Bugzilla

OCRFeeder 0.7.3 released

Monday, January 3rd, 2011

I’ve just released OCRFeeder 0.7.3.

This first version of 2011 doesn’t introduce as many features as the previous ones but fixes a few issues and introduces user documentation (F1 help).

I’ve also made a change that is against my principles: the use of autotools.
With tools like distutils available, which I was using in OCRFeeder, I always preferred to avoid autotools in my Python projects, but that also meant I had to be extra careful about where things got installed, to be consistent with other GNOME apps. This release’s main feature is the introduction of user documentation, so it seemed like the right time to replace distutils with autotools.
But let’s see how long it lasts; things like BuildDj seem very interesting and look like the way to go once they are developed…

Thanks to the people who have filed bugs and sent me comments about OCRFeeder.

Source tarball
Git
Bugzilla

OCRFeeder version 0.7.1a released

Tuesday, November 9th, 2010

The 0.7.1a version of OCRFeeder has been released.

This version introduces some tasks performed by Emergya as part of the GuadaLinfo Accessible project, such as:
* Importation from a scanner device.
* Copying text from the content boxes to the clipboard.
* Users can now use the typical spell-checker dialog to correct mistakes in the text recognized by the OCR engines.

Other highlights include:

* Rewritten ocrfeeder-cli (which also introduces a help method now)
* Added the automatic detection of the Cuneiform OCR engine
* Moved the OCRFeeder modules to their own folder (so they are better organized and don’t conflict with other modules when installed)

And some bug fixing:

* Add the help option to ocrfeeder-cli (gb#630829)
* Fix selecting all areas
* Fix ellipsis and title in the queued events dialog
* Prevent “invisible” boxes creation
* Remove temporary images for the Tesseract OCR engine

A big thanks to the great GNOME translators for keeping OCRFeeder available in a number of languages and to Berto for making it available in Debian (which later got into Ubuntu as well).

Just as I was releasing the 0.7.1a version I realized the spell-checker.ui file was not being installed so I quickly did a tiny release, hence the 0.7.1a and not simply 0.7.1.

Download OCRFeeder 0.7.1a source tarball

One more step in OCR with OCRFeeder 0.7

Friday, July 30th, 2010

I have been hacking on some new and cool features on OCRFeeder for a while and now it is time to show them to the world in a new release.

The features I’m talking about fall mainly into two areas: improving the a11y of the UI and improving the recognition of documents.

A11y Improvement

The a11y improvements include the typical UI changes, such as adding mnemonics and missing labels and relations, but also other changes that have more to do with UX, like using a progress dialog to inform users that time-consuming operations are being carried out. This means that PDF importation and OCR no longer block the UI.
Other changes in this category were the navigation through the content boxes (before, these could only be selected by clicking on them), the selection of all boxes and the deletion of selected boxes.

The following screenshot shows the box editor area of OCRFeeder with its mnemonics highlighted:

Box edition area

Recognition Improvements

Sometimes, text columns are so close to each other that they end up being recognized as a single paragraph, so I added a post-detection method to solve this issue. This feature is optional and can be toggled from the Preferences dialog.

Here’s an example of the difference it makes:

Before columns' detection improvements

After columns' detection improvements
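
To give an idea of how close columns can be told apart after detection, here is a minimal sketch. It is not OCRFeeder's actual code: it assumes the detected area is available as rows of 0/1 pixels (1 meaning ink) and uses a vertical projection profile, which is only one common way to find the blank gutter between columns. The name `split_columns` and the `min_gap` parameter are made up for this illustration.

```python
def split_columns(rows, min_gap=2):
    """Return (start, end) pixel-column ranges, end exclusive.

    Hypothetical helper: `rows` is a list of equally long lists of 0/1
    values, and a run of at least `min_gap` ink-free pixel columns is
    treated as the gutter between two text columns.
    """
    width = len(rows[0])
    # Vertical projection: amount of ink in each pixel column.
    profile = [sum(row[x] for row in rows) for x in range(width)]

    blocks = []
    start = None  # first pixel column of the block being built
    gap = 0       # length of the current run of blank pixel columns
    for x, ink in enumerate(profile):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The blank run is wide enough: close the block before it.
                blocks.append((start, x - gap + 1))
                start, gap = None, 0
    if start is not None:
        # Close the last block, leaving out any trailing blank columns.
        blocks.append((start, width - gap))
    return blocks
```

A narrow `min_gap` splits more aggressively, which is a reason to keep this kind of step optional: in a real page, the space between two words could otherwise be mistaken for a gutter.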

Scanned document images are usually skewed and this makes it more difficult for the contents to be successfully detected and “OCRed”. I decided to implement an algorithm to deskew these images. The algorithm uses the Hough transform to try to find lines in the image and their angles and, while it is a bit slow, it works well:

Skewed image

Deskewed image

This action can be applied to a loaded image but can also be configured to run automatically before images are added. The Unpaper tool can now also be set to clean images before they are added.
This makes it much easier to successfully recognize images obtained from a scanner device.
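
For readers curious about the Hough transform idea behind the deskewing, here is a simplified sketch. It is not the code used in OCRFeeder: it assumes the ink pixels are already available as `(x, y)` points, only tests near-horizontal lines, and the name `estimate_skew` and its parameters are made up for this example.

```python
import math
from collections import Counter


def estimate_skew(points, max_angle=10.0, step=0.5):
    """Estimate the skew angle, in degrees, of near-horizontal text lines.

    Hypothetical helper: `points` is a non-empty list of (x, y) ink pixel
    coordinates. For each candidate angle every point votes for the line
    it would lie on, and the angle whose best line collects the most
    votes wins.
    """
    best_angle, best_votes = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Along a line tilted by theta, y*cos(theta) - x*sin(theta) stays
        # constant, so aligned points fall into the same accumulator bin.
        votes = Counter(round(y * cos_t - x * sin_t) for x, y in points)
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle
```

The image would then be rotated to cancel the estimated angle. Every point votes once per candidate angle, so the cost grows with both the number of ink pixels and the number of angles tested, which gives a feel for why this kind of approach is on the slow side.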

Some fine-tuning of the content boxes’ bounds was done by shortening their margins, that is, reducing the distance between the boxes and their actual contents.
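
As an illustration of what shortening the margins means, the sketch below shrinks a box to the tightest rectangle that still contains all of its ink. It assumes the page is given as rows of 0/1 pixels and is only meant to show the idea, not how OCRFeeder implements it; `shrink_to_content` is a made-up name.

```python
def shrink_to_content(rows, box):
    """Shrink `box` to the tightest rectangle around its ink pixels.

    Hypothetical helper: `rows` is a list of lists of 0/1 values and
    `box` is (left, top, right, bottom) with right and bottom exclusive.
    A box without any ink is returned unchanged.
    """
    left, top, right, bottom = box
    inked = [(x, y)
             for y in range(top, bottom)
             for x in range(left, right)
             if rows[y][x]]
    if not inked:
        return box
    xs = [x for x, _ in inked]
    ys = [y for _, y in inked]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)
```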

The font size recognition was also tweaked to solve the problem of paragraphs with initials (you know, the huge starting characters), which were influencing the whole paragraph’s font size.
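
One simple way to keep a single huge initial from skewing the estimate is to take the median of the text line heights instead of their mean. The sketch below assumes that the line heights have already been measured in pixels and that the scan resolution is known. It illustrates the general idea and is not necessarily the tweak made in OCRFeeder; `estimate_font_size` is a made-up name.

```python
from statistics import median


def estimate_font_size(line_heights_px, dpi=300):
    """Estimate a paragraph's font size in points.

    Hypothetical helper: `line_heights_px` holds the height in pixels of
    each text line. The median ignores an outlier such as a drop cap.
    """
    height_px = median(line_heights_px)
    # There are 72 points in an inch and `dpi` pixels in an inch.
    return round(height_px * 72 / dpi)
```

For line heights of 120, 40, 42, 41 and 40 pixels at 300 DPI, the median gives 10 points, whereas the mean would give 14 because of the 120-pixel initial.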

To finish with the recognition improvements, I have added an optional action to find and fix the text’s line breaks. Usually, OCR engines don’t consider “semantic line-breaks”, that is, they always insert a newline at the end of each line.
Using some regular expressions, I try to find these “fake” line-breaks and recover the original flow of the text. Like some of the features mentioned above, this one can also be turned on/off from the Preferences dialog.
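
As a rough illustration of this kind of clean-up, here is a small sketch. The two regular expressions are examples chosen for this post's explanation, not the ones OCRFeeder ships, and `fix_line_breaks` is a made-up name.

```python
import re


def fix_line_breaks(text):
    """Join lines that were only broken because the scanned line ended.

    Illustrative heuristics only: a newline is kept when it follows
    sentence-ending punctuation or is part of a blank line.
    """
    # Re-join words hyphenated across a line break ("docu-\nments").
    # Caveat: this also merges real compounds that happened to be split
    # at their hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn the remaining "fake" line breaks into spaces.
    text = re.sub(r"(?<![.!?:\n])\n(?!\n)", " ", text)
    return text
```

For instance, `fix_line_breaks("Scanned docu-\nments are usually\nskewed.\nNext line.")` returns `"Scanned documents are usually skewed.\nNext line."`. Heuristics like these can guess wrong, for example on headings without final punctuation, which is a good reason for such a feature to be easy to turn off.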

Here’s what the Preferences dialog looks like now:

Preferences_dialog

Preferences_dialog_recognition

To finish, images can now be dragged and dropped onto the pages’ area and the mouse wheel can be used to scroll horizontally combining it with the Shift key, thanks to Stefan Löffler, and of course, several bugs were corrected and code was improved.

As you can see, this is a “rich” new version of OCRFeeder, which remains the easiest way to use OCR on the desktop. You are welcome to file bugs in Bugzilla, to send patches and feature requests to the mailing list, or to approach me if you’re at GUADEC.

Download: OCRFeeder 0.7 tarball on GNOME FTP