OCRFeeder 0.7.8

That’s right, one more release of OCRFeeder. If you’re wondering why it took so long for apparently so few changes, it has to do with some super cool things I’ve been working on at Igalia, but you’ll hear about that really soon.

This new release brings a few bug fixes, such as:
* Fix recognition after using the Unpaper tool;
* Fix an Unpaper issue due to a nonexistent variable;
* Prevent the version of the Tesseract OCR engine from appearing in the recognized text.

This last issue appeared after an update to Tesseract that made it print “Tesseract Open Source OCR Engine v3.02 with Leptonica” to the standard output. Since the default configuration of the Tesseract engine wasn’t discarding the text printed to the standard output, that line would show up as part of the recognized text.
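To illustrate the idea behind the fix, here is a minimal Python sketch (not OCRFeeder’s actual code) that runs an engine command while discarding whatever it prints, and takes the recognized text only from the output file. The names `run_engine_quietly` and `demo`, as well as the stand-in “engine” script, are assumptions made up for this example; a real call would pass the Tesseract command line instead.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_engine_quietly(command, output_file):
    """Run an OCR engine command, ignoring anything it prints.

    Hypothetical helper: the recognized text is read from output_file,
    so banners written to stdout/stderr never reach the result.
    """
    subprocess.run(
        command,
        stdout=subprocess.DEVNULL,  # drop e.g. the version banner
        stderr=subprocess.DEVNULL,
        check=True,  # raise CalledProcessError if the engine fails
    )
    return Path(output_file).read_text(encoding="utf-8")


def demo():
    """Simulate an engine that prints a banner and writes its result to a file."""
    fake_engine = (
        "import sys\n"
        "from pathlib import Path\n"
        "print('Tesseract Open Source OCR Engine v3.02 with Leptonica')\n"
        "Path(sys.argv[1]).write_text('recognized text', encoding='utf-8')\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "page.txt"
        command = [sys.executable, "-c", fake_engine, str(out)]
        return run_engine_quietly(command, out)
```

Calling `demo()` returns only what the fake engine wrote to its output file; the banner it printed is thrown away.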

After a bit of discussion in the bug report, the conclusion was that OCRFeeder needed a way to detect changes in the OCR engines’ configuration. This means the new version includes a way to check whether the configuration needs updating and will warn the user about it once (on start-up). If it can update the engines’ configuration automatically, it will say so and ask for confirmation; otherwise it will ask the user to change it manually and offer a way to open the OCR Engines Manager dialog.
The pictures below show what I just described:

OCRFeeder warnings


(Note that since this feature wasn’t extensively tested, the first time you use this new version it might warn you even about engines that do not need a change; still, if that happens, it’ll be only once.)
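For the curious, the decision described above can be sketched roughly as follows. This is a simplified illustration and not the code shipped in OCRFeeder: the `KNOWN_UPDATES` table, the argument strings in it and the `check_engine` function are all hypothetical names and values chosen for the example.

```python
# Hypothetical table: engine name -> outdated default arguments and their replacement.
# The argument strings are illustrative, not OCRFeeder's real defaults.
KNOWN_UPDATES = {
    "Tesseract": {
        "old": "$IMAGE $FILE; cat $FILE.txt",
        "new": "$IMAGE $FILE > /dev/null 2> /dev/null; cat $FILE.txt",
    },
}


def check_engine(name, arguments):
    """Classify an engine configuration.

    Returns a (status, arguments) pair where status is:
      "ok"     - nothing to do;
      "auto"   - the old default is still in use, so it can be replaced
                 automatically (after asking the user for confirmation);
      "manual" - the arguments were customized, so the user has to review
                 them by hand (this may be a false alarm).
    """
    update = KNOWN_UPDATES.get(name)
    if update is None or arguments == update["new"]:
        return ("ok", arguments)
    if arguments == update["old"]:
        return ("auto", update["new"])
    return ("manual", arguments)
```

With this scheme a customized configuration always lands in the “manual” case, which is one way a warning can show up for an engine that does not really need a change.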

To see the entire list of changes and the amazing work of the GNOME i18n team, check out the NEWS file.

Source Tarball
Git
Bugzilla

2 thoughts on “OCRFeeder 0.7.8”

  1. Joaquim

    Regarding this correction:
    Fix Unpaper issue due to an unexisting variable gb#668027 There was a temporary_dir variable that was left unchanged when the configuration started using TEMPORARY_FOLDER. Thanks to Buganini Q for reporting and posting the fix.

    Wouldn’t it be good to test the error return in the callback routine, as is done for deskew, and return if it is set? While debugging to rediscover this bug, I did not check whether the callers process the returned error. If they do not, printing a message would make it faster to pinpoint errors in unpaper, such as conversion errors.

    For instance, upon instrumenting the code, I found a situation in which the conversion of an 8-bit PNG to PBM fails in unpaperImage:

    imagePreprocessingFinishdedCb
    cannot write mode LA as PPM

    Without the error processing, it will trigger the same error message.

    Anyway, congratulations on the great work.
