OCRFeeder 0.8.1

Taking advantage of the holidays, I have been dedicating some time to my side projects, so today I am giving you OCRFeeder version 0.8.1!

The previous OCRFeeder version included a very important change, the port to GObject introspection, and I was already expecting a few bugs to pop up here and there. That proved to be true, so this version is mainly about bug fixing.
Specifically, there was an issue related to GDK’s threads which caused the application to abort. Besides that, exporting a document or saving/loading a project was not working correctly due to Unicode issues (Python is very nice, but working with Unicode is sometimes more annoying than it should be, at least in versions prior to Python 3).
Anyway, all that should be working correctly now!
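
To illustrate the kind of Unicode pitfall mentioned above: in Python 2, calling `.decode('utf-8')` on a value that is already a `unicode` object first encodes it implicitly with the ASCII codec, which raises `UnicodeEncodeError` as soon as the text contains a non-ASCII character. A common way to avoid this is to convert only at the boundaries and only when the type requires it. The sketch below is written for Python 3 so that it can be run today; `to_text` and `to_bytes` are hypothetical helper names, not functions from OCRFeeder.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding only when it is actually bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    # Already text: decoding again is what triggers the error in Python 2.
    return value


def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding only when it is actually text."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


if __name__ == "__main__":
    # Bytes read from a file or from an OCR engine's output.
    print(to_text(b"caf\xc3\xa9"))
    # Text that is already decoded passes through untouched.
    print(to_text("\u2018quoted\u2019"))
```

The idea is to keep text as text inside the application and to encode or decode only when reading from or writing to files.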

Besides squashing bugs, I also made some long overdue changes: I made the Preferences dialog smaller (by adding its contents to a scrolled window) and migrated the application’s and engines’ settings to the XDG user configuration folder instead of .ocrfeeder.
Yes, I know that I should be using GSettings for the application’s settings by now, but there were more critical changes to be made.
Besides a small change in the widgets that set a box’s type (from radio buttons to a grouped pair of buttons without indicators), there are no other UI changes, but I really like how much more polished OCRFeeder looks with the nice recent GTK+ styles.
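
As for the settings migration, here is a minimal sketch of how a per-user configuration folder can be resolved following the XDG Base Directory convention: use `$XDG_CONFIG_HOME` when it is set to an absolute path, otherwise fall back to `~/.config`. The function names, the `ocrfeeder` folder name and the copy-based migration step are assumptions for illustration only; this is not necessarily how OCRFeeder does it.

```python
import os
import shutil


def user_config_dir(app_name="ocrfeeder", environ=None):
    """Return the XDG-style configuration folder for app_name (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    base = environ.get("XDG_CONFIG_HOME")
    # The XDG convention treats unset, empty or relative values as invalid.
    if not base or not os.path.isabs(base):
        base = os.path.join(os.path.expanduser("~"), ".config")
    return os.path.join(base, app_name)


def migrate_legacy_settings(legacy_dir, new_dir):
    """Copy legacy_dir to new_dir once; return True if a copy was made."""
    if os.path.isdir(legacy_dir) and not os.path.exists(new_dir):
        shutil.copytree(legacy_dir, new_dir)
        return True
    return False


if __name__ == "__main__":
    legacy = os.path.join(os.path.expanduser("~"), ".ocrfeeder")
    print("legacy folder:", legacy)
    print("XDG folder:", user_config_dir())
```

Copying rather than moving the old folder keeps the previous settings around in case something goes wrong during the first run.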

[Screenshot: OCRFeeder 0.8.1]

Future

I have a number of ideas to make the application better not only in terms of UI/UX but also in terms of features. The detection algorithm hasn’t been touched for years and I am sure it can be improved not only in terms of performance but also in terms of accuracy.
One cool feature I’d love to see implemented is a quick way of translating a document’s contents. This would be helpful, e.g., to users living abroad who might need to translate letters into a language they speak.
Nonetheless, as mentioned in my previous post about OCRFeeder, it is not easy to find the time and motivation to dedicate to the project these days, with all the work, life and other side projects, so I don’t know when I will have time for it again. So if you want to give me a hand, you’d make me very happy, as there is a lot of work to be done.

Happy holidays everyone!

Source tarball
Git
Bugzilla

14 thoughts on “OCRFeeder 0.8.1”

  1. Thank you for all your work.
    Just one problem: OCRFeeder cannot find my scanner, even when it is connected to a USB port.
    Is there a configuration file I need to fill in?

    Merry Christmas and Happy New Year celebrations

  2. Thank you for this update, I really was looking forward to it 🙂
    I recently skimmed through the great Python and GTK tutorial and am getting more familiar with this, so maybe I can contribute to this project at some point.
    But I’ll give the new version a try very soon!

  3. Hi Jychanel,

    About the scanner, it should work if the device is detected by libsane. Can the Simple Scan application detect your scanner, by the way?

    Cheers,

  4. Thank you for your reply.
    Simple Scan detects both of my scanners, an old Agfa SnapScan 1212 USB and an HP Photosmart 5520 (Wi-Fi). I have libsane 1.0.23-3ubuntu3.1.
    Any idea?

    Merry Christmas to you and your family.

  5. Thanks for this wonderful software. I’m a heavy ABBYY FineReader user, but I want to switch completely to Linux. The only problem is that I can’t find any decent GUI OCR software for Linux except this one, which is developed... from time to time...

    Please, for non-programmers like me, could you provide a PPA or DEB file to test this version of OCRFeeder?

    Thank you, and I wish you a Happy Christmas.

  6. Hi Zohozer,

    Unfortunately, I don’t have any PPA for you. Debian has up-to-date packages of OCRFeeder, and Ubuntu usually does too, although with a bit of a delay. For Ubuntu, I see that they don’t have the 0.8 version, but it had some annoying bugs which 0.8.1 fixes, so hopefully they will include packages for the latter.

  7. I use OCRFeeder to scan typewritten texts that span several pages.
    I would find it convenient if:
    – the scanner search were done only once, when the program starts, rather than being repeated for every page.
    – there were a “Scan” command at the top of the page.
    – the text were rotated when it has been placed upside down on the scanner, which is unavoidable if the scanned text is a notebook page.
    Thank you for your attention

  8. Hello, thanks for the good work!

    This program clearly has lots of parts: the unpaper treatment, text identification, OCR, etc. But it is hard to know whether we can contribute if those parts are not fully documented. A short write-up about them would be nice, or at least exposing these intermediate steps through ocrfeeder-cli.

    And I won’t say that the GNOME project page is... hard to get into and not very inviting, but consider putting the project somewhere like GitHub or GitLab. The bug tracker there would probably make it easier for the community to get involved and even help you with basic problems.

    Thank you!

  9. Hello, I have OCRFeeder 0.8.1 and it cannot find my old USB scanner (Agfa SnapScan 1212 USB); see my previous message from December 22.

    When I start OCRFeeder in a terminal and try to scan, I get the following error:
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 199, in __obtainScannersFinishedCb
        devices)
      File "/usr/local/lib/python2.7/dist-packages/ocrfeeder/studio/widgetPresenter.py", line 2033, in __init__
        self.vbox.pack_start(self.label, padding = 5)
    TypeError: pack_start() takes exactly 5 non-keyword arguments (2 given).

    Any idea?
    Thanks for your help

  10. Mmm, I’m really interested in this program but I can’t install it! I don’t understand the instructions. Can someone tell me the commands I have to type in the terminal? I’m a complete newbie at this.

  11. Hi Luis,

    If you are on a popular distro, like Ubuntu, you should be able to install OCRFeeder from the software center application without needing the command line. If you really need to do it from the command line, again on Ubuntu, you can run: sudo apt-get install ocrfeeder

  12. It doesn’t handle Unicode properly:

      File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 284, in exportToOdt
        self.exportToFormat('ODT', 'ODT')
      File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 281, in exportToFormat
        name)
      File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/widgetModeler.py", line 605, in exportPagesWithGenerator
        document_generator.addPage(page)
      File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 293, in addPage
        self.addBoxes(page_data.data_boxes)
      File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 78, in addBoxes
        self.addBox(data_box)
      File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 66, in addBox
        self.addText(data_box)
      File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 251, in addText
        text = data_box.getText().decode('utf-8')
      File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
        return codecs.utf_8_decode(input, errors, True)
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 36: ordinal not in range(128)