Here is 2013’s first version of OCRFeeder, version 0.7.11.
For this version, a number of bugs were fixed, especially some that were affecting saving and loading projects.
Some small improvements were also made, such as being able to load multiple images at once and being able to choose the OCR engine from the command-line version of OCRFeeder (using the -e option).
Now for the main feature, I developed something that had been requested by a good number of users: being able to easily choose the language for the OCR engine.
When I developed OCRFeeder, I wanted to make it easy for users to combine system-wide OCR engines with the layout analysis that OCRFeeder performs, but I also wanted it to remain powerful. That’s why the engines are configured in a general, abstract way, as if from the command line.
Some OCR engines support setting the language in order to get better recognition. While users could already set the language of an engine manually using the OCR editor dialog, they wanted a nice drop-down list with the languages instead.
This represented a real challenge: to keep the old and flexible configuration and, at the same time, offer a high-level way of choosing the language.
So here is how it works. There is a new special argument keyword, $LANG, which is replaced by the contents of the new “language argument” field and the currently set language. Since engines support different languages (or none) and call them by different names (e.g. Tesseract expects “por” for Portuguese, while others may expect “pt”), there is another new field called “languages”, which should be a map between the ISO 639-1 language code and the name the engine expects for that language, as shown in the screenshot.
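The substitution can be pictured with a minimal Python sketch. This is not OCRFeeder’s actual code: the function name expand_lang, the argument template and the choice to drop the keyword when the engine does not list the language are assumptions made for illustration. Only the $LANG keyword, the “language argument” field and the “languages” map come from the description above (-l is the flag Tesseract uses to select a language).

```python
def expand_lang(arguments, language_argument, languages, current_language):
    """Replace the $LANG keyword in an engine's argument template.

    arguments         -- engine argument template (hypothetical example below)
    language_argument -- the engine's flag for choosing a language, e.g. "-l"
    languages         -- map from ISO 639-1 code to the engine's own name
    current_language  -- ISO 639-1 code currently selected
    """
    engine_language = languages.get(current_language)
    if engine_language is None or not language_argument:
        # Assumption: if the engine does not know the language,
        # the keyword is simply removed from the arguments.
        replacement = ""
    else:
        replacement = "%s %s" % (language_argument, engine_language)
    expanded = arguments.replace("$LANG", replacement)
    # Normalize whitespace left behind by an empty replacement.
    return " ".join(expanded.split())


# Example map for a Tesseract-like engine (only two entries shown).
tesseract_languages = {"pt": "por", "en": "eng"}
print(expand_lang("page.png page $LANG", "-l", tesseract_languages, "pt"))
```

Keeping the map per engine is what lets the same ISO 639-1 code selected in the interface turn into “por” for one engine and “pt” for another.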
To show the languages, there is a new tab in the areas’ editor called Misc (for lack of a better name for a tab that will hold more things in the future) with the languages combo. This combo shows a check mark next to the languages that the currently selected engine recognizes, as seen in the screenshot.
There is also a new setting in the preferences dialog for the default language; the first time the application runs, it is set from the user’s locale.
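As a rough illustration of deriving a default language from the locale, here is a small Python sketch. The function names and the fallback to “en” are assumptions for this example, not what OCRFeeder necessarily does.

```python
import locale


def language_from_locale(locale_name, fallback="en"):
    """Return the two-letter language code of a locale name like 'pt_PT.UTF-8'."""
    if not locale_name:
        return fallback
    # Drop the encoding ('.UTF-8'), the modifier ('@euro') and the territory ('_PT').
    code = locale_name.split(".")[0].split("@")[0].split("_")[0].lower()
    # 'C' and 'POSIX' are not languages; ISO 639-1 codes have exactly two letters.
    if len(code) != 2 or not code.isalpha():
        return fallback
    return code


def default_language():
    # locale.getlocale() returns (language code, encoding); either may be None.
    return language_from_locale(locale.getlocale()[0])


print(default_language())
```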
One thing must be taken into account: even though Tesseract supports an extensive list of languages, users must have the corresponding language packages installed in their distros; otherwise, recognition will of course fail.
To finish, related to my recent job search, I have spent this week in San Francisco getting to know some people from an exciting start-up and, despite the jet lag, I managed to finish this release, so I can now say that at least part of OCRFeeder was designed and developed in California 😛
23 thoughts on “OCRFeeder 0.7.11 released”
Cheers Joachim, I always wanted to have a look at ocrfeeder but my needs for OCR have gone down in recent years and I never got the courage to try it out.
Anyhow, my 2¢ of UI/usability advice for your new UI: instead of checkboxes inside comboboxes (o_o), have you considered simply using a listview widget with two columns (one for the checkbox, one for the language name) and, if you need some sort of priority system, let that listview’s rows be reorderable with drag and drop?
(Oh, and I meant Joaquim. If you want to edit my previous comment, sorry about that 🙂)
The checkboxes in the combobox cannot really be checked (they are disabled). They are only used to indicate that the currently selected engine supports that language.
It is probably not the easiest way to do it, but I couldn’t figure out a much better solution since I want to show all languages (because I intend to use them to set metadata for the document’s area).
Well, if it’s not meant to be interacted with, you could simply use unicode characters in front of the language strings: ✓ vs ✗.
Right, I could, but programmatically it would have been much uglier.
I know the current solution is not a good one but I couldn’t come up with a better one.
I am very much looking forward to trying your new version 0.7.11. I have version 0.7.9 on Ubuntu 12.04 LTS, but it does not work well for my needs. I have my grandfather’s memoirs (scanned as PNG files), which were typed on a mechanical typewriter in Czech, and OCRFeeder 0.7.9 did not recognize the text at all. So I hope that 0.7.11 will work better at recognizing scans of text typed on typewriters.
(Sorry, my English is not good.)
I hope this new version will allow you to convert your grandpa’s memories. That’s a very interesting use case I would have never thought of so you just made my day 🙂
Hi Joaquim, I want to thank you for this awesome software. It has helped me out several times; it’s so, so useful.
I would like to ask you when OCRFeeder 0.7.11 will be in the Ubuntu 12.04 repositories. Is this something you have to do, or does it not depend on you? Anyway, I would love to use this version on my system, so if you could show us how to install it, I would be very pleased.
Thank you again for keeping this amazing software up to date 😉
Oh, I followed the steps in the INSTALL file inside the .tar.xz file and, after installing some dependencies, I could finally get version 0.7.11 on Ubuntu 12.04 without any problem.
Thank you! 😀
I am sorry I couldn’t reply to you in time but I am glad you could figure it out for yourself.
I am not responsible for OCRFeeder’s package on Ubuntu so I don’t know how often it is updated.
This version is a real improvement, with an ODT export that really works (I was a little bit disappointed with the previous one). I had been waiting for a good OCR tool in GNOME for years; it now exists, thank you for that. In some tests with my PDF reader (Evince), words seem to be located randomly in an indexed page. Did you encounter problems with the DjVu open format that I think I saw in the previous version?
Again, thank you!
I’m glad you like this new version. I never had djvu support though.
Is that something you think is of general interest?
The DjVu format is really more efficient than PDF in terms of the ratio of picture quality to document size. It has the hidden text functionality users look for in the results produced by a good OCR. That’s the reason why you can find it in ABBYY’s products and gscan2pdf, for example. So it’s a good second choice to implement after PDF, which is more widely used.
A little suggestion: an action to close the current project is missing. Using it would present the same clean page as when you start OCRFeeder, ready to create a new page in a new project.
(Sorry for my poor English.)
I think that’s a reasonable thing to have. I will do it for the next version.
Hi Joaquim! Thanks for all your hard work on this program. I would like to try it, but before I do I want to be sure that it does what I think it does…
I am in need of a smarter OCR that will be able to differentiate between line breaks and paragraph breaks. I have been using the OCR feature on Acrobat which, so far, seems to be the best, but leaves hard returns at the end of every line and does not differentiate between paragraphs and line endings.
Will your “fix line breaks and hyphenization” feature be able to recognize the difference between the line endings and paragraph endings?
I am excited about using this as an opportunity to experiment with Linux, but would rather know ahead of time if this has the feature I am looking for before I dive into all that! Thank you much!
Hi Adrienne, I am not sure the option you refer to will do what you need. It doesn’t treat paragraph breaks specially; it just looks for a newline which doesn’t have another newline before it and removes it, so you should give it a try. Sorry my answer couldn’t be very helpful…
Joaquim, Thanks for getting back to me! I will definitely give it a go when I have some time to play around. Keep up the good work!
Traceback (most recent call last):
  File "/usr/bin/ocrfeeder", line 31, in <module>
    from ocrfeeder.studio.studioBuilder import Studio
  File "/usr/lib/python2.7/site-packages/ocrfeeder/studio/studioBuilder.py", line 21, in <module>
    from ocrfeeder.util import lib
  File "/usr/lib/python2.7/site-packages/ocrfeeder/util/lib.py", line 31, in <module>
    from lxml import etree
ImportError: No module named lxml
With 0.7.11-1.1.noarch on F18, I had to install python-lxml. I suggest you add it to the deps in the spec file. Anyway, thanks for OCRFeeder! (Disregard this comment on the 0.7.10 blog post; it should have been placed here. Mea culpa.)
Thanks for a great program. Couple of questions/suggestions:
– what can I do (or what can be done) about decreasing OCRFeeder image loading time, before I start the actual OCR? Should I somehow pre-process the image and what image format is preferable? E.g. I have some scanned legal documents (jpg, 1,3 MB, ~2400×3500) to OCR, and it takes a while to process them internally.
– Also, while loading multiple images (like the above ones) at once into OCRFeeder (by drag&drop from a file manager), I get multiple notification windows, one for each of these images. There could be just one.
– on my kubuntu netbook (screen resolution is 1024×600) I need to force kwin to cut OCRFeeder to 808×600 from 808×628; I can’t resize it manually to fit in my screen. As a result, the above mentioned windows will get just as huge, covering the main window.
– Would it be possible to have the window section containing OCR results configurable to show the OCR results from the entire image and not just a block of text?
– also, in terms of usability, I’d like to have the option to view just the output text in the right section of the OCRFeeder window, without viewing the details of the image (the window’s middle section would suffice).
Thank you and best regards,
1) You can possibly reduce the images’ sizes without harming the recognition too much?
2) I don’t know which notifications you’re talking about; I’ll be happy to take a look at it if you send me a screenshot by email.
3) I am sorry about the window’s size. I have a 1440×900 laptop and I always think nobody uses <=1024 anymore... I can review that in the future.
4) I am thinking of having an option to show the full text in a different window; this would be activated by a menu/shortcut though.
5) I can add a "collapser" to hide the image, which would save vertical space.
Cheers,
Hello Joaquim, thanks for answering!
1) I’d rather not decrease the image quality, due to the need to avoid or reduce my manual “post-processing”. Is the initial “preparation of image” necessary at all? Or, could there be a list of image formats which would require no initial “preparing”?
2) I’m referring to the windows popping up after I drop an image on the OCRFeeder main window: in Polish it’s “Przygotowywanie obrazu; Proszę czekać…” (“Preparing the image; Please wait” – I’ll try to send a screenshot);
3) 1024×600 is the max screen resolution on my HP 210 Mini; I’ve been using a larger laptop some time ago, before my successive bags & my palms began to revolt. I need to move constantly during my working day.
4) IMO having a tab showing the output text and the formatting/correction tools would be just as good.
many thanks for your hard work.
I found a bug, here is the report: https://bugs.launchpad.net/ubuntu/+source/ocrfeeder/+bug/1208789
If you use the automatic unpaper, OCRFeeder will hang after preparing the document for OCR. It would be very nice if you could have a look… It would be worth a tip or something to me.