OCRFeeder goes public!

Finally, I have made the initial commit of my new project, OCRFeeder, to a public SVN repository.

OCRFeeder is an Optical Character Recognition and Document Analysis and Recognition program for GNU/Linux.
It features a complete graphical user interface in GTK but can also be used from the command line for automation purposes.

It is written in Python and was developed as the project for my Master’s Thesis in Computer Science Engineering.

So go on and check out the project’s source:

  svn checkout http://ocrfeeder.googlecode.com/svn/trunk/ ocrfeeder-read-only

Note that this is only an SVN release for now, so that I can get some feedback and the traditional first bug reports.
You can also be part of this project as a developer or a translator; just drop me an email.

I hope this is a good step in the evolution of OCR technologies on GNU/Linux systems.

Soon I’ll add a list of features here, as well as a screencast.


My Master’s Thesis

It’s been almost a month since my last post… so let’s catch up a little.

On February 19th I drove down from Galicia to Portugal; it was quite a boring trip of more than 7 hours. Luckily I had my girlfriend right by my side, and the iPod’s battery lived up to its reputation and provided the soundtrack for the whole trip.

I went to Portugal because on the next day, February 20th, I finally presented my Master’s Thesis in Computer Science Engineering!
Yeah! A little more than a year after I went to Seville, and about 8 months after returning to Portugal, I finally presented it and completed my Master of Science degree.

The thesis was about the development of an OCR suite for GNU/Linux, based on some ideas I had before. I started developing it when I returned from Seville and finished it in October (I was lucky that the deadlines got extended, so I didn’t need to deliver it before September). Then it took me until mid-December to finish writing the thesis (final tests of the program included), and I delivered it on December 15th. Thanks to the bureaucratic services at my university, the earliest the thesis presentation could be arranged was the aforementioned February 20th… But hey! Now it is done!

As for the OCR program: it is written in Python, features a GUI powered by PyGTK, and can use several open source OCR engines to perform the recognition. It lets the user correct and edit the results, among other things, and generates ODT or HTML files. You can also use it from the CLI in case you want to automate some tasks or link it with other apps.
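To give an idea of what "using several OCR engines" can look like in Python, here is a minimal sketch. It is not the program's actual code: the `OcrEngine` class, the `$IMAGE` placeholder, and the `some-ocr-engine` executable are all made up for illustration, and it assumes the engine prints the recognized text to standard output.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class OcrEngine:
    """Hypothetical description of an external OCR engine (illustration only)."""
    name: str
    command_template: str  # assumed to contain the $IMAGE placeholder

    def build_command(self, image_path):
        # Split the template first, then substitute, so that an image
        # path containing spaces stays a single argument.
        return [part.replace("$IMAGE", image_path)
                for part in shlex.split(self.command_template)]

    def recognize(self, image_path):
        # Assumption: the engine writes the recognized text to stdout.
        result = subprocess.run(self.build_command(image_path),
                                capture_output=True, text=True, check=True)
        return result.stdout


# Placeholder configuration; swap in a real engine and its arguments.
engine = OcrEngine(name="example", command_template="some-ocr-engine $IMAGE")
print(engine.build_command("scanned page.png"))
```

Describing each engine as a command template means a new engine can be added as configuration rather than code, which is one plausible way to support several of them side by side.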

I will be releasing the program soon under the GPL, so stay tuned.

I’d really like to thank all the people who have supported me all along and keep supporting me:
Mom, Dad, Bro, Girlfriend, Professor Luís Arriaga, and friends such as Luís Rodrigues and Pedro Salgueiro.

PS: My absence from the web outside of work is due to the fact that I’ve been internetless since I came to A Coruña. *Hopefully*, next week the ISP I chose will turn on the switch of information in my flat and I’ll be connected to the world once again. Then I’ll post about what’s happened in my world of GTK, Igalia and Django.