OCRFeeder 0.7.11 released

Here is 2013’s first version of OCRFeeder, version 0.7.11.

For this version, a number of bugs were fixed, especially some that were affecting saving and loading projects.
Some small improvements were also made, such as being able to load multiple images at once and to choose the OCR engine from the command-line version of OCRFeeder (using the -e option).

Now for the main feature, I developed something that had been requested by a good number of users: being able to easily choose the language for the OCR engine.
When I developed OCRFeeder, I wanted to make it easy for users to run system-wide OCR engines on the results of the layout analysis that OCRFeeder performs, but I also wanted it to remain powerful. That’s why the engines are configured in a general, abstract way, as if from the command line.
Some OCR engines support setting the language in order to get better recognition and, while users could already set the language of an engine manually using the OCR editor dialog, they wanted to have a nice drop-down list with the languages instead.
This represented a real challenge: to keep the old and flexible configuration and, at the same time, offer a high-level way of choosing the language.

OCRFeeder's new configuration
So here is how it works. There is a new special argument keyword, $LANG, that will be replaced by the value of the new “language argument” field and the currently set language. Since engines support different languages (or none) and call them by different names (e.g. Tesseract expects “por” for Portuguese, others may expect “pt”), there is another new field called “languages”, which should be a map between the ISO 639-1 language code and the name that the engine expects for that language, as shown in the screenshot.
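
To make the substitution more concrete, here is a minimal Python sketch of the idea. It is not OCRFeeder's actual code: the function name, the "$IMAGE out" part of the template and the choice to join the language argument and the language name with a space are all assumptions made for illustration.

```python
import shlex


def expand_engine_arguments(arguments, language_argument, languages, language):
    """Expand the $LANG keyword in an engine's argument template.

    arguments: the engine's argument template (hypothetical example format).
    language_argument: the engine's flag for setting a language, e.g. "-l".
    languages: map from ISO 639-1 codes to the names the engine expects.
    language: the currently set language, as an ISO 639-1 code.
    """
    engine_language = languages.get(language)
    if engine_language is None or not language_argument:
        # The engine does not know this language: drop the keyword entirely.
        replacement = ""
    else:
        # Assumption: flag and language name are separated by a space.
        replacement = f"{language_argument} {engine_language}"
    return shlex.split(arguments.replace("$LANG", replacement))


tesseract_languages = {"en": "eng", "pt": "por"}
print(expand_engine_arguments("$IMAGE out $LANG", "-l", tesseract_languages, "pt"))
```

With this approach an engine that does not list the current language simply runs without any language argument, which keeps the old, flexible configuration working.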

Languages combo
To show the languages, there is a new tab in the areas’ editor called Misc (for lack of a better name for a tab that will hold more stuff in the future) with the languages combo. This combo shows a check mark next to the languages that the currently selected engine recognizes, as seen in the screenshot.

There is also a new setting in the preferences dialog for the default language; the first time the application runs, it is set from the user’s locale.
One thing must be taken into account: even though Tesseract supports an extensive list of languages, users must have the corresponding language packages installed in their distros; otherwise, recognition will of course fail.
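
As an illustration of the locale-based default mentioned above, this is a small Python sketch of how a two-letter language code could be derived from a locale name. The function name and the fallback to "en" are assumptions, not OCRFeeder's real behavior.

```python
def language_from_locale(locale_name, fallback="en"):
    """Return the ISO 639-1 code found in a locale name such as 'pt_PT.UTF-8'.

    Falls back to `fallback` (an assumed default) when the locale has no
    two-letter language part, e.g. for 'C' or an empty string.
    """
    if not locale_name:
        return fallback
    # Strip the encoding (.UTF-8), the territory (_PT) and any modifier (@euro).
    code = locale_name.split(".")[0].split("_")[0].split("@")[0].lower()
    if len(code) == 2 and code.isalpha():
        return code
    return fallback


print(language_from_locale("pt_PT.UTF-8"))
```

The result could then be used as the key in an engine's “languages” map.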

To finish, related to my recent job search, I have spent this week in San Francisco getting to know some people from an exciting start-up and, despite the jet lag, I managed to finish this release, so I can now say that at least part of OCRFeeder was designed and developed in California 😛

Source tarball
Git
Bugzilla

Playing Angry Birds with a Kinect

Recently I had to use OpenCV in a project of the Igalia Interactivity team and I took the chance to code a little demo I had had in mind for a while: playing Angry Birds with a Kinect, using only Free Software.

Here’s the result:


(direct link to video in Vimeo)

How it’s done

The demo uses Skeltrack (the only Free Software skeleton tracking library) to get the positions of the user’s hands. The position of the picked hand is used to move the mouse pointer. This part is the same as the one used in the Skeltrack Desktop Control demo.
Once the hand’s position is known, I calculate an area around that point in the original depth image given by GFreenect and then use OpenCV to get the hand’s contours and their convexity defects. An open palm produces very distinct convexity defects, so when enough of them are counted I assume the user’s palm is open. After this, all that is missing is to tie a detected closed palm to a mouse press and an open palm to a mouse release.
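
The decision logic can be sketched in a few lines of Python. This is not the demo's code: the thresholds and function names are made up for illustration, and the defect depths are assumed to come from something like the depth values OpenCV's convexityDefects reports for the hand's contour.

```python
def is_open_palm(defect_depths, min_depth=20.0, min_defects=3):
    """Guess whether the palm is open from its convexity defect depths.

    Both thresholds are illustrative assumptions: the gaps between spread
    fingers give several deep defects, while a fist gives few shallow ones.
    """
    deep_defects = [depth for depth in defect_depths if depth >= min_depth]
    return len(deep_defects) >= min_defects


def mouse_events(palm_open_per_frame):
    """Turn a per-frame open/closed sequence into press and release events."""
    events = []
    pressed = False
    for palm_open in palm_open_per_frame:
        if not palm_open and not pressed:
            events.append("press")    # the palm has just closed
            pressed = True
        elif palm_open and pressed:
            events.append("release")  # the palm has just opened again
            pressed = False
    return events


# Four frames: open hand, fist, fist, open hand.
frames = [[30, 28, 35, 31], [2, 1], [3], [29, 33, 40]]
states = [is_open_palm(depths) for depths in frames]
print(mouse_events(states))  # ['press', 'release']
```

Emitting events only on transitions avoids sending a new press on every frame while the hand stays closed, which is what makes dragging the slingshot possible.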

As the video shows, the demo is not polished yet, but it shows one of the many possibilities that combining Skeltrack with other computer vision software gives us.

Skeltrack 0.1.10 is out

That’s right, a new version of the world’s first Free Software skeleton tracking library is out.
In every version we try to make Skeltrack more robust and this one is no exception.

Head&Shoulders

We have changed the way the shoulders are inferred. The heuristic now uses a circle around the user’s head and an arc along which it searches for the shoulders.
Since we like to keep giving developers the ability to tweak the algorithm’s parameters, we had to change the properties related to the shoulders. We should probably improve the documentation with a visual explanation of how those properties work but meanwhile you can check the properties’ documentation.

Centering Joints

Another issue we had was that the extrema we initially calculate end up at, e.g., the point at the tip of a finger (for a hand joint) or the top of the head. This was not a problem in itself, but it could result in more unstable joints. For example, the Kinect device in particular might give blind spots in very bushy hair, which would make the head joint jitter more than usual.
To fix this, we calculate the average of the points around an extremum and assign that value to it. The radius of the sphere surrounding an extremum that is used to calculate this average can be controlled with the extrema-sphere-radius property. If this behavior is not desired, the feature can be turned off simply by assigning 0 to this property.
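
The averaging idea can be sketched in Python as follows. This only illustrates the technique and is not Skeltrack's implementation (which is written in C): the function name is made up, and whether the extremum itself counts towards the average is an assumption.

```python
import math


def average_extremum(extremum, points, radius):
    """Replace an extremum by the average of the points inside a sphere.

    extremum: (x, y, z) of the originally calculated extremum.
    points: iterable of (x, y, z) points from the depth data.
    radius: sphere radius; 0 turns the averaging off.
    """
    if radius <= 0:
        return extremum
    nearby = [p for p in points if math.dist(p, extremum) <= radius]
    if not nearby:
        return extremum
    count = len(nearby)
    # Average each coordinate separately over the nearby points.
    return tuple(sum(axis) / count for axis in zip(*nearby))


points = [(0, 0, 0), (2, 0, 0), (100, 0, 0)]
print(average_extremum((0, 0, 0), points, 5))  # (1.0, 0.0, 0.0)
print(average_extremum((0, 0, 0), points, 0))  # (0, 0, 0)
```

Because the result depends on a neighbourhood rather than a single point, one noisy or missing depth reading moves the joint much less.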

Here are a couple of pictures illustrating this issue:

Picture of Skeltrack's test without averaged extrema

Without the averaged extrema (extrema-sphere-radius set to 0)

Picture of Skeltrack's test with averaged extrema

With the averaged extrema (extrema-sphere-radius set to 300)

Vertical Kinect

Due to a project that Igalia Interactivity has been working on, we had to use the Kinect in a vertical stance. By doing this we discovered a small bug that prevented Skeltrack from being used with a vertical depth image. This is fixed in version 0.1.10. While fixing it, we found out that the other skeleton tracking alternatives do not seem to support the Kinect in a vertical stance either; this might mean that if you want to use skeleton tracking with the Kinect vertically, your choices are either to use Skeltrack or to convince Microsoft or PrimeSense to fix their solutions for you 🙂

Picture of Skeltrack's test example using a Kinect in a vertical stance

Skeltrack using a Kinect in a vertical stance

Last but not least, the function skeltrack_skeleton_new was returning a GObject instance by mistake. We have corrected that and it now returns a pointer to SkeltrackSkeleton as expected.

Special thanks to Iago, our intern at the Igalia Interactivity team, for coding most of these nifty features.

Be sure to clone Skeltrack at GitHub and read the docs; you are welcome to participate in its development.

OCRFeeder version 0.7.10

The previous version of OCRFeeder was released in April. I have been busy with Skeltrack and other projects but, between my personal time and Igalia’s precious hackfest time, here we have a new version of the best Free Software OCR application.

For this 0.7.10 version I have improved the way that the document generators (the classes that generate the desired exportation formats) are used inside OCRFeeder. I have abstracted their use making it easy to add new document generators in the future.
The command line version, which had been limited to generating only the original exportation formats (ODT and HTML), also benefits from these changes; from this version on, it is possible to generate documents in any of the existing exportation formats from the command line. For example, to generate a plain text file:

$ ocrfeeder-cli -i scan1.ppm -i scan2.jpeg -f TXT -o text_doc.txt
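
To give an idea of what such an abstraction can look like, here is a minimal Python sketch of a registry of document generators. It is not OCRFeeder's actual code: every class and function name below is hypothetical, and the plain text generator is deliberately trivial.

```python
class DocumentGenerator:
    """Base class for generators (hypothetical interface)."""

    def generate(self, pages):
        raise NotImplementedError


GENERATORS = {}


def register_generator(format_name):
    """Class decorator that makes a generator available under a format name."""
    def decorator(cls):
        GENERATORS[format_name.upper()] = cls
        return cls
    return decorator


@register_generator("TXT")
class PlainTextGenerator(DocumentGenerator):
    def generate(self, pages):
        # Separate the text of each page with a blank line.
        return "\n\n".join(pages)


def export(pages, format_name):
    """Look up the generator for a format and run it, as a -f option might."""
    try:
        generator_class = GENERATORS[format_name.upper()]
    except KeyError:
        raise ValueError(f"Unknown format: {format_name}") from None
    return generator_class().generate(pages)


print(export(["page one", "page two"], "txt"))
```

With a registry like this, the GUI and the command line can share the same list of formats, and adding a new one only takes a new registered class.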

The current PDF exportation still has flaws that will take some time to fix but for now I have fixed a big issue: line wrap. The text lines would not wrap when written in the PDF document and so, long lines would go beyond the pages’ limits. This should be improved with this new version and I hope I have the time in the future to fix the other issues.

Moving (or swapping) pages by dragging them seems to have stopped working. This looks like a PyGTK bug, but in any case it was the excuse I needed to implement actions for selecting and moving the pages using the menu or shortcuts. This makes the mentioned bug less important and also gives visually impaired users an easy way to move pages.

Screenshot of the select or move pages menus

Future

I want to fix some issues in OCRFeeder’s architecture, especially when it comes to the UI part. This should probably be done together with a port to the amazing GObject Introspection.
Jan Losinski, from TU Dresden, was kind enough to send me some patches that make the OCRFeeder’s recognition parallel. This feature needs to be polished but it will likely land in the next version of OCRFeeder.
Last but not least, I need to check how to make it easy to integrate the user’s language in the OCR recognition. I exchanged some emails with the people from the AltLinux distro, who seem to have already implemented this in their repositories, but I need time to try and review their patches.

Contribute

If you want to contribute and make this project better, fear not! The code is all Python and I’m available to help you get started so email me if you’re interested.

Enjoy OCRFeeder 0.7.10!

Source tarball
Git
Bugzilla

Skeltrack 0.1.8 released

Skeltrack, the Open Source library for skeleton tracking, keeps being improved here in Igalia and today we are releasing version 0.1.8.
Since July we have had the valuable extra help of Iago López who is doing an internship in Igalia’s Interactivity Team.

What’s new

Several bugs (including some in the introspection), both in the library and in the supplied example, were fixed.
The threading model was simplified and the skeleton tracking implementation was split into several files to keep the source code better organized.

While the above is nice, the coolest thing about this release (and kudos to Iago for this) is that it makes Skeltrack work better with scenes where the user is not completely alone. The issue was that if there was another person or object in the scene (think chairs, tables, etc. for a real life example), it would confuse the skeleton tracking. From this version on, while it is not perfect (objects/people cannot be touching the user), the algorithm will try to discard objects that are not the user.
But what about having two people in a scene, which one will it choose? To control this, we have introduced a new function:

skeltrack_skeleton_set_focus_point (SkeltrackSkeleton *skeleton,
                                    gint               x,
                                    gint               y,
                                    gint               z)

This function tells Skeltrack to focus on the user closest to this point, which makes it possible to keep the focus on a user in real time by constantly updating the point to, for example, the user’s head position.
So, even if there is no multi-user support, the current API makes it easy to just run other instances of Skeltrack and try to pick users from other points in the scene.
It should also be easier to use Skeltrack for a typical installation where there is a user controlling something in a public space while other people are passing or standing by.
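
The effect of constantly updating the focus point can be illustrated with a short Python sketch. It only mimics the idea of choosing the user closest to a point and is not Skeltrack's algorithm or API: the function names and the sample head positions are invented for the example.

```python
import math


def pick_focused_user(head_positions, focus_point):
    """Return the name of the user whose head is closest to the focus point."""
    return min(
        head_positions,
        key=lambda name: math.dist(head_positions[name], focus_point),
    )


def follow_user(frames, initial_focus):
    """Pick a user in each frame, moving the focus point to that user's head."""
    focus = initial_focus
    picked = []
    for head_positions in frames:
        user = pick_focused_user(head_positions, focus)
        focus = head_positions[user]  # same idea as updating the focus point
        picked.append(user)
    return picked


# User "a" walks towards user "b"; the focus follows "a" frame by frame.
frames = [
    {"a": (100, 0, 2000), "b": (900, 0, 2200)},
    {"a": (400, 0, 2000), "b": (850, 0, 2200)},
    {"a": (700, 0, 2000), "b": (800, 0, 2200)},
]
print(follow_user(frames, (0, 0, 2000)))  # ['a', 'a', 'a']
```

In a real application the head position would come from the tracked skeleton and be passed to skeltrack_skeleton_set_focus_point on each frame.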

Contribute

We will keep betting on this great library.
If you wanna help us, read the docs, check out Skeltrack’s GitHub and send us patches or open issues.