Grilo is getting really interesting, and one of its nicest new features is the D-Bus interface Juan has been working on lately.
This D-Bus interface is currently known as Rygel-Grilo (it was originally intended to be a source for Rygel) and uses the MediaServerSpec to let developers retrieve the media objects Grilo provides.
Since there are still no Python bindings for Grilo, I decided to use Rygel-Grilo to access Grilo from Python.
So I developed a Rhythmbox plugin that shows every MediaServer1 object available and lets the user browse through their contents. Needless to say, although this plugin covers only a very basic and generic usage, it’s easy to see how applications like Rhythmbox could use Grilo to get their media. The philosophy is: Grilo gives you content, GStreamer plays that content, and you’re free to focus on the rest of your app’s details.
Here’s a video of Rygel-Grilo and the Rhythmbox MediaServer1 plugin in action:
Juan also developed a cool plugin for Totem similar to this one. Take a look at this post to see the plugin working and a more detailed explanation of what Rygel-Grilo is.
Vimeo is one of the main video sharing sites on the web and I thought it would be useful to develop a Grilo plugin to search for videos on it.
Yesterday Juan committed the code, which means you should now be able to easily search for videos on Vimeo and watch them on your desktop using Grilo’s test UI. Here are a couple of screenshots:
I really like the way Grilo is going. Together with GStreamer, the effort needed to create a media player with sources such as your hard drive, YouTube, Vimeo, Flickr, etc. is minimal.
I was disappointed with the text completion provided by the N900 (eZiText), which, on top of that, is closed, and I wondered whether it was possible to have an Open Source solution for text prediction and completion.
I searched a bit and, although my original intention was to develop a library that looks up words by prefix in Free and Open Source dictionaries, I found Presage.
Presage is better than most text prediction systems I have seen out there because it really is text prediction, not text completion. This C++ library retrieves words taking into account the surrounding text, not only the prefix or the frequency of words. It uses a database representing N-grams that can be trained with more text; the more you train it, the more accurate it can be.
This means that if you type something like: “I m”
instead of suggesting nonsense like: “I mouse” “I mother” “I market” or “I more”
it suggests something more like: “I must” “I met” “I mean” or “I might”
The difference is obvious!
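To make the difference concrete, here is a toy sketch in Python. It is not how Presage is implemented and does not use its API: it only counts bigrams (pairs of consecutive words) in a tiny training text I made up for the illustration, and every name in it (TRAINING_TEXT, train_bigrams, complete_by_prefix, predict_in_context) is hypothetical. It compares plain prefix completion, which only looks at word frequency, with a prediction that also looks at the previous word.

```python
from collections import Counter, defaultdict

# Made-up training text, only for this illustration.
TRAINING_TEXT = (
    "i must go now i might stay here i mean it i met her "
    "the mouse ran my mother called the market opened "
    "we want more mouse more market more mother"
)


def train_bigrams(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for previous, current in zip(words, words[1:]):
        following[previous][current] += 1
    return following


def complete_by_prefix(word_counts, prefix, limit=4):
    """Context-free completion: the most frequent words starting with prefix."""
    matches = [(w, c) for w, c in word_counts.items() if w.startswith(prefix)]
    # Most frequent first; ties are broken alphabetically.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in matches[:limit]]


def predict_in_context(following, previous_word, prefix, limit=4):
    """Context-aware prediction: only rank words seen after previous_word."""
    candidates = following.get(previous_word.lower(), Counter())
    return complete_by_prefix(candidates, prefix, limit)


vocabulary = Counter(TRAINING_TEXT.split())
following = train_bigrams(TRAINING_TEXT)

print(complete_by_prefix(vocabulary, "m"))      # frequency only
print(predict_in_context(following, "I", "m"))  # takes "I" into account
```

With this training text the first call can only offer the most frequent words starting with “m”, while the second one is restricted to words that were actually seen after “I”. A real N-gram model looks at longer contexts and handles word sequences it has never seen, which this sketch does not attempt.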
So I developed a little wrapper around Presage in C that provides a still very basic API to get text completion. Then I created a GTK+ Input Method context to control the user’s input in regular GTK+ text widgets and used the wrapper to process the text typed. I called it: Predictor Input Method (not very original, I know…).
The result is that Predictor suggests words whether or not you have typed a prefix, and lets you accept the candidate word or scroll through a list of suggestions, as you can see in the video below:
Ctrl+Enter -> Selects the current candidate
Ctrl+Up/Down -> Scrolls through the list of candidates
Backspace -> Deletes the character previous to the cursor and suggests again
Directional arrows -> Move cursor and discard suggestions
Who should use it
This kind of assistive technology can have many applications, but the main ones are use on small/mobile devices and assistance for users with disabilities. Both share the same reasons: input gets faster and fewer characters are mistyped, because the amount of input required is minimized.
Of course, you can also use it regularly on your GNOME desktop to type your emails faster, etc.
In the case of users with disabilities, a popup menu could be added to show a complete list of candidates and the bound fast-access keys.
Why is Free Software important in this
This is the kind of technology for which everybody should be interested in a FOSS solution, because of the obvious advantage that developers from all over the world are able to modify it.
Suppose you’re creating a mobile phone and you choose a closed solution to provide text prediction for it. Then you find out you’re disappointing all your users from country X because the library you’re paying for does not support their language and the library owner is not that interested in adding it. If you’re using an open solution instead, local communities from many places in the world can add support for their languages and your phone can gain better acceptance in places you hadn’t even imagined.
Software that reaches an international audience with different languages is software you want to have open.
If you are not using GTK+ Input Methods, you can use the wrapper text-predictor.cpp, which is not tied to the Input Method code itself. And of course, you can copy the little tricks used in the Input Method code and apply them to your source (like delaying the retrieval of the candidates by a fraction of a second so it does not block the input, etc.).
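The delaying trick mentioned above can be sketched as follows. This is only an illustration in Python with a plain thread timer, not the Input Method’s actual code; the class name DebouncedLookup and its methods are hypothetical. Every new keystroke cancels the pending lookup and schedules a fresh one, so candidates are only retrieved once typing has paused.

```python
import threading


class DebouncedLookup:
    """Run lookup(text) only after typing has paused for `delay` seconds."""

    def __init__(self, lookup, delay=0.2):
        self._lookup = lookup
        self._delay = delay
        self._timer = None

    def text_changed(self, text):
        # A new keystroke makes any pending lookup stale, so cancel it.
        if self._timer is not None:
            self._timer.cancel()
        # Schedule the lookup for the latest text only.
        self._timer = threading.Timer(self._delay, self._lookup, args=(text,))
        self._timer.daemon = True
        self._timer.start()

    def wait(self):
        """Block until the pending lookup (if any) has finished."""
        if self._timer is not None:
            self._timer.join()


if __name__ == "__main__":
    debounced = DebouncedLookup(lambda text: print("looking up:", repr(text)))
    for typed in ["I", "I ", "I m"]:
        debounced.text_changed(typed)  # three quick keystrokes
    debounced.wait()  # only the last text triggers a lookup
```

In a GTK+ application the same idea would more naturally be built on a main loop timeout (for example g_timeout_add) than on a thread, so that the callback runs in the UI thread.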
It’s been a while since I wrote my last post but I guess this one will compensate.
When I posted about how I made OCRFeeder run in Fremantle, I said I wasn’t thinking of porting the application, but in later talks with some people it became clear that OCRFeeder might come in handy.
One of the use cases we talked about was creating a contact in the address book by recognizing the contact fields from a business card.
So, for some days in these last weeks, I’ve been porting OCRFeeder to Fremantle!
(The card-to-contact feature is still to come, as I wanted to have OCRFeeder “fremantelized” first.)
New Repository
I had been using git-svn to develop OCRFeeder and, while this was okay when there was just one branch (trunk), with the Maemo version it became clear that Google Code’s SVN repository wasn’t enough. (Yes, I know they have Mercurial, but I’m a git user.)
So, yesterday I relocated OCRFeeder’s development to Gitorious where you’ll find the branch “maemo” besides the “master” one: http://gitorious.org/ocrfeeder
Development Notes
I must say that although I had used PyGTK for my UI code for a long time, on Hildon I am more experienced in using C. While in theory this is the same, in practice the PyMaemo bindings had some issues that delayed the development a bit (mainly undocumented functions that differ from the direct and expected usage, as well as some bugs I found).
I must thank Lizardo and other PyMaemo folks who were kind enough to help me every time I bugged them with questions and suggestions.
I think OCRFeeder for Maemo represents another example of how a desktop-targeted application can be ported to Fremantle, especially from the design point of view. The chats I had with my friend and colleague Felipe (who, by the way, has just become a Master’s degree student in User-Centered Interactive Technologies) surely helped in this matter.
Trying OCRFeeder for Maemo
Now you can try OCRFeeder, but you’ll first have to compile and install pygoocanvas and Tesseract or another OCR engine, as I wrote here. I hope I have time to create deb packages for both pygoocanvas and Tesseract, as they’re also very useful to have.
As a final note, I must say that although everything was working fine on Maemo 5.0 SDK beta 2, today the final SDK was released and I tested OCRFeeder on it… and not everything works as well as before. The problems are mainly related to GtkTreeViews (Hildon style) which, from the C side, seem to be working okay, but from the PyMaemo side seem not to obey the selection mode I assign to them.