OCRFeeder 0.7.6 and DesktopSummit 2011

Just in time for the Desktop Summit 2011, I’ve released the 0.7.6 version of OCRFeeder.

The most interesting new feature in this version is that OCRFeeder can now export to PDF. When exporting pages to PDF, users have two choices: “a PDF from scratch” or “a searchable PDF”. With a PDF from scratch, the text part of the exported content is written into the PDF using ReportLab, whereas a searchable PDF shows the whole original picture with invisible text overlaid to make it searchable.
The PDF export still needs some polishing, but I wanted to get it out there as soon as possible for the people who need it.
Check out these examples:

[Screenshot: page loaded in OCRFeeder and recognized automatically]

[Screenshot: exported PDF from scratch]

[Screenshot: exported searchable PDF with selected text]
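
For the curious, here is a rough sketch of how the searchable PDF idea can work: the scanned picture fills the page and each recognized word is drawn on top of it in the PDF's invisible text rendering mode, so it can be selected and searched but not seen. This is not OCRFeeder's actual code. It assumes ReportLab is used for this mode too, and the function names, the `(text, box)` word format and the DPI handling are all made up for illustration.

```python
def pixel_box_to_pdf_points(box, image_height_px, dpi):
    """Convert a (left, top, right, bottom) pixel box to PDF points.

    Pixel boxes use a top-left origin, while PDF pages use a bottom-left
    origin, so the vertical axis has to be flipped.
    Returns (x, y, width, height), where (x, y) is the lower-left corner.
    """
    left, top, right, bottom = box
    x = left * 72.0 / dpi  # one PDF point is 1/72 of an inch
    y = (image_height_px - bottom) * 72.0 / dpi
    width = (right - left) * 72.0 / dpi
    height = (bottom - top) * 72.0 / dpi
    return x, y, width, height


def write_searchable_pdf(image_path, image_size_px, dpi, words, output_path):
    """Write a one-page PDF: the picture plus invisible text on top of it.

    `words` is a hypothetical list of (text, box) pairs, where each box is
    (left, top, right, bottom) in pixels of the original picture.
    """
    # ReportLab is a third-party package; it is imported here so the
    # coordinate helper above can be used without it.
    from reportlab.pdfgen import canvas

    width_px, height_px = image_size_px
    page_size = (width_px * 72.0 / dpi, height_px * 72.0 / dpi)

    pdf = canvas.Canvas(output_path, pagesize=page_size)
    # The original picture covers the whole page.
    pdf.drawImage(image_path, 0, 0, width=page_size[0], height=page_size[1])

    for text, box in words:
        x, y, _width, height = pixel_box_to_pdf_points(box, height_px, dpi)
        text_object = pdf.beginText(x, y)
        # Assumption: the box height is a good enough guess for the font size.
        text_object.setFont("Helvetica", max(height, 1))
        text_object.setTextRenderMode(3)  # 3 = invisible text
        text_object.textOut(text)
        pdf.drawText(text_object)

    pdf.showPage()
    pdf.save()
```

Calling `write_searchable_pdf("page.png", (2550, 3300), 300, [("Hello", (300, 150, 600, 300))], "page.pdf")` would, under these assumptions, produce a letter-sized page whose word “Hello” can be found with a PDF viewer's search even though only the picture is visible.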

This version also fixes issues with recognizing grayscale pictures, as well as a bug that changed the mouse cursor when it was over a page’s right margin.

I’ve also added separators to divide the Document’s submenus so they are grouped correctly and I’ve made ODT the first choice in the list of exportation formats, which had been mistakenly changed.

As usual, the incredible team of translators is doing a great job and apart from the updated translations, OCRFeeder now comes in Catalan (with the Valencian option as well) and in Greek.

DesktopSummit

No, once again, OCRFeeder’s talk wasn’t approved by the Desktop Summit’s organization. Considering that I’ve presented it at some well-known conferences (LinuxTag, GUADEC ES and twice at FOSDEM), it makes me a bit sad that I haven’t yet been able to present this unique project at the conference of the desktop it targets, but let’s hope it makes it next year.

Still, Igalia is sponsoring me again to attend the Desktop Summit, so if you’re interested in OCRFeeder or other projects I’m involved in, let me know!

See you in Berlin!

5 thoughts on “OCRFeeder 0.7.6 and DesktopSummit 2011”

  1. Since tesseract, gocr, python, pygtk, python enchant, pil, pygoocanvas, and ghostscript all work well on Windows, how difficult do you think it would be to make Windows builds? Is unpaper the main obstacle?

  2. Hey Dan,

    I don’t know how much work it would take to get it running on Windows. I don’t run Windows on my computer, so if you can check this yourself, I’m okay with integrating patches that make OCRFeeder more platform-neutral so that it’s easier to use on other OSes.

    Cheers,

  3. I just want to say thank you for this great program and the new version. It works great for me!
    Thank you very much

  4. Thanks for the great open source tool!
    I’ve been trying to export to PDF as mentioned in this blog entry, but I can’t find the PDF output command. All I found is “Save As” with .pdf specified as the extension (which apparently doesn’t create a PDF), or “Export” (both under “File”), where the only export options are HTML, ODT and Plain Text. So even though the text is recognized in OCRFeeder, I can’t generate a PDF file with the text embedded. I’m using Ubuntu 11.10 and OCRFeeder 0.7.5.
    Any idea? Thanks again!