The week that was…

Sprint 0 –

Task – To compile Tesseract 3.01 core along with training and OSD libraries, and get it running on the PC.

Progress –

After removing some trivial errors from the VS2010 solution, I hit a roadblock with three link errors, of types LNK2001 and LNK2019 to be precise. Trying to debug them led me to a dead end: according to Tesseract’s documentation, all the references seemed to be in place. The sad part was that this suggested something might be wrong with the code itself, and I was nowhere close to thrilled at the idea of opening up a 20-project solution to find a needle in a haystack. But as luck would have it (after 4 days of trying to understand and modify the actual code), I finally found the problem.

The problem was with the versioning of the repository I had downloaded. Some of the files came from a later bug-fix release, whereas others were from an earlier version. So I had to do my own version control of sorts: download the projects that were out of date and reference them against the existing files. I got help online from the GitHub repository of a guy named ‘tinkku’, who had run into a similar problem and had written some bug fixes for the VS version and the configure files. This made my job a lot easier. So finally, after some initial struggle, the build succeeded and all the libraries started to work!

The next task was to build the DLLs and check that they worked, which proved to be relatively easy compared to the previous task.

My last 2 days in this sprint were spent trying to set up the Android NDK and get Tesseract built for Android using a Cygwin shell on my PC (thanks to Salman for this!). After 2 days of struggling with the NDK, it finally built, and Tesseract’s Android libraries were up and running! The next roadblock seems to be making an app out of it. The Android SDK is giving me some very unexpected issues, hopefully caused by just a trivial error. For the next sprint week: get the app up and running on Android with our basic UI!

 

And…Kick off!

Our final year project is finally here, and life can’t get any more interesting (and strenuous)! With a team of visionaries comprising Varun, Salman, Aranab and myself, we embark on an epic quest, the goal being to develop not only an app but a complete framework in the world of Optical Character Recognition (OCR). With very little prior experience in this field, we are setting out on a long, arduous journey to build a product that is intuitive and actually usable in real life, and not just another FYP.

I’m Aravindh, a final year Engineering student from the National University of Singapore. The primary reason behind this blog is to log our problems and success stories along our development road map, to make the path less stony for anyone passing the same way. We have decided to go with the open source software Tesseract 3.01 as our OCR engine and build our own UI over it. We are planning to build a WPF app for Windows PCs, an Android app for the phone, and cloud storage backed by a website to sync between devices and back up the data.

The app would be designed to take photos or accept uploaded pictures of contact cards and documents, which would then be passed through the OCR engine and sent to the appropriate storage: a contact card would go to the contacts on the phone, a document would be made editable, and so on. The result would then be synced to the cloud and to other devices where the user has an active account. The cloud would make it possible for users to access their data from any machine, irrespective of whether it runs our app or not. Further plans include porting to the Windows 8 ecosystem, improving the OCR core to recognize magazines, books, etc., and porting our app to other platforms like iOS and WP8.
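To make the routing step a little more concrete, here is a minimal Python sketch of how recognized text might be classified and sent to a destination. It is only an illustration under stated assumptions, not the project's actual design: the function names (`classify`, `route`), the destination labels, and the "short text with an email or phone number" heuristic are all hypothetical, and the OCR step itself is left out, so the functions take text that has already been recognized.

```python
import re

# Hypothetical heuristic: a contact card is a short piece of text that
# contains an email address or something that looks like a phone number.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{6,}\d")
MAX_CARD_LINES = 8  # assumed cut-off, chosen only for illustration


def classify(ocr_text: str) -> str:
    """Guess whether recognized text is a contact card or a document."""
    lines = [ln for ln in ocr_text.splitlines() if ln.strip()]
    has_contact_info = bool(
        EMAIL_RE.search(ocr_text) or PHONE_RE.search(ocr_text)
    )
    if has_contact_info and len(lines) <= MAX_CARD_LINES:
        return "contact_card"
    return "document"


def route(ocr_text: str) -> dict:
    """Pick a (hypothetical) storage destination for the recognized text."""
    kind = classify(ocr_text)
    if kind == "contact_card":
        destination = "phone_contacts"
    else:
        destination = "editable_documents"
    # In the plan described above, everything is also synced to the cloud.
    return {"kind": kind, "destination": destination, "sync_to_cloud": True}


if __name__ == "__main__":
    card = "Jane Doe\nAcme Pte Ltd\njane.doe@example.com\n+65 6123 4567"
    print(route(card))
```

A real classifier would need to be far more robust than this (OCR output is noisy, and long documents can contain phone numbers too), but the split between "recognize", "classify" and "route" is the part the sketch is meant to show.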

Cheers to us and to anyone out there trying similar stuff! Let us try and change the way computers interact with pictures!

– Aravindh