A command-line tool written in Python that reads a PDF or ZIP file and outputs a text file using the Tesseract OCR engine. Given an appropriate alias you can run ... Input and output OCR samples are available at ...
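As a rough illustration of the ZIP path only, here is a minimal sketch, not the tool's actual code. It assumes the archive contains page images and that the `tesseract` binary is on `PATH`. The names `ocr_zip`, `tesseract_cli`, and `IMAGE_SUFFIXES` are hypothetical. PDF input is left out because it would first need its pages rendered to images.

```python
import subprocess
import tempfile
import zipfile
from pathlib import Path

# Assumption: these are the image types worth sending to OCR.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def tesseract_cli(image_path):
    """OCR one image with the tesseract binary.

    Passing "stdout" as the output base makes tesseract print the text
    instead of writing <base>.txt.
    """
    result = subprocess.run(
        ["tesseract", str(image_path), "stdout"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def ocr_zip(zip_path, out_path, ocr=tesseract_cli):
    """Extract images from a ZIP, OCR them in path order, write one text file.

    `ocr` is any callable mapping an image path to text, so the engine
    can be swapped out (or faked in tests). Returns the number of pages.
    """
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        images = sorted(
            p for p in Path(tmp).rglob("*")
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
        pages = [ocr(p) for p in images]
    Path(out_path).write_text("\n".join(pages), encoding="utf-8")
    return len(pages)
```

Calling `ocr_zip("scans.zip", "scans.txt")` would then produce a single text file. Both file names here are placeholders.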
For txt, I kept the format similar to the msg tool: one txt file per language. For csv, I put all the languages into one file, along with the msg entry name, its GUID, and attributes. I think this ...
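The two layouts can be sketched roughly as follows. This is an illustration under assumptions, not the tool's real code: the entry shape (a dict with `name`, `guid`, `attributes`, and a per-language `text` map), the column order, and the function names `export_txt` and `export_csv` are all hypothetical.

```python
import csv
from pathlib import Path


def _languages(entries):
    """Collect every language code that appears in any entry, sorted."""
    return sorted({lang for e in entries for lang in e["text"]})


def export_txt(entries, out_dir):
    """Write one txt file per language (e.g. en.txt), one entry per line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    langs = _languages(entries)
    for lang in langs:
        lines = [e["text"].get(lang, "") for e in entries]
        (out_dir / f"{lang}.txt").write_text(
            "\n".join(lines) + "\n", encoding="utf-8"
        )
    return langs


def export_csv(entries, csv_path):
    """Write a single CSV: name, guid, attributes, then one column per language."""
    langs = _languages(entries)
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "guid", "attributes", *langs])
        for e in entries:
            row = [e["name"], e["guid"], e["attributes"]]
            row += [e["text"].get(lang, "") for lang in langs]
            writer.writerow(row)
    return langs


# Hypothetical sample data, only to show the expected shape.
SAMPLE_ENTRIES = [
    {"name": "Menu_Start", "guid": "guid-0001", "attributes": "",
     "text": {"en": "Start", "ja": "スタート"}},
    {"name": "Menu_Quit", "guid": "guid-0002", "attributes": "",
     "text": {"en": "Quit", "ja": "終了"}},
]
```

The CSV keeps the translations of an entry side by side on one row, while the txt export keeps each language in its own file, which is the difference the paragraph above describes.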