The OCR software includes full PDF support (powered by Ghostscript). The content of the source file is displayed in the left window. If your document has more than one page, use the arrows at the bottom to navigate between the pages. Start the OCR by clicking the green Start Ocr button, and the result will appear in the right window. Output text can be saved as a text file or Word document.

Unfortunately, the conversion quality is not great. Behind the scenes the tool uses the Tesseract open-source OCR engine. Its quality varies from language to language, so go ahead and test whether it is sufficient for your needs.

Tesseract's output will be of very poor quality if the input images are not preprocessed to suit it:
- Images (especially screenshots) must be scaled up so that the text height is at least 20 pixels.
- Any rotation or skew must be corrected, or no text will be recognized.
- Dark borders must be manually removed, or they will be misinterpreted as characters.

Still need better text recognition results? Then try these new alternatives:
1. Online OCR - our free web-based OCR app.
2. OCR API - our free web API, includes OCR command line examples with cURL.
3. Windows 8 OCR software - our free, open-source (GPL) Windows Store OCR app.

Both new services use a different OCR component and have much better text recognition rates than the Tesseract-based OCR desktop software on this page.

The (a9t9) Free OCR for Windows Desktop tool is a graphical user interface (GUI) front end for the Tesseract engine. It is written in C#/WPF, and the full source code is available as a ready-to-compile Microsoft Visual Studio 2013 project on GitHub under the GPL V2 open source license. Feedback of all kinds is welcome, especially ideas on how to improve the OCR quality.
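
Two of the preprocessing requirements mentioned on this page (text at least 20 pixels high, no dark borders) can be illustrated with a small sketch. This is not how the Free OCR tool itself works; it is a minimal pure-Python illustration that assumes a grayscale image stored as a list of rows with values from 0 (black) to 255 (white). The function names and the `dark_threshold` value of 64 are hypothetical choices made for this example, and deskewing is not covered. A real workflow would more likely use an imaging library with proper interpolation.

```python
import math


def scale_factor_for_text(text_height_px, min_height_px=20):
    """Smallest integer factor that brings the text height to min_height_px or more."""
    if text_height_px <= 0:
        raise ValueError("text_height_px must be positive")
    return max(1, math.ceil(min_height_px / text_height_px))


def upscale_nearest(pixels, factor):
    """Nearest-neighbour upscale of a grayscale image given as a list of rows."""
    scaled = []
    for row in pixels:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide_row = [value for value in row for _ in range(factor)]
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


def crop_dark_border(pixels, dark_threshold=64):
    """Strip edge rows and columns whose pixels are all darker than dark_threshold.

    dark_threshold is an assumed value for this sketch, not a Tesseract setting.
    """
    def is_dark(line):
        return all(value < dark_threshold for value in line)

    top, bottom = 0, len(pixels)
    while top < bottom and is_dark(pixels[top]):
        top += 1
    while bottom > top and is_dark(pixels[bottom - 1]):
        bottom -= 1
    rows = pixels[top:bottom]
    if not rows:
        return []  # the whole image was dark

    left, right = 0, len(rows[0])
    while left < right and is_dark([row[left] for row in rows]):
        left += 1
    while right > left and is_dark([row[right - 1] for row in rows]):
        right -= 1
    return [row[left:right] for row in rows]


if __name__ == "__main__":
    # Text that is 8 px high needs a factor of 3 (24 px) to reach the 20 px minimum.
    print(scale_factor_for_text(8))
```

Cropping before scaling keeps the amount of pixel data small, and an integer scale factor avoids blurring the character edges when nearest-neighbour scaling is used.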