Show openSUSE:Factory:Rebuild / tesseract-ocr

Tesseract Open Source OCR Engine

Tesseract is a free optical character recognition engine originally developed at Hewlett-Packard and currently developed by Google. It is a raw OCR engine: it has no document layout analysis, no output formatting, and no graphical user interface. It only processes a TIFF or BMP image of a single column and creates text from it. It can detect fixed-pitch versus proportional text. The engine was in the top three in terms of character accuracy in 1995. The source code will read a binary, grey or color image and output text.

Tesseract can process English, French, Italian, German, Spanish, Brazilian Portuguese and Dutch, and can be trained to work in other languages as well.

Developed at Publishing
Sources inherited from project openSUSE:Factory
2 derived packages
Derived Packages
Publishing

home:stupidone
Checkout Package
osc -A https://api.opensuse.org checkout openSUSE:Factory:Rebuild/tesseract-ocr && cd $_

Source Files

Filename	Size	Changed
tesseract-ocr-3.05.00.tar.gz	3.42 MB (3581853 bytes)	over 7 years ago
tesseract-ocr.changes	7.35 KB (7527 bytes)	over 7 years ago
tesseract-ocr.spec	3.86 KB (3957 bytes)	over 7 years ago

Revision 5 (latest revision is 16)

Dominique Leuenberger (dimstar_suse) accepted request 458814 from Ismail Dönmez (namtrac) over 7 years ago (revision 5)

### Depends on sr#458696 ###

- Update to 3.05.00
  * Fine-tuned the hOCR output.
  * Added TSV as another optional output format.
  * Fixed ABI break introduced in 3.04.00 with the AnalyseLayout()
    method.
  * text2image tool - Enable all OpenType ligatures available in
    a font. This feature requires Pango 1.38 or newer.
  * Training tools - Replaced asserts with tprintf() and exit(1).
  * Improved multipage tiff processing.
  * Improved the embedded pdf font (pdf.ttf).
  * Enable selection of OCR engine mode from command line.
  * Changed tesseract command line parameter '-psm' to '--psm'.
  * Added new C API for orientation and script detection, removed
    the old one.
  * Fixed many compiler warnings.
  * Fixed memory and resource leaks.
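
The rename of '-psm' to '--psm' listed above can break scripts written for earlier releases. The following is a minimal sketch of one way to update a stored command line. The helper name `migrate_psm` and the file names are hypothetical and are not part of Tesseract or of this package. The commented invocation assumes the 3.05.00 command-line conventions (`--psm`, `--oem`, and a `tsv` config name for the new TSV output); check `tesseract --help` on the installed build before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Tesseract): reads a command line
# on stdin and rewrites the legacy '-psm' flag to '--psm'.
migrate_psm() {
  # Match '-psm' only as a standalone word, so an existing '--psm'
  # is left untouched.
  sed -E 's/(^|[[:space:]])-psm([[:space:]]|$)/\1--psm\2/g'
}

# Assumed example invocation for 3.05.00 (file names are placeholders):
#   tesseract page.tif out --psm 6 --oem 0 tsv
# Here '--psm 6' asks for a single uniform block of text, '--oem 0'
# selects the original Tesseract engine, and 'tsv' requests TSV output.
```

For example, `echo 'tesseract page.tif out -psm 6' | migrate_psm` prints the same command with `--psm 6`. Matching on whole words keeps the rewrite safe to run more than once.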
