Show home:Tomcat42 / python-pdfminer.six


PDF parser and analyzer

https://github.com/pdfminer/pdfminer.six

Fork of PDFMiner using six for Python3 compatibility.

PDFMiner is a tool for extracting information from PDF documents.
Unlike other PDF-related tools, it focuses entirely on getting
and analyzing text data. PDFMiner can report the exact
location of text on a page, as well as other information such
as fonts or lines. It includes a PDF converter that can transform
PDF files into other text formats (such as HTML). It has an
extensible PDF parser that can be used for purposes other than
text analysis.
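
As a rough illustration of the "exact location of text" feature described above, the following minimal sketch walks the layout tree that pdfminer.six produces and reports each text container with its bounding box. It relies on `extract_pages` from `pdfminer.high_level` and `LTTextContainer` from `pdfminer.layout`; the helper names `iter_text_boxes` and `format_box` are hypothetical and chosen only for this example, and the sketch assumes pdfminer.six is installed.

```python
from typing import Iterator, Tuple

# (x0, y0, x1, y1) in PDF points, with the origin at the bottom-left of the page.
BBox = Tuple[float, float, float, float]


def iter_text_boxes(pdf_path: str) -> Iterator[Tuple[int, BBox, str]]:
    """Yield (page_number, bounding_box, text) for each top-level text container.

    Hypothetical helper for illustration; not part of pdfminer.six itself.
    """
    # Imported lazily so this module still loads where pdfminer.six is absent.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    for page_number, page_layout in enumerate(extract_pages(pdf_path), start=1):
        for element in page_layout:
            # Skip figures, lines, rectangles and other non-text layout objects.
            if isinstance(element, LTTextContainer):
                yield page_number, element.bbox, element.get_text()


def format_box(page_number: int, bbox: BBox, text: str) -> str:
    """Render one text box as a single human-readable line."""
    x0, y0, x1, y1 = bbox
    return f"p{page_number} ({x0:.1f}, {y0:.1f}, {x1:.1f}, {y1:.1f}): {text.strip()!r}"
```

A caller could then loop over `iter_text_boxes("sample.pdf")` and print `format_box(*box)` for each result, where `sample.pdf` is a placeholder path. If only the plain text is needed, `pdfminer.high_level.extract_text` returns it as a single string without the position data.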

Sources inherited from project devel:languages:python
Devel package for openSUSE:Factory
6 derived packages
Derived Packages
home:mnhauke

home:jayvdb:branches:devel:languages:python

home:yarunachalam:bran...evel:languages:python

home:Simmphonie:python310

home:fstrba

home:ecsos:python
Links to openSUSE:Factory / python-pdfminer.six
Checkout Package
osc -A https://api.opensuse.org checkout home:Tomcat42/python-pdfminer.six && cd $_

Source Files

Filename	Size	Changed
_link	124 Bytes	9 months ago
import-from-non-pythonpath-files.patch	1.42 KB	6 months ago
pdfminer.six-20221105.tar.gz	10.4 MB	7 months ago
python-pdfminer.six.changes	3.7 KB	6 months ago
python-pdfminer.six.spec	2.92 KB	6 months ago

Revision 12 (latest revision is 15)

Martin Hauke (mnhauke) accepted request 1132937 from Jonathan Papineau (jonapap) 6 months ago (revision 12)

- Update to 20221105
  - Option to disable boxes flow layout analysis when using pdf2txt 
  - Add support for PDF 2.0 (ISO 32000-2) AES-256 encryption
  - Support for Paeth PNG filter compression (predictor value = 4)
  - Type annotations
  - Export type annotations from pypi package per PEP561
  - Support for identity cmap's
  - Add support for PDF page labels
  - Installation of Pillow as an optional extra dependency
  - Exporting images without any specific encoding 
  - Output converter for the hOCR format
  - Font name aliases for Arial, Courier New and Times New Roman
  - Documentation on why special characters can sometimes not be extracted
- Remove patch python-pdfminer.six-remove-nose.patch
- Update dependencies
