home:Tomcat42 / python-pdfminer.six

PDF parser and analyzer

https://github.com/pdfminer/pdfminer.six

Fork of PDFMiner using six for Python3 compatibility.

PDFMiner is a tool for extracting information from PDF documents.
Unlike other PDF-related tools, it focuses entirely on getting
and analyzing text data. PDFMiner lets you obtain the exact
location of text on a page, as well as other information such
as fonts or lines. It includes a PDF converter that can transform
PDF files into other text formats (such as HTML). It has an
extensible PDF parser that can also be used for purposes other
than text analysis.
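
As a rough illustration of the "exact location of text" mentioned above, the sketch below walks the layout objects returned by `extract_pages` and reports each text block's bounding box. It assumes pdfminer.six is installed; the helper names `iter_text_boxes` and `describe` are made up for this example and are not part of the library.

```python
from typing import Iterator, Tuple

BBox = Tuple[float, float, float, float]


def iter_text_boxes(pdf_path: str) -> Iterator[Tuple[int, BBox, str]]:
    """Yield (page_number, bounding_box, text) for each text block in a PDF."""
    # Imported lazily so this module still loads where pdfminer.six is absent.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    for page_number, page_layout in enumerate(extract_pages(pdf_path), start=1):
        for element in page_layout:
            if isinstance(element, LTTextContainer):
                # bbox is (x0, y0, x1, y1) in PDF points, origin at bottom-left.
                yield page_number, element.bbox, element.get_text()


def describe(page_number: int, bbox: BBox, text: str) -> str:
    """Format one text block as a single summary line (hypothetical helper)."""
    x0, y0, x1, y1 = bbox
    snippet = " ".join(text.split())[:40]  # collapse whitespace, cap the length
    return f"p{page_number} ({x0:.1f}, {y0:.1f})-({x1:.1f}, {y1:.1f}): {snippet}"


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py some_file.pdf
    for page_number, bbox, text in iter_text_boxes(sys.argv[1]):
        print(describe(page_number, bbox, text))
```

Font and line details are available on the lower-level layout objects (for example the characters inside each text line); this sketch stops at the text-block level to stay short.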

Sources inherited from project devel:languages:python
Devel package for openSUSE:Factory
6 derived packages:
home:mnhauke

home:jayvdb:branches:devel:languages:python

home:yarunachalam:bran...evel:languages:python

home:Simmphonie:python310

home:fstrba

home:ecsos:python
Links to openSUSE:Factory / python-pdfminer.six
Checkout Package
osc -A https://api.opensuse.org checkout home:Tomcat42/python-pdfminer.six && cd $_

Source Files

Filename	Size	Changed
_link	124 Bytes	about 4 years ago
pdfminer.six-20200726.tar.gz	9.79 MB	over 3 years ago
python-pdfminer.six-remove-nose.patch	35.4 KB	over 3 years ago
python-pdfminer.six.changes	2.22 KB	over 3 years ago
python-pdfminer.six.spec	3.35 KB	over 3 years ago

Revision 5 (latest revision is 15)

Tomáš Chvátal (scarabeus_iv) accepted request 833056 from Petr Gajdos (pgajdos) over 3 years ago (revision 5)

- version update to 20200726
  - Rename PDFTextExtractionNotAllowedError to PDFTextExtractionNotAllowed to revert breaking change 
  - Always try to get CMap, not only for identity encodings 
  - Support for painting multiple rectangles at once 
  - Validate image object in do_EI is a PDFStream 
  - Hiding fallback xref by default from dumppdf.py output 
  - Raise a warning instead of an error when extracting text from a non-extractable PDF 
  - Switched from pycryptodome to cryptography package for AES decryption 
  - Python3 shebang line to script in tools 
  - Fix ordering of textlines within a textbox when `boxes_flow=None` 
  - Allow boxes_flow LAParam to be passed as None, validate the input, and update documentation 
  - Also accept file-like objects in high level functions `extract_text` and `extract_pages` 
  - Text no longer comes in reverse order when advanced layout analysis is disabled 
  - Updated misleading documentation for `word_margin` and `char_margin` 
  - Ignore ValueError when converting font encoding differences 
  - Grouping of text lines outside of parent container bounding box 
  - Group text lines if they are centered 
- do not require nose for testing
- added patches
  fix https://github.com/pdfminer/pdfminer.six/pull/489
  + python-pdfminer.six-remove-nose.patch
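
Two of the changes listed above, file-like input for `extract_text` and `boxes_flow=None`, can be sketched together as follows. This assumes pdfminer.six 20200726 or later is installed. `check_boxes_flow` is a hypothetical stand-in written for this example, based on the documented rule that `boxes_flow` is either `None` or a number between -1 and 1; it is not the library's own validation code.

```python
import io
from typing import Optional, Union

Number = Union[int, float]


def check_boxes_flow(boxes_flow: Optional[Number]) -> Optional[Number]:
    """Accept None, or a number between -1 and 1 (illustrative check only)."""
    if boxes_flow is None:
        return None
    if not isinstance(boxes_flow, (int, float)):
        raise TypeError("boxes_flow should be None or a number")
    if not -1 <= boxes_flow <= 1:
        raise ValueError("boxes_flow should be between -1 and 1")
    return boxes_flow


def extract_text_from_bytes(pdf_bytes: bytes,
                            boxes_flow: Optional[Number] = None) -> str:
    """Extract text from in-memory PDF data using a file-like object."""
    # Imported lazily so this module still loads where pdfminer.six is absent.
    from pdfminer.high_level import extract_text
    from pdfminer.layout import LAParams

    laparams = LAParams(boxes_flow=check_boxes_flow(boxes_flow))
    # extract_text accepts a path or, as of this release, a file-like object.
    return extract_text(io.BytesIO(pdf_bytes), laparams=laparams)
```

Wrapping the bytes in `io.BytesIO` avoids writing a temporary file, which is handy when the PDF comes from a network response or a database column.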