Revisions of python-charset-normalizer
buildservice-autocommit
accepted
request 1128743
from
Dirk Mueller (dirkmueller)
(revision 44)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 43)
- update to 3.3.2:
  * Unintentional memory usage regression when using a large payload that matches several encodings (#376)
  * Regression on some detection cases showcased in the documentation (#371)
  * Noise (md) probe that identifies malformed Arabic representation due to the presence of letters in isolated form
  * Optional mypyc compilation upgraded to version 1.6.1 for Python >= 3.8
  * Improved the general detection reliability based on reports from the community
buildservice-autocommit
accepted
request 1114778
from
Dirk Mueller (dirkmueller)
(revision 42)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 41)
- update to 3.3.0:
  * Allow executing the CLI (e.g. normalizer) through `python -m charset_normalizer.cli` or `python -m charset_normalizer`
  * Support for 9 forgotten encodings that are supported by Python but unlisted in `encodings.aliases` as they have no alias
  * Optional mypyc compilation upgraded to version 1.5.1 for Python >= 3.7
  * Unable to properly sort CharsetMatch when both chaos/noise and coherence were close due to an unreachable condition in __lt__ (#350)
- Update to 3.0.1
- Update to 3.0.0
  * ASCII misdetection in rare cases (PR #170)
  * Wrong logging level applied when setting kwarg `explain` to True
- require lower-case name instead of breaking build
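
The "encodings without an alias" item can be illustrated with the standard library alone. The sketch below is not charset_normalizer's own code; it assumes that a codec counts as unlisted when its module ships in the `encodings` package but no key of `encodings.aliases.aliases` points to it. The helper name `codecs_without_alias` is made up for this example.

```python
import encodings
import pkgutil
from encodings.aliases import aliases


def codecs_without_alias():
    """Return names of modules in `encodings` that no alias maps to."""
    # Every codec name that at least one alias resolves to.
    aliased = set(aliases.values())
    # `aliases` is the lookup table itself, not a codec.
    helpers = {"aliases"}
    names = (module.name for module in pkgutil.iter_modules(encodings.__path__))
    return sorted(n for n in names if n not in aliased and n not in helpers)


if __name__ == "__main__":
    print(codecs_without_alias())
```

The exact result depends on the Python version, and it may also include special-purpose codecs such as `idna` or `punycode`, so a detector would still need to filter the list.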
buildservice-autocommit
accepted
request 1098807
from
Dirk Mueller (dirkmueller)
(revision 40)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 39)
- update to 3.2.0:
  * Typehint for function `from_path` no longer enforces `PathLike` as its first argument
  * Minor improvement over the global detection reliability
  * Introduce function `is_binary` that relies on main capabilities and is optimized to detect binaries
  * Propagate `enable_fallback` argument throughout `from_bytes`, `from_path`, and `from_fp`, allowing deeper control over the detection (default True)
  * Edge case detection failure where a file would contain a 'very-long' camel-cased word (Issue #289)
buildservice-autocommit
accepted
request 1084939
from
Dirk Mueller (dirkmueller)
(revision 38)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 37)
- add sle15_python_module_pythons (jsc#PED-68)
buildservice-autocommit
accepted
request 1074517
from
Dirk Mueller (dirkmueller)
(revision 36)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 35)
- update to 3.1.0:
  * Argument `should_rename_legacy` for legacy function `detect` and disregard any new arguments without errors (PR #262)
  * Removed Support for Python 3.6 (PR #260)
  * Optional speedup provided by mypy/c 1.0.1
buildservice-autocommit
accepted
request 1039740
from
Dirk Mueller (dirkmueller)
(revision 34)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
accepted
request 1039709
from
Yogalakshmi Arunachalam (yarunachalam)
(revision 33)
- Update to 3.0.1
  Fixed
  * Multi-bytes cutter/chunk generator did not always cut correctly (PR #233)
  Changed
  * Speedup provided by mypy/c 0.990 on Python >= 3.7
buildservice-autocommit
accepted
request 1032182
from
Matej Cepl (mcepl)
(revision 32)
baserev update by copy to link target
Matej Cepl (mcepl)
accepted
request 1031656
from
Yogalakshmi Arunachalam (yarunachalam)
(revision 31)
- Update to 3.0.0
  Added
  * Extend the capability of explain=True: when cp_isolation contains at most two entries (min one), it will log the Mess-detector results in detail
  * Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  * Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
  * normalizer --version now specifies whether the current version provides extra speedup (meaning mypyc compilation whl)
  Changed
  * Build with static metadata using 'build' frontend
  * Make the language detection stricter
  * Optional: Module md.py can be compiled using Mypyc to provide an extra speedup up to 4x faster than v2.1
  Fixed
  * CLI with opt --normalize fails when using full path for files
  * TooManyAccentuatedPlugin induces false positives on the mess detection when too few alpha characters have been fed to it
  * Sphinx warnings when generating the documentation
  Removed
  * Coherence detector no longer returns 'Simple English'; it returns 'English' instead
  * Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
  * Breaking: Method first() and best() from CharsetMatch
  * UTF-7 will no longer appear as "detected" without a recognized SIG/mark (is unreliable/conflicts with ASCII)
  * Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
  * Breaking: Top-level function normalize
  * Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
  * Support for the backport unicodedata2
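
The UTF-7 entry says the encoding conflicts with ASCII. A minimal standard-library sketch of that conflict follows; the sample payload is invented for illustration, and this is not how charset_normalizer itself performs the check.

```python
ascii_payload = b"plain ASCII text, no signature"

# Bytes that are pure ASCII (and contain no '+') decode to the same string
# as ASCII and as UTF-7, so a successful UTF-7 decode proves nothing.
as_ascii = ascii_payload.decode("ascii")
as_utf7 = ascii_payload.decode("utf_7")

# Conversely, UTF-7 output for non-ASCII text uses only 7-bit bytes,
# so it also passes as valid ASCII.
encoded = "é".encode("utf_7")
only_seven_bit = all(byte < 128 for byte in encoded)

if __name__ == "__main__":
    print(as_ascii == as_utf7, only_seven_bit)
```

Without a signature or mark at the start of the payload, nothing in the bytes themselves separates the two readings.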
buildservice-autocommit
accepted
request 1004361
from
Dirk Mueller (dirkmueller)
(revision 30)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 29)
- update to 2.1.1:
  * Function `normalize` scheduled for removal in 3.0
  * Removed useless call to decode in fn is_unprintable (#206)
buildservice-autocommit
accepted
request 998090
from
Dirk Mueller (dirkmueller)
(revision 28)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
accepted
request 998013
from
Benjamin Greiner (bnavigator)
(revision 27)
- Clean requirements: We don't need anything
buildservice-autocommit
accepted
request 991152
from
Dirk Mueller (dirkmueller)
(revision 26)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 25)
- update to 2.1.0:
  * Output the Unicode table version when running the CLI with `--version`
  * Re-use decoded buffer for single byte character sets
  * Fixing some performance bottlenecks
  * Workaround potential bug in cpython with Zero Width No-Break Space located in Arabic Presentation Forms-B, Unicode 1.1 not acknowledged as space
  * CLI default threshold aligned with the API threshold from
  * Removed support for Python 3.5 (PR #192)
  * Deprecated use of backport unicodedata from `unicodedata2` as Python is quickly catching up, scheduled for removal in 3.0
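
The Zero Width No-Break Space entry can be reproduced with the standard library. This sketch only inspects how CPython classifies U+FEFF; it does not show the workaround that charset_normalizer applies, and the variable names are illustrative.

```python
import unicodedata

ZWNBSP = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE, also used as the byte order mark

name = unicodedata.name(ZWNBSP)
category = unicodedata.category(ZWNBSP)  # "Cf" (format), not a separator class
counts_as_space = ZWNBSP.isspace()

# The Arabic Presentation Forms-B block spans U+FE70 through U+FEFF.
in_forms_b = 0xFE70 <= ord(ZWNBSP) <= 0xFEFF

if __name__ == "__main__":
    print(name, category, counts_as_space, in_forms_b)
```

Because `str.isspace()` returns False for this character, code that relies on it to skip whitespace will treat U+FEFF as ordinary content.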
Displaying revisions 1 - 20 of 44