Revisions of python-dask
buildservice-autocommit
accepted
request 1171090
from
Dirk Mueller (dirkmueller)
(revision 157)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
accepted
request 1170944
from
Benjamin Greiner (bnavigator)
(revision 156)
- Update to 2024.4.2
  * Trivial Merge Implementation
  * Auto-partitioning in read_parquet
- Release 2024.4.1
  * Fix an error when importing dask.dataframe with Python 3.11.9.
- Release 2024.4.0
  * Query planning fixes
  * GPU metric dashboard fixes
- Release 2024.3.1
  * Demote an exception to a warning if dask-expr is not installed when upgrading.
- Release 2024.3.0
  * Query planning
  * Sunset of Pandas 1.X support
buildservice-autocommit
accepted
request 1155503
from
Matej Cepl (mcepl)
(revision 155)
baserev update by copy to link target
Matej Cepl (mcepl)
accepted
request 1155359
from
Benjamin Greiner (bnavigator)
(revision 154)
- Update to 2024.2.1
  * Allow silencing dask.DataFrame deprecation warning
  * More robust distributed scheduler for rare key collisions
  * More robust adaptive scaling on large clusters
- The test subpackage now directly depends on pandas-test which does not use pytest-asyncio anymore
buildservice-autocommit
accepted
request 1146835
from
Matej Cepl (mcepl)
(revision 153)
baserev update by copy to link target
Matej Cepl (mcepl)
accepted
request 1146758
from
Benjamin Greiner (bnavigator)
(revision 152)
- Update to 2024.2.0
  * Deprecate Dask DataFrame implementation
  * Improved tokenization
  * https://docs.dask.org/en/stable/changelog.html#v2024-2-0
- Really drop python39 from testing instead of testing it with every other test flavor
Dirk Mueller (dirkmueller)
committed
(revision 151)
- add testing for py312, remove py39
Dirk Mueller (dirkmueller)
accepted
request 1144172
from
Benjamin Greiner (bnavigator)
(revision 150)
- Add python312 test flavor
buildservice-autocommit
accepted
request 1142781
from
Dirk Mueller (dirkmueller)
(revision 149)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 148)
- update to 2024.1.1:
  * This release contains compatibility updates for the latest pandas and scipy releases. See :pr:`10834`, :pr:`10849`, :pr:`10845`, and :pr-distributed:`8474` from `crusaderky`_ for details.
buildservice-autocommit
accepted
request 1140136
from
Dirk Mueller (dirkmueller)
(revision 147)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 146)
- update to 2024.1.0:
  * Released on January 12, 2024
  * P2P rechunking now utilizes the relationships between input and output chunks. For situations that do not require all-to-all data transfer, this may significantly reduce the runtime and memory/disk footprint. It also enables task culling.
  * The fastparquet Parquet engine has been deprecated. Users should migrate to the pyarrow engine by installing PyArrow and removing engine="fastparquet" in read_parquet or to_parquet calls.
  * This release improves serialization robustness for arbitrary data. Previously there were some cases where serialization could fail for non-msgpack serializable data. In those cases we now fall back to using pickle.
  * Deprecate shuffle keyword in favour of shuffle_method for DataFrame methods (:pr:`10738`) `Hendrik Makait`_
  * Deprecate automatic argument inference in repartition
  * Deprecate compute parameter in set_index
  * Deprecate inplace in eval
  * Deprecate Series.view
  * Deprecate npartitions="auto" for set_index & sort_values
buildservice-autocommit
accepted
request 1135096
from
Factory Maintainer (factory-maintainer)
(revision 145)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 144)
- update to 2023.12.1:
  * Dask DataFrames are now much more performant by using a logical query planner.
  * ``read_parquet`` will now infer the Arrow types ``pa.date32()``, ``pa.date64()`` and ``pa.decimal()`` as an ``ArrowDtype`` in pandas. These dtypes are backed by the original Arrow array, and thus avoid the conversion to NumPy object.
  * This release contains several updates that fix a possible deadlock introduced in 2023.9.2 and improve the robustness of P2P-based merging when the cluster is dynamically scaling up.
  * The ``distributed.scheduler.pickle`` configuration option is no longer supported. As of the 2023.4.0 release, ``pickle`` is used to transmit task graphs, so it can no longer be disabled. We now raise an informative error when ``distributed.scheduler.pickle`` is set to ``False``.
  * Update DataFrame page
  * Add changelog entry for ``dask-expr`` switch
  * [Dask.order] Remove non-runnable leaf nodes from ordering
  * Update installation docs
  * Fix software environment link in docs
  * Avoid converting non-strings to arrow strings for read_parquet
  * Dask.order rewrite using a critical path approach
  * Avoid substituting keys that occur multiple times
  * Add missing image to docs
  * Update landing page
  * Make meta check simpler in dispatch
  * Pin PR Labeler
  * Reorganize docs index a bit
buildservice-autocommit
accepted
request 1132242
from
Factory Maintainer (factory-maintainer)
(revision 143)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
committed
(revision 142)
- update to 2023.12.0:
  * Bokeh 3.3.0 compatibility
  * Add ``network`` marker to ``test_pyarrow_filesystem_option_real_data``
  * Bump GPU CI to CUDA 11.8 (:pr:`10656`)
  * Tokenize ``pandas`` offsets deterministically
  * Add tokenize ``pd.NA`` functionality
  * Update gpuCI ``RAPIDS_VER`` to ``24.02`` (:pr:`10636`)
  * Fix precision handling in ``array.linalg.norm`` (:pr:`10556`) `joanrue`_
  * Add ``axis`` argument to ``DataFrame.clip`` and ``Series.clip`` (:pr:`10616`) `Richard (Rick) Zamora`_
  * Update changelog entry for in-memory rechunking (:pr:`10630`) `Florian Jetter`_
  * Fix flaky ``test_resources_reset_after_cancelled_task``
  * Bump GPU CI to CUDA 11.8
  * Bump ``conda-incubator/setup-miniconda``
  * Add debug logs to P2P scheduler plugin
  * ``O(1)`` access for ``/info/task/`` endpoint
  * Remove stringification from shuffle annotations
  * Don't cast ``int`` metrics to ``float``
  * Drop asyncio TCP backend
  * Add offload support to ``context_meter.add_callback``
  * Test that ``sync()`` propagates contextvars
  * Fix ``test_statistical_profiling_cycle``
  * Replace ``Client.register_plugin``'s ``idempotent`` argument with ``.idempotent`` attribute on plugins
  * Fix test report generation
  * Install ``pyarrow-hotfix`` on ``mindeps-pandas`` CI
  * Reduce memory usage of scheduler process - optimize
buildservice-autocommit
accepted
request 1127184
from
Ondřej Súkup (mimi_vx)
(revision 141)
baserev update by copy to link target
Ondřej Súkup (mimi_vx)
accepted
request 1127183
from
Ondřej Súkup (mimi_vx)
(revision 140)
- Update to 2023.11.0
  * Zero-copy P2P Array Rechunking
  * Deprecating PyArrow <14.0.1
  * Improved PyArrow filesystem for Parquet
  * Improve Type Reconciliation in P2P Shuffling
  * Official support for Python 3.12
  * Reduced memory pressure for multi array reductions
  * Improved P2P shuffling robustness
  * Reduced scheduler CPU load for large graphs
buildservice-autocommit
accepted
request 1110218
from
Dirk Mueller (dirkmueller)
(revision 139)
baserev update by copy to link target
Dirk Mueller (dirkmueller)
accepted
request 1110164
from
Benjamin Greiner (bnavigator)
(revision 138)
- Update to 2023.9.1
  ## Enhancements
  * Stricter data type for dask keys (GH#10485) crusaderky
  * Special handling for None in DASK_ environment variables (GH#10487) crusaderky
  ## Bug Fixes
- Release 2023.9.0
  ## Bug Fixes
  * Remove support for np.int64 in keys (GH#10483) crusaderky
  * Fix _partitions dtype in meta for shuffling (GH#10462) Hendrik Makait
  * Don’t use exception hooks to shorten tracebacks (GH#10456) crusaderky
- Release 2023.8.1
  ## Enhancements
  * Adding support for cgroup v2 to cpu_count (GH#10419) Johan Olsson
  * Support multi-column groupby with sort=True and split_out>1 (GH#10425) Richard (Rick) Zamora
  * Add DataFrame.enforce_runtime_divisions method (GH#10404) Richard (Rick) Zamora
  * Enable file mode="x" with a single_file=True for Dask DataFrame to_csv (GH#10443) Genevieve Buckley
  ## Bug Fixes
  * Fix ValueError when running to_csv in append mode with single_file as True (GH#10441)
- Release 2023.8.0
  ## Enhancements
  * Fix for make_timeseries performance regression (GH#10428) Irina Truong
Displaying revisions 1 - 20 of 157