Revisions of openucx

buildservice-autocommit accepted request 1151438 from Christian Goll (mslacken) (revision 66)
baserev update by copy to link target
buildservice-autocommit accepted request 1116008 from Nicolas Morey (NMorey) (revision 64)
baserev update by copy to link target
Nicolas Morey (NMorey) accepted request 1115979 from Nicolas Morey (NMorey) (revision 63)
- Update to v1.15.0
  - UCP
    - Added 2-stage pipeline protocol in the new protocol infrastructure
    - Added reset and abort functionality of rendezvous protocols in the
      new infrastructure
    - Added zero-copy rendezvous data send protocol in the new infrastructure
    - Added support for user memory handle in the new protocol infrastructure
    - Added option to force ODP registration for certain memory types
    - Enabled lock free memory region deregistration
    - Updated allow/deny transport list feature to control auxiliary transport selection
    - Multiple performance improvements of the new protocol infrastructure
    - Multiple improvements in error and debug messages
    - Fixed assertion when sending from non-contiguous GPU buffer to managed buffer
    - Fixed the race condition on endpoint configurations
    - Fixed endpoint reconfiguration issues due to asymmetrical selection
    - Fixed endpoint reconfiguration error due to wrong locality detection
    - Fixed crash during connection manager cleanup
    - Fixed rkey index calculation for rendezvous protocol
    - Fixed rcache dump function
    - Removed logging from rkey unpack in release mode
    - Fixed double free of rkey in rendezvous protocol
    - Fixed rendezvous pipeline protocol error flow
    - Fixed error handling in rendezvous get zcopy protocol
    - Replay pending requests of wireup EP CM during connection establishment
      to prevent potential ordering issues and wrong configuration
    - Pass user-provided memory type to the function that checks whether the buffer
      can be sent inline or not
    - Avoid memory registration during UCP context initialization
    - Fixed CPU/device atomics selection in the new protocol infrastructure
    - Multiple fixes in the new protocol infrastructure information output
buildservice-autocommit accepted request 1100646 from Nicolas Morey (NMorey) (revision 62)
baserev update by copy to link target
Nicolas Morey (NMorey) accepted request 1100640 from Nicolas Morey (NMorey) (revision 61)
- Update to v1.14.1
  - Fixed ROCm to prevent the locking of host pinned memory
  - Added CUDA 12 based UCX builds to the release flow
  - Increased the maximal number of endpoint configurations
  - Fixed filter for slow lanes in selection logic
  - Fixed TCP transport bandwidth calculation
  - Fixed device detection for ROCm
  - Fixed compatibility with CUDA 12
  - Fixed rendezvous threshold for multi-path configurations
  - Fixed error message in case of static link
  - Fixed BlueField-3 detection
  - Multiple fixes for Azure CI pipeline
buildservice-autocommit accepted request 1075600 from Nicolas Morey (NMorey) (revision 60)
baserev update by copy to link target
Nicolas Morey (NMorey) committed (revision 59)
Remove remaining gcc13 patch
Nicolas Morey (NMorey) committed (revision 58)
- Add gcc13-fix.patch for GCC13 support
Nicolas Morey (NMorey) accepted request 1075167 from Nicolas Morey (NMorey) (revision 57)
- Update to v1.14.0
  - UCP
    - Added API for querying transport and device names on endpoint
    - Added API for querying datatype object
    - Added API for exporting and importing memory keys (no implementation yet)
    - Added support for non-persistent active message header
    - Added infrastructure to print protocols v2 performance
    - Multiple performance improvements for protocols v2
    - Added support for non-contiguous datatypes for rendezvous protocols v2
    - Added support for reset and abort request in protocols v2
    - Added support for user memory handles in RMA API
    - Added multi-rail support for RMA API in protocols v2
    - Added support for up to 16 different lanes per endpoint
    - Added support for dmabuf memory registration in protocols v2
    - Added strong fence mode for ucp_worker_fence() API
  - UCT
    - Added new uct_md_mem_attach() API to support exported memory handles
    - Added remote completion mode for endpoint flush (via new flag)
    - Added support for dmabuf registration
    - Added new uct_ep_connect_to_ep_v2() API
    - Added new uct_mem_reg_v2() API
    - Added new uct_md_query_v2() API
    - Added support for IPv6 loopback address in TCP transport
  - RDMA CORE (IB, ROCE, etc.)
    - Added ECE (enhanced connection establishment) support for RC and DC transports
    - Added support for hardware DCS in DC transport
    - Added UD interface and endpoint resource information to VFS
    - Added CQ creation via DEVX API
    - Removed support for accelerated IB transports over legacy experimental verbs
  - UCS
    - Added support for auto-correction of user environment variables
  - UCM
    - Implemented CUDA bistro hooks for aarch64 (to enable memory cache on this platform)
    - Added support for CUDA virtual/stream-ordered memory with cudaMallocAsync
  - Documentation
    - Added FAQ for using pkg-config tool to build applications with UCX
  - Tools
    - Added runtime library version to the 'ucx_info -v' output
    - Added support for memory types in ucx_info
  - Many bugfixes. See NEWS.
- Drop patches merged upstream:
  - UCS-DEBUG-replace-PTR-with-void.patch
  - gcc13-fix.patch
- Refresh openucx-s390x-support.patch
buildservice-autocommit accepted request 1069629 from Jan Engelhardt (jengelh) (revision 56)
baserev update by copy to link target
Jan Engelhardt (jengelh) accepted request 1069627 from Martin Liška (marxin) (revision 55)
- Add upstream fix gcc13-fix.patch.
buildservice-autocommit accepted request 1058681 from Jan Engelhardt (jengelh) (revision 54)
baserev update by copy to link target
Jan Engelhardt (jengelh) accepted request 1058654 from Andreas Schwab (Andreas_Schwab) (revision 53)
- openucx-s390x-support.patch: fix use of clz builtin for 64-bit value
buildservice-autocommit accepted request 1008219 from Nicolas Morey-Chaisemartin (NMoreyChaisemartin) (revision 52)
baserev update by copy to link target
Nicolas Morey-Chaisemartin (NMoreyChaisemartin) accepted request 1008118 from Nicolas Morey-Chaisemartin (NMoreyChaisemartin) (revision 51)
- Drop baselibs.conf as openucx only works on 64-bit systems
Nicolas Morey-Chaisemartin (NMoreyChaisemartin) accepted request 1008115 from Nicolas Morey-Chaisemartin (NMoreyChaisemartin) (revision 50)
- Update openucx-s390x-support.patch to add missing ucs_ffs32 on s390x
buildservice-autocommit accepted request 1007003 from Nicolas Morey-Chaisemartin (NMoreyChaisemartin) (revision 49)
baserev update by copy to link target
Nicolas Morey-Chaisemartin (NMoreyChaisemartin) accepted request 1006486 from Nicolas Morey-Chaisemartin (NMoreyChaisemartin) (revision 48)
- Update to v1.13.1 (jsc#PED-912)
  - Core
    - Added new objects to VFS: local and remote address of endpoint,
      statistics of ucp_ep_create success/failure, failed/destroyed endpoints
    - Added support for UCX static libraries
    - Added profiling for rkey management routines
    - PCIe relaxed order enabled by default for AMD CPUs
    - Fixed not deallocating memory from ucp_mem_unmap if no rcache
    - Fixed versioning infrastructure
    - Multiple code improvements: refactoring, debug prints and assertions, etc.
    - Multiple improvements in build, test and docs infrastructure
    - Added new objects to VFS (md, component, log_level, etc.)
    - Added configuration variable to specify which loadable modules are allowed
    - Added build-time configuration to disable sigaction overriding
  - UCP
    - Added API to pass pre-registered memory handle to UCP operations
    - Added implementation of AM rendezvous protocol
    - Added 2-stage pipeline rendezvous protocol for GPU
    - Added support for fragment mem_type for v1 pipeline proto, disabled by default
    - Added active message support for proto v2
    - Added UCP memory registration cache
    - Improved adaptive progress - deactivate iface when all p2p lanes are destroyed
    - Added support for user memh in proto_v1
    - Added support for selecting local address when creating a client endpoint
    - Added option to limit GPUDirectRDMA size in rendezvous protocol, UCX_RNDV_MEMTYPE_DIRECT_SIZE
    - Deprecated UCX_SOCKADDR_AUX_TLS configuration parameter
    - Resolving remote EP ID when creating local EP disabled by default
    - Added client_id to ucp_worker_create() and ucp_conn_request_query() APIs
    - Added ucp_worker_address_query() API
    - Updated ucp_ep_query() API for getting local and remote addresses
    - Added address versioning to correctly preserve wire compatibility starting from version 1.11.0
    - Added new client/server connection establishment packet header format
    - Enabled rendezvous and tag sync protocols when error handling is enabled on the endpoint
    - Added iov zcopy support to RMA operations
    - Reduced memory usage of unexpected messages by fitting receive buffer size to packet size
    - Added support for modifying UCT and UCS configs by ucp_config_modify() API
    - Optimized unpacked rkeys memory consumption
    - Added request flag to influence latency vs. bandwidth protocol
    - Reduced memory management overhead with new protocols
    - Improved performance calculations for new protocols
    - Added AMO support with GPU memory target using new protocols
    - Added put_zcopy, get_zcopy and pipeline based rendezvous in new protocols
    - Added support for user-defined alignment in Active Messages
    - Added support for offload tag sync in new protocols
    - Updated ucp_atomic_post() to use NBX flow
  - UCT
    - Introduced API uct_md_mkey_pack_v2
    - Introduced UCT iface features API
    - Introduced max_inflight_eps parameter in perf_attr API
    - Introduced UCT_SEND_FLAG_PEER_CHECK flag that forces checking connectivity to a peer
    - Introduced UCX_RCACHE_PURGE_ON_FORK to enable/disable cleaning regions when application is forking
    - Disabled PEER_FAILURE capability for XPMEM
    - Added API - uct_iface_is_reachable_v2()
    - Added IPv6 address support in TCP
    - Added latency estimation to uct_iface_estimate_perf()
    - Adjusted knem and cma overhead cost
    - Increased built-in TCP keep-alive interval to 2 seconds
  - RDMA CORE (IB, ROCE, etc.)
    - Introduced NDR autorecognition
    - Introduced CQE zipping support
    - Set the default MAX_RD_ATOMIC to maximum value supported by the hardware
    - Disabled mlx5 ifaces on verbs MD
    - Added detection of IB NDR devices
    - Added check for CQ overrun in assert mode
    - Added bitmap usage for releasing detached DCIs
    - Added configuration for requests ack frequency with DevX
    - Added remote QP info to tx error CQE traces
  - ROCM
    - Increased maximum number of HSA agents
  - UCS
    - Added topo module infrastructure
    - Added memtrack and rcache information to VFS
    - Added API for a per-process aggregate-sum statistics report
    - Added memory pool set data structure
    - Added new ptr_array API for bulk allocation
    - Added ucs_string_buffer_append_flags() for string buffer
    - Added ucs_ffs32()
    - Added ucs_vsnprintf_safe() which always adds '\0'
    - Added thread-safe put to ptr_map
    - Improved accuracy of the topology distance estimation
    - Added prints of leaked callbacks from the callback queue
    - Removed a diagnostic message when fuse thread is stopped
    - Added configurable limit for the memory consumed by rcache
    - Added configuration for VFS(FUSE) thread affinity
    - Added memory limit support to memtrack
  - Packaging
    - Added cmake config files for better integration with external cmake based projects
  - Tools
    - Added loop-back transport support in ucx_perftest
    - Split ucx_perftest into separate modules
    - Added process placement option for ucx_info
    - Extended parameters correctness check in ucx_perftest
- Backported UCS-DEBUG-replace-PTR-with-void.patch
  from upstream to fix compilation
Nicolas Morey-Chaisemartin (NMoreyChaisemartin) accepted request 946104 from Nicolas Morey-Chaisemartin (NMoreyChaisemartin) (revision 47)
- Fix UCM bistro support on non-s390x architectures
- Add ucm-fix-UCX_MEM_MALLOC_RELOC.patch to disable malloc relocations by default (bsc#1194369)