Revisions of openblas
Ismail Dönmez (namtrac) committed (revision 114)
- Update to version 0.3.13
  common:
  * Added a generic bfloat16 SBGEMV kernel
  * Fixed a potentially severe memory leak after fork in OpenMP builds that was introduced in 0.3.12
  * Added detection of the Fujitsu Fortran compiler
  * Added detection of the (e)gfortran compiler on OpenBSD
  * Added support for overriding the default name of the library independently from symbol suffixing in the gmake builds (already supported in cmake)
  RISC V:
  * Added a RISC V port optimized for C910V
  POWER:
  * Added optimized POWER10 kernels for SAXPY, CAXPY, SDOT, DDOT and DGEMV_N
  * Improved DGEMM performance on POWER10
  * Improved STRSM and DTRSM performance on POWER9 and POWER10
  * Fixed segmentation faults in DYNAMIC_ARCH builds
  * Fixed compilation with the PGI compiler
  x86:
  * Fixed compilation of kernels that require SSE2 intrinsics since 0.3.12
  x86_64:
  * Added an optimized bfloat16 SBGEMV kernel for SkylakeX and Cooperlake
  * Improved the performance of SASUM and DASUM kernels through parallelization
  * Improved the performance of SROT and DROT kernels
  * Improved the performance of multithreaded xSYRK
  * Fixed OpenMP builds that use the LLVM Clang compiler together with GNU gfortran (where linking of both the LLVM libomp and GNU libgomp could lead to lockups or wrong results)
  * Fixed miscompilations by old gcc 4.6
  * Fixed misdetection of AVX2 capability in some Sandybridge CPUs
Ismail Dönmez (namtrac) accepted request 856522 from Dominique Leuenberger (dimstar) (revision 113)
- Fix invalid symlinks (boo#1179764).
buildservice-autocommit accepted request 843798 from Ismail Dönmez (namtrac) (revision 112)
baserev update by copy to link target
Ismail Dönmez (namtrac) committed (revision 111)
-
Ismail Dönmez (namtrac) committed (revision 110)
-
Ismail Dönmez (namtrac) committed (revision 109)
- Update to version 0.3.12
  common:
  * Fixed missing BLAS/LAPACK functions (inadvertently dropped during the build system restructuring to support selective compilation)
  * Fixed argument conversion macro in LAPACKE_zgesvdq (LAPACK #458)
  power:
  * Added optimized SCOPY/CCOPY kernels for POWER10
  * Increased and unified the default size of the GEMM buffer
  * Fixed building for POWER10 in DYNAMIC_ARCH mode
  * POWER10 compatibility test now checks binutils version as well
  * Cleaned up compiler warnings
  x86_64:
  * Corrected compiler version checks for AVX2 compatibility
  * Added compiler option -mavx2 for building with flang
  * Fixed direct SGEMM pathway for small matrix sizes (broken by the code refactoring in 0.3.11)
  * Fixed unhandled partial register clobbers in several kernels for AXPY, DOT, GEMV_N and GEMV_T flagged by gcc10 tree-vectorizer
  armv8:
  * Improved Apple Vortex support to include cross-compiling
buildservice-autocommit accepted request 843166 from Ismail Dönmez (namtrac) (revision 108)
baserev update by copy to link target
Ismail Dönmez (namtrac) committed (revision 107)
-
Ismail Dönmez (namtrac) committed (revision 106)
-
Ismail Dönmez (namtrac) committed (revision 105)
-
Ismail Dönmez (namtrac) committed (revision 104)
-
Ismail Dönmez (namtrac) committed (revision 103)
-
Ismail Dönmez (namtrac) committed (revision 102)
- Update to version 0.3.11
  common:
  * Reduced the default BLAS3_MEM_ALLOC_THRESHOLD (used as an upper limit for placing temporary arrays on the stack) to be compatible with a stack size of 1 MB (as imposed by the Java runtime library)
  * Added mixed-precision dot function SBDOT and utility functions shstobf16, shdtobf16, sbf16tos and dbf16tod to convert between single or double precision float arrays and bfloat16 arrays
  * Fixed prototypes of LAPACK_?ggsvp and LAPACK_?ggsvd functions in lapack.h
  * Fixed underflow and rounding errors in LAPACK SLANV2 and DLANV2 (causing miscalculations in e.g. SHSEQR/DHSEQR, LAPACK issue #263)
  * Fixed workspace calculation in LAPACK ?GELQ (LAPACK issue #415)
  * Fixed several bugs in the LAPACK testsuite
  * Improved performance of TRMM and TRSM for certain problem sizes
  * Fixed infinite recursions and workspace miscalculations in ReLAPACK
  * CMAKE builds no longer require pkg-config for creating the .pc file
  * Makefile builds no longer misread NO_CBLAS=0 or NO_LAPACK=0 as enabling these options
  * Fixed detection of gfortran when invoked through an mpi wrapper
  * Improved thread reinitialization performance with OpenMP after a fork
  * Added support for building only the subset of the library required for a particular precision by specifying BUILD_SINGLE, BUILD_DOUBLE
  * Optional function name prefixes and suffixes are now correctly reflected in the generated cblas.h
  * Added CMAKE build support for the LAPACK and multithreading tests
  power:
  * Added optimized support for POWER10
  * Added support for compiling for POWER8 in 32bit mode
  * Added support for compilation with LLVM/clang
buildservice-autocommit accepted request 839313 from Egbert Eich (eeich) (revision 101)
baserev update by copy to link target
Egbert Eich (eeich) accepted request 839300 from Egbert Eich (eeich) (revision 100)
- Set DYNAMIC_ARCH everywhere, use a base CPU model for non-dynamic bits to have a reproducible base line:
  x86_64: CORE2
  aarch64: ARMV8
  ppc: POWER8
  s390: ZARCH_GENERIC
- Remove workaround for build failure on aarch64 (boo#1128794).
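The per-architecture base targets in the entry above can be sketched as a small lookup that produces OpenBLAS make arguments. This is only an illustration: the helper name `openblas_make_args` and the dictionary are hypothetical, the architecture keys are copied from the changelog as written, and the package's actual spec file is not shown here and may set these options differently. `DYNAMIC_ARCH=1` and `TARGET=<model>` are the usual OpenBLAS Makefile options.

```python
# Base CPU model per architecture, copied from the changelog entry above.
# The dictionary and helper below are hypothetical, for illustration only.
BASE_TARGETS = {
    "x86_64": "CORE2",
    "aarch64": "ARMV8",
    "ppc": "POWER8",
    "s390": "ZARCH_GENERIC",
}


def openblas_make_args(arch):
    """Return make arguments for a DYNAMIC_ARCH build with a fixed base target."""
    target = BASE_TARGETS[arch]  # raises KeyError for architectures not listed
    return ["DYNAMIC_ARCH=1", f"TARGET={target}"]


if __name__ == "__main__":
    for arch in BASE_TARGETS:
        print(arch, "->", "make " + " ".join(openblas_make_args(arch)))
```

Pinning `TARGET` alongside `DYNAMIC_ARCH` means the parts of the library that are not selected at run time are always built for the same CPU model, regardless of the build host, which is what the entry describes as a reproducible base line.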
buildservice-autocommit accepted request 837347 from Ismail Dönmez (namtrac) (revision 99)
baserev update by copy to link target
Ismail Dönmez (namtrac) accepted request 837203 from Egbert Eich (eeich) (revision 98)
- For s390/s390x add TARGET=ZARCH_GENERIC (jsc#SLE-13773).
buildservice-autocommit accepted request 833714 from Egbert Eich (eeich) (revision 97)
baserev update by copy to link target
Egbert Eich (eeich) accepted request 833599 from Egbert Eich (eeich) (revision 96)
- Add build support for gcc10 to HPC build (bsc#1174439).
buildservice-autocommit accepted request 825919 from Ismail Dönmez (namtrac) (revision 95)
baserev update by copy to link target