- Feb 11, 2019
-
-
psychocrypt authored
#2229 did not solve the issue. - revert #2229 - introduce the working fix
-
psychocrypt authored
The OpenCL compiler of the blockchain driver does not understand when a pointer to a pointer points into shared memory and throws an error during compilation. - revert the usage of the struct to group all shared memory arrays
-
- Feb 10, 2019
-
-
psychocrypt authored
Add separate kernel to prepare the scratchpad memory.
-
psychocrypt authored
Combine the shared memory for a hash within one struct. Reduce the shared memory footprint per hash by 64 bytes.
-
psychocrypt authored
- rename variable names like `b` and `bb` to something with a little bit of meaning.
-
- Feb 09, 2019
-
-
psychocrypt authored
Optimize cn_gpu
-
psychocrypt authored
-
psychocrypt authored
based on the suggestion from @xmrig https://github.com/xmrig/xmrig-amd/commit/db4e169f3a78f273abf89ea8cf5bba7eccf1490b
-
- Feb 07, 2019
-
-
psychocrypt authored
cryptonight_turtle is only cryptonight_v8 with a different scratchpad, iteration and mask value. We now use the new mechanism to describe such derived POWs.
-
psychocrypt authored
@xmrig provided the information that the driver 19.2.1 for Vega also creates invalid results if pragma unroll is used for the groestl algo.
-
- Feb 06, 2019
-
-
psychocrypt authored
- use the user defined unroll - auto suggestion: - only tune for cn_gpu if this is the main user currency (after a fork) - set unroll to 1 for cn_gpu
-
- Feb 04, 2019
-
-
psychocrypt authored
If comp_mode is used the code will not compile. - fix compile issue - fix wrong conditions to handle `comp_mode`
-
- Feb 02, 2019
-
-
psychocrypt authored
Windows driver creates wrong code if unroll is used.
-
- Feb 01, 2019
-
-
psychocrypt authored
Use the algorithm names from `cryptonight.hpp` instead of numbers within the OpenCL kernel.
-
- Jan 30, 2019
-
-
psychocrypt authored
- fix broken turtle coin - fix non cn_gpu algorithms
-
fireice-uk authored
Co-authored-by: psychocrypt <psychocryptHPC@gmail.com>
Co-authored-by: fireice-uk <fireice-uk@users.noreply.github.com>
-
- Jan 25, 2019
-
-
Brandon Lehmann authored
-
- Dec 06, 2018
-
-
psychocrypt authored
Since #2080 bittube2 is broken. - reintroduce special AES function for bittube2
-
- Dec 03, 2018
-
-
psychocrypt authored
NVIDIA is using clang as device compiler, so the reciprocal optimizations were disabled with #2104. - re-enable optimized reciprocal calculation
-
- Dec 02, 2018
-
-
psychocrypt authored
- fix broken compile: change used `ULL` to `UL` because `UL` is defined as 64bit - fix memory copy to shared memory via vload8 (somehow it creates wrong accesses)
-
- Nov 30, 2018
-
-
psychocrypt authored
Use an optimized reciprocal calculation without a lookup table for non-clang (ROCm) OpenCL.
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
-
- Nov 29, 2018
-
-
LPHuynh authored
-
- Nov 21, 2018
-
-
psychocrypt authored
Use `mul24` to speed up the scratchpad index calculation.
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
-
psychocrypt authored
Add new striding index where the memory is chunked by the size of the work group (worksize).
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
-
psychocrypt authored
small optimization for non cryptonight_v8 algorithms
-
- Nov 20, 2018
-
-
SChernykh authored
- optimize division
-
SChernykh authored
optimize cryptonight_heavy diff
-
psychocrypt authored
- change a few 64bit variables into 32bit. - provide type-qualified defines
-
- Nov 19, 2018
-
-
psychocrypt authored
- remove useless `clFinish` - avoid downloading the number of threads for skein & co. and always start as many threads as in all other kernels (terminate useless threads)
-
psychocrypt authored
Reduce local memory footprint to increase the occupancy.
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
-
- Nov 16, 2018
-
-
psychocrypt authored
define shared memory in the outer scope
-
SChernykh authored
x-ref: https://github.com/xmrig/xmrig-amd/pull/192
-
SChernykh authored
- optimize kernel cn0 and cn2 - optimize vast int math - use more 32bit variables
Co-authored-by: psychocrypt <psychocryptHPC@gmail.com>
-
- Nov 06, 2018
-
-
SChernykh authored
optimize the division in cryptonight_heavy and cryptonight_haven. Import of https://github.com/xmrig/xmrig-amd/pull/185/commits/5d9b9334654df25cea7707f667990fd1577ed290
-
- Oct 16, 2018
-
-
psychocrypt authored
Fix the fix from #1945. The initial fix produces invalid results.
-
- Oct 15, 2018
-
-
psychocrypt authored
The AMD compiler for OpenCL shipped with the driver 14XX is broken and cannot compile xmr-stak since the Monero v8 changes were introduced. - work around a simple compare. - add new device define `OPENCL_DRIVER_MAJOR`
-
- Oct 10, 2018
-
-
psychocrypt authored
The current implementation of the bit align uses a signed integer, which pulls in ones when the sign bit is set. - cast to unsigned integer before using the bit shift
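A minimal sketch of the effect described above, written in plain C rather than OpenCL. The helper names `shift_signed`, `shift_unsigned` and `check_shift` are hypothetical and are not taken from the xmr-stak kernels; they only illustrate why casting to an unsigned type before the right shift avoids pulling in ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: right shift performed on the signed value.
   For negative inputs the result is implementation-defined in ISO C;
   common compilers do an arithmetic shift and fill with ones. */
static uint32_t shift_signed(int32_t x, unsigned n)
{
    return (uint32_t)(x >> n);
}

/* Hypothetical helper: cast to unsigned first, then shift.
   The vacated high bits are always filled with zeros. */
static uint32_t shift_unsigned(int32_t x, unsigned n)
{
    return (uint32_t)x >> n;
}

/* Small self-check for the well-defined (unsigned) variant. */
static void check_shift(void)
{
    /* Sign bit set: the unsigned shift must not pull in ones. */
    assert(shift_unsigned(INT32_MIN, 4) == UINT32_C(0x08000000));
    /* For non-negative inputs both variants agree. */
    assert(shift_signed(0x12345678, 8) == shift_unsigned(0x12345678, 8));
}
```

With a typical arithmetic-shift compiler, `shift_signed(INT32_MIN, 4)` would instead carry the sign bit into the upper four bits, which is the kind of wrong result the commit message refers to.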
-
- Oct 05, 2018
-
-
psychocrypt authored
With ROCm we fought with invalid shares for a very long time. This is now solved with ROCm 1.9 and this tiny fix. It is not fully clear where a memory optimization kicks in and breaks the kernel `Groestl` if the variables `M` and `H` are not `volatile`. The performance will not change with this fix. The fix was tested with ROCm 1.9 on a VEGA64 and an RX570.
-
- Oct 04, 2018
-
-
Tony Butler authored
-
- Sep 30, 2018
-
-
psychocrypt authored
add cpu implementation for the final monero POW
-