Commit fd27561b authored by psychocrypt

NVIDIA: optimize v8

- fix shared memory for fast div always being used, even when an algorithm does not use it
- optimize the fast div algorithm
- store `division_result` (64-bit) per thread instead of shuffling it around and storing it as 32-bit
parent 659918f2