-
<sech1> tevador: you might be interested: xmrig/xmrig #1986
-
<tevador> sech1: is it basically due to faster mov? dataset init only happens twice per week anyways
-
<sech1> It's faster because it initializes 5 items at a time (1 item using integer registers, 4 items using 256-bit AVX registers)
-
<sech1> so it also has more time to prefetch data from cache
-
<sech1> 5 parallel fetches per thread
-
<sech1> but it's not faster on all CPUs...
-
<sech1> It would've been much much faster if there were 64-bit multiplication instructions for AVX...
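Note on the missing 64-bit multiply: AVX2 does offer a 32x32 -> 64-bit unsigned multiply per 64-bit lane (`_mm256_mul_epu32`), but no full 64-bit lane multiply; that only arrived with AVX-512DQ. A 64-bit product therefore has to be assembled from several 32-bit ones. The sketch below is a portable scalar model of that decomposition for one lane. It is not the code from the PR, and the function names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: unsigned 32x32 -> 64-bit multiply of the low halves
// of a and b. This models what _mm256_mul_epu32 does in each 64-bit lane.
static inline uint64_t mul32x32(uint64_t a, uint64_t b) {
    return (a & 0xFFFFFFFFull) * (b & 0xFFFFFFFFull);
}

// Low 64 bits of a 64x64 product, built from three 32x32 multiplies:
//   a*b mod 2^64 = lo(a)*lo(b) + ((lo(a)*hi(b) + hi(a)*lo(b)) << 32)
// The hi(a)*hi(b) term only affects bits 64 and up, so it is dropped.
static inline uint64_t mullo64_emulated(uint64_t a, uint64_t b) {
    const uint64_t lo_lo = mul32x32(a, b);
    const uint64_t cross = mul32x32(a, b >> 32) + mul32x32(a >> 32, b);
    return lo_lo + (cross << 32);  // unsigned wraparound is intended
}
```

Three multiplies plus shifts and adds per lane, instead of one instruction, is the kind of overhead the remark above is pointing at.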
-
<sech1> Basically the main loop is ~4 times slower than the regular loop, but it initializes 5 elements
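Note: taking those two figures at face value, and ignoring any extra gain from the overlapped prefetches, the expected throughput gain works out as follows. Here $t$ is an assumed name for the time of one regular loop iteration (one item).

```latex
\[
  \text{speedup} \approx \frac{5 \text{ items} / (4t)}{1 \text{ item} / t} = \frac{5}{4} = 1.25
\]
```

So the ceiling from this effect alone is roughly 25%, which is consistent with the remark that it is not faster on every CPU.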
-
<hyc> older CPUs with only 128-bit pathways for AVX
-
<tevador> I guess I could try to test AVX2 with HashX, it could be a bit faster there since it doesn't use mulh
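Note on mulh: assuming "mulh" here means the high 64 bits of a 64x64 -> 128-bit product, it is even more awkward to vectorize than the low half, because all four 32-bit partial products are needed and the carries have to be propagated by hand. A scalar sketch of that decomposition follows; the function name is hypothetical, and this is an illustration rather than code from either project.

```cpp
#include <cstdint>

// High 64 bits of an unsigned 64x64 product, using only 32x32 -> 64 multiplies.
static inline uint64_t mulh64_emulated(uint64_t a, uint64_t b) {
    const uint64_t a_lo = a & 0xFFFFFFFFull, a_hi = a >> 32;
    const uint64_t b_lo = b & 0xFFFFFFFFull, b_hi = b >> 32;

    // Four partial products, each fits in 64 bits.
    const uint64_t lo_lo = a_lo * b_lo;
    const uint64_t hi_lo = a_hi * b_lo;
    const uint64_t lo_hi = a_lo * b_hi;
    const uint64_t hi_hi = a_hi * b_hi;

    // Sum everything that lands in bits 32..63 of the full product.
    // At most 3 * (2^32 - 1), so this cannot overflow 64 bits.
    const uint64_t mid = (lo_lo >> 32)
                       + (hi_lo & 0xFFFFFFFFull)
                       + (lo_hi & 0xFFFFFFFFull);

    // The carry out of the middle column (mid >> 32) goes into the upper half.
    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
}
```

That is four multiplies plus a carry chain per lane, so a workload without mulh leaves less of this overhead to pay in an AVX2 path.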