-
sech1
1GB pages, HW prefetcher control on Intel
-
sech1
but only on Linux
-
Inge-
wait, doesn't it require changing the prefetcher behaviour in BIOS?
-
sech1
No
-
sech1
it's a command line switch now
-
sech1
It changes the same MSR register as BIOS does
-
Inge-
nice
-
sech1
1GB pages give +1-3% depending on CPU
-
sech1
+1% on Ryzen/Skylake
-
sech1
+3% on AMD Opterons
-
Inge-
so running as root and adding --randomx-wrmsr --randomx-1gb-pages should enable this stuff?
-
Inge-
or rather, don't need to run as root as long as huge pages have been configured ...
-
cohcho
It probably changes the same register; there is no proof.
-
Inge-
hm. * 1GB PAGES disabled even with --randomx-1gb-pages
-
sech1
1GB pages are only available on Linux
-
sech1
and you need to configure them
-
sech1
not all CPUs support them
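Whether a given x86 CPU supports 1GB pages can usually be checked before touching GRUB: Linux lists the `pdpe1gb` flag in `/proc/cpuinfo` when the CPU advertises them. A minimal sketch, assuming an x86 Linux box; the function name is made up for illustration:

```python
def supports_1gb_pages(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line of /proc/cpuinfo text lists pdpe1gb."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first colon is the space-separated flag list.
            _, _, rest = line.partition(":")
            if "pdpe1gb" in rest.split():
                return True
    return False
```

On the mining machine it would be called as `supports_1gb_pages(open("/proc/cpuinfo").read())`.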
-
Inge-
this is a configuration change of how huge pages are set up then I presume?
-
sech1
The easiest way to enable 1GB pages on single CPU systems: GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=2M hugepagesz=1G hugepages=3"
-
sech1
and then sudo update-grub and reboot
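In the GRUB line above, `hugepages=3` applies to the `hugepagesz=1G` that precedes it, while 2M stays the default size. A rough sketch of that pairing rule, useful for sanity-checking `/proc/cmdline` after the reboot; the function name and the `"default"` label are assumptions for illustration, not kernel terminology:

```python
def parse_hugepage_args(cmdline):
    """Map page size -> requested count; hugepages= binds to the preceding hugepagesz=."""
    sizes = {}
    current = "default"  # used when hugepages= appears before any hugepagesz=
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "hugepagesz":
            current = value
        elif key == "hugepages":
            sizes[current] = int(value)
    return sizes
```

For the command line quoted above this yields a request of three 1G pages and nothing else.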
-
Inge-
so checking /proc/meminfo after this, it is correct in still reporting 2M size?
-
Inge-
but if I run as root, it will configure a few 1GB pages
-
cohcho
check number of 1gb pages with `cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages`
-
sech1
or run "hugeadm --pool-list"
-
cohcho
There is no "hugeadm" here.
-
sech1
sudo apt install hugepages
-
sech1
or hugepage
-
sech1
I don't remember exact package name
-
Inge-
yeah found it
-
Inge-
c# cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
-
Inge-
7
-
Inge-
(dual cpu so I set 6 there instead of 3)
-
sech1
And the correct path for NUMA is /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
-
sech1
node0, node1 and so on
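On a multi-socket box it helps to see the 1GB page count per NUMA node rather than the global total. A small sketch that walks the per-node sysfs paths mentioned above; the function name and the `node_root` parameter are illustrative, and the parameter exists only so the code can be pointed at a test directory:

```python
from pathlib import Path

PAGE_DIR = "hugepages/hugepages-1048576kB"

def pages_per_node(node_root="/sys/devices/system/node"):
    """Map NUMA node name -> number of 1GB huge pages currently reserved on it."""
    result = {}
    for node in sorted(Path(node_root).glob("node[0-9]*")):
        counter = node / PAGE_DIR / "nr_hugepages"
        if counter.is_file():
            result[node.name] = int(counter.read_text())
    return result
```

Reading these files does not need root; writing a new count to them does.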
-
Inge-
and node1 for a dual socket ... right
-
sech1
xmrig writes to this file to allocate 1GB pages, it should work under root
-
Inge-
hmm. still says disabled
-
Inge-
Waiting on an apt upgrade to complete so I can install hugepages package
-
sech1
What CPU is it?
-
sech1
And how exactly do you try to enable it?
-
Inge-
Dual E5-2670 (v1)
-
sech1
it should support 1GB pages
-
Inge-
modified /etc/default/grub and ran update-grub and rebooted
-
sech1
Xeon E3 series don't have support
-
Inge-
then ran sudo bash -c "echo 3 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages"
-
sech1
you also need to modify config.json
-
Inge-
and sudo bash -c "echo 3 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages"
-
sech1
find "1gb-pages" there and set it to true
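The config.json change can also be scripted. The chat only says to find `"1gb-pages"` and set it to true, so this sketch searches the whole JSON tree for that key instead of assuming where it is nested; the function name is hypothetical:

```python
import json

def enable_key(node, key="1gb-pages"):
    """Set every occurrence of `key` to True in nested JSON data; return how many were set."""
    changed = 0
    if isinstance(node, dict):
        for k in list(node):
            if k == key:
                node[k] = True
                changed += 1
            else:
                changed += enable_key(node[k], key)
    elif isinstance(node, list):
        for item in node:
            changed += enable_key(item, key)
    return changed

# Typical use (the path is an assumption):
#   with open("config.json") as f: cfg = json.load(f)
#   enable_key(cfg)
#   with open("config.json", "w") as f: json.dump(cfg, f, indent=4)
```

A return value of 0 means the key was not present at all, which is worth checking before restarting the miner.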
-
Inge-
appended these to commandline: --randomx-wrmsr --randomx-1gb-pages
-
Inge-
not using config file
-
Inge-
# hugeadm --pool-list
-
Inge-
Size Minimum Current Maximum Default
-
Inge-
2097152 2560 2560 2560 *
-
Inge-
1073741824 6 6 6
-
sech1
can you paste full miner console output?
-
Inge-
kk
-
Inge-
commandline: ./xmrig -o pool.supportxmr.com:5555 -u wallet -p xyz:bvla⊙nn --coin monero --randomx-wrmsr --randomx-1gb-pages
-
sech1
well, command line doesn't work apparently :D
-
sech1
try with config.json
-
Inge-
hehe ok
-
sech1
take default one from src folder and enable it there
-
Inge-
hm. these CPUs are L2 cache limited it seems. unexpected
-
sech1
I think you forgot to add a number after wrmsr in the command line
-
sech1
try ./xmrig -o pool.supportxmr.com:5555 -u wallet -p xyz:bvla⊙nn --coin monero --randomx-wrmsr 6 --randomx-1gb-pages
-
sech1
yep, confirmed
-
sech1
you forgot to add 6 after --randomx-wrmsr
-
Inge-
a) it works with config.json
-
Inge-
b) adding the 6 also enables it from command line
-
sech1
nice
-
sech1
so what's the hashrate difference?
-
Inge-
it was not clear for me that I needed the 6 by looking at the releases here:
github.com/xmrig/xmrig/releases/tag/v5.2.0
-
Inge-
currently it looks like 0.5%
-
Inge-
6664.8 vs 6630
-
Inge-
note that I already did prefetcher changes in BIOS
-
sech1
Sandy Bridge Xeons should've got the most speedup from it. Maybe they're already limited by CPU and memory access is fast enough.
-
Inge-
now running without root, and it says [2019-12-11 10:20:03.610] rx cannot set MSR 0x01a4 to 0x0006
-
Inge-
not sure if it warns if those settings were already made in BIOS when run as non-root
-
Inge-
hr is identical
-
cohcho
xmrig doesn't revert msr value
-
cohcho
so it's 6 from previous run
-
Inge-
it still warns however
-
cohcho
It warns since read/write msr requires root and xmrig can't update or read msr.
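For reference, the value 6 written to MSR 0x1a4 can be decoded bit by bit. The bit meanings below follow Intel's published prefetcher-control description for Nehalem through Broadwell cores and are assumed to apply to the Sandy Bridge Xeons discussed here. `read_msr` uses the Linux msr driver (`/dev/cpu/N/msr`), which needs root and the `msr` module loaded, which is why the non-root run can neither read nor write the register. Function names are illustrative.

```python
import os
import struct

# A set bit DISABLES the corresponding prefetcher (assumed layout, see lead-in).
PREFETCH_BITS = {
    0: "L2 hardware prefetcher",
    1: "L2 adjacent cache line prefetcher",
    2: "DCU (L1 data) prefetcher",
    3: "DCU IP prefetcher",
}

def disabled_prefetchers(value):
    """List the prefetchers that a given 0x1a4 value turns off."""
    return [name for bit, name in PREFETCH_BITS.items() if (value >> bit) & 1]

def read_msr(reg, cpu=0):
    """Read one 64-bit MSR: 8 bytes at file offset `reg` of /dev/cpu/<cpu>/msr."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)
```

Under that layout, 6 leaves the L2 hardware prefetcher and the DCU IP prefetcher enabled and turns off the other two.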
-
Inge-
ah it can't read it either ...
-
sech1
Someone actually tried to mine with GPUs: 6xRX 580 - 2800 h/s, 400 W
youtu.be/DnLE48xi2k8?t=50
-
Inge-
Vega cards mining as we type: [2019-12-11 12:11:48.415] speed 10s/60s/15m 8811.7 8824.8 8828.7 H/s max 9318.2 H/s
-
Inge-
I wonder if the 3950x can get higher with the msr and 1gb page tweaks:
-
wow-discord1
<sech1> Yes it will. Prefetchers read more stuff from memory than needed so turning them off always improves hashrate.
-
cohcho
sech1, What was the default value of those registers?
-
wow-discord1
<sech1> I didn't check
-
wow-discord1
<sech1> I can reboot now and check
-
cohcho
How long have you bruteforced those registers to find 0x510000 and 0x1808cc16?
-
wow-discord1
<sech1> I didn't, they're from l1/l2 prefetcher options on some AMD BIOS
-
wow-discord1
<sech1> default values: MSR 0xC001102B = 0x2008CC17
-
wow-discord1
<sech1> MSR 0xC0011022=0xC00...0002500000
-
wow-discord1
<sech1> so it changes 2008CC17 -> 1008CC16
-
wow-discord1
<sech1> and 0x500000 -> 0x510000
-
wow-discord1
<sech1> xnbya found these options in BIOS and just took register values from there
-
cohcho
xnbya: Have you looked into firmware asm or just advanced options in UI?
-
sech1
#define MSR_DC_CFG 0xC0011022
-
sech1
so ..22 register is data cache config
-
sech1
#define MSR_BU_CFG3 0xC001102B
-
sech1
no idea how to decipher this abbreviation
-
sech1
but this is for older CPUs (Bulldozer etc.), probably Ryzen just has the same registers
-
sech1
MSRC001_1022 Data Cache Configuration (DC_CFG)
-
sech1
MSRC001_102B Combined Unit Configuration 3 (CU_CFG3)
-
sech1
1022, bit 13: 1=Disable the DC hardware prefetcher.
-
sech1
1022, bit 4 is also interesting: 1=Disable speculative TLB reloads
-
sech1
ohhh my gawwd, bits 10, 22:19 and 23 are even more interesting
-
sech1
so much to test this evening
-
sech1
102B register - no idea, the changed bits (28:29) are not documented there
-
sech1
hmm, bit numbers that are changed are different from what's written here, so I guess Ryzen changed them all
-
sech1
But 0xC0011023 register is also interesting
-
cohcho
0xc001102b: 0x2008cc17 -> 0x1008cc16 or 0x2008cc17 -> 0x1808cc16 ?
-
cohcho
0x1808cc16 at your photo and 0x1008cc16 here in irc are different values
-
sech1
0x1808cc16
-
cohcho
and default ?
-
sech1
2008CC17
-
cohcho
more bits have changed then
-
sech1
bit 28 was set to 0, bits 26:27 were set to 1
-
sech1
and bit 0 was set to 0
-
sech1
no, bit 29 was set to 0, I'm bad at counting
-
cohcho
type binary of both values and check carefully
-
sech1
and bits 27:28 were set to 1
-
cohcho
You can bruteforce only changed bits of 0xc001102b to remove any redundant bits. It's likely only 2 of them should be changed.
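Counting bits by eye is error-prone, as the exchange above shows, so the comparison is easier to do mechanically. A short sketch that lists which bits differ between the default and tuned values of 0xC001102B and enumerates every subset of those flips for brute-force testing; both function names are made up for illustration:

```python
from itertools import combinations

def changed_bits(old, new):
    """Return [(bit, old_bit, new_bit)] for every bit position that differs."""
    diff = old ^ new
    return [(b, (old >> b) & 1, (new >> b) & 1)
            for b in range(diff.bit_length()) if (diff >> b) & 1]

def candidates(old, new):
    """Yield each value obtained by flipping a non-empty subset of the changed bits in `old`."""
    bits = [b for b, _, _ in changed_bits(old, new)]
    for r in range(1, len(bits) + 1):
        for subset in combinations(bits, r):
            value = old
            for b in subset:
                value ^= 1 << b
            yield value
```

For 0x2008CC17 and 0x1808CC16 this reports bits 0, 27, 28 and 29 as changed, which gives 15 candidate values to test, the full tuned value included.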
-
sech1
Zen Perfboost :
bit.ly/2kBs15c (After loading CineBench R15 or Geekbench3 bench then run the boost, just click defaults button after benching to prevent bsod)
-
sech1
it also messes with MSR registers
-
kico
anyone knows HR for a 2950x ?
-
gingeropolous
im mining with gpus sech1 . cause im an idiot
-
sech1
not really, they're nice space heaters in this weather
-
sech1
18% hashrate increase
-
gingeropolous
with the 1 gb pages?
-
gingeropolous
nice
-
sech1
no, it's the magic MSR mod
-
sech1
turning off HW prefetchers
-
sech1
and 1GB pages, yes
-
tevador
yeah, memory starved systems will see a bigger jump in hashrate
-
kico
is it equivalent to disabling in BIOS ?
-
sech1
yes
-
kico
thanks
-
sech1
but my BIOS doesn't have this option
-
kico
oh ok I see :)
-
sech1
hmm, I'm getting 9614 h/s now (up from 9526 h/s)
-
sech1
just changed "Performance Bias" option in BIOS
-
sech1
interesting
-
sech1
I guess it's some other MSR register
-
sech1
how to dump all MSR registers?
-
cohcho
apply rdmsr to all values within [0xc0011000, 0xc00110ff] or even wider range, it should fail for unsupported addresses
-
cohcho
Since there are no docs about available registers
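The dump-and-compare idea can be sketched as two small helpers: one reads every address in a range and skips those that fail, the other diffs two such dumps taken before and after a BIOS change. `read_msr` here is any caller-supplied function that returns the register value or raises `OSError`; on Linux it could wrap an 8-byte read of `/dev/cpu/0/msr` at offset `reg`, which needs root. All names are illustrative.

```python
def dump_range(read_msr, start=0xC0011000, end=0xC00110FF):
    """Read every MSR in [start, end]; addresses whose read raises OSError are skipped."""
    dump = {}
    for reg in range(start, end + 1):
        try:
            dump[reg] = read_msr(reg)
        except OSError:
            pass  # unsupported address on this CPU
    return dump

def diff_dumps(before, after):
    """Return {reg: (old, new)} for registers whose value differs between two dumps."""
    return {
        reg: (before.get(reg), after.get(reg))
        for reg in sorted(before.keys() | after.keys())
        if before.get(reg) != after.get(reg)
    }
```

Taking one dump per BIOS setting and diffing them narrows the search to the handful of registers the firmware actually touches.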
-
sech1
yep
-
sech1
I'll try all performance bias options and then dump the best one
-
sech1
"aida/geekbench" performance bias got me down to 7000 h/s :D
-
sech1
so 9614 h/s was the best, I'll dump registers and start comparing
-
cohcho
Do you have direct link to the firmware you're using?
-
sech1
"Performance bias = CBR15 gentle" was the best option
-
sech1
I think I've found it
-
sech1
another MSR register, I just need to retest it to confirm
-
sech1
sudo wrmsr -a 0xC0011021 0x40
-
sech1
9601 h/s
-
sech1
so I think some other register adds 13 h/s
-
sech1
sudo wrmsr -a 0xC0011020 0
-
sech1
this is it
-
sech1
9614 h/s
-
sech1
MSRC001_1020 Load-Store Configuration (LS_CFG)
-
sech1
bit 28: 1=Disable streaming store functionality.
-
sech1
MSRC001_1021 Instruction Cache Configuration
-
tevador
looking for that last +0.1%
-
tevador
:D
-
sech1
hey, it's +0.9%
-
cohcho
It isn't clear yet whether this is the last +0.1%
-
sech1
It's not for Ryzens, but register names there are what to look for
-
sech1
MSRC001_1xxx group
-
sech1
MSRC001_102D Load-Store Configuration 2
-
sech1
ForceSmcCheckFlwStDis: 0=Force a self modifying code check when a cache probe hits a store that has not retired
-
sech1
interesting
-
sech1
but probably won't give anything
-
cohcho
I got a bit scared when I saw all those configurable things inside the CPU 1 month ago. Are you really sure that your current CPU configuration is within 10% of the best?
-
sech1
I'm sure that 9850 h/s is the limit for my 3700X @ 4.1 GHz
-
sech1
because this is where it's bottlenecked by instruction execution, not memory
-
sech1
I tested it at 2.05 GHz and got 4925 h/s, so this is how I calculated it
-
sech1
so it's within 2.5% from the best
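The estimate above is a linear extrapolation: at 2.05 GHz the CPU is assumed to be purely compute-bound, so the hashrate should scale with core clock if memory never got in the way. Worked through with the numbers from the chat (the function name is illustrative):

```python
def scaled_limit(hashrate_low, clock_low_ghz, clock_target_ghz):
    """Extrapolate a compute-bound hashrate linearly with core clock."""
    return hashrate_low * clock_target_ghz / clock_low_ghz

limit = scaled_limit(4925, 2.05, 4.1)  # compute-bound ceiling at 4.1 GHz
gap = 1 - 9614 / limit                 # shortfall of the measured 9614 h/s
print(f"limit = {limit:.0f} h/s, gap = {gap:.1%}")
```

The ceiling comes out at 9850 h/s and the measured 9614 h/s sits roughly 2.4% below it, in line with the figure quoted above.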
-
cohcho
This approach test&dump is much faster than reading firmware asm.
-
sech1
damn, so much progress since even 1 month ago
-
sech1
1 month ago 8700 h/s was the limit for me
-
sech1
I'm getting consistent 600+ h/s per thread, beautiful :)
-
sech1
1200+ h/s per physical core on all cores
-
tevador
getting closer to the ideal CPU ASIC performance
-
sech1
yeap
-
sech1
this is the idea, right?
-
sech1
I guess I can set it to 3.6-3.8 GHz, undervolt and it'll not be bottlenecked by memory anymore
-
hyc
nice
-
sech1
I wonder what 3950X could do with all this tuning...
-
gingeropolous
9 bajillion h/s
-
sech1
18-19 kh/s is possible
-
tevador
yes, that's the goal, to squeeze the last bit of performance out of the CPU hardware
-
tevador
to make it harder for ASICs
-
sech1
9614 h/s, 156 W at the wall (but it's 51W when idle)
-
sech1
not very efficient, but I'll try to find more efficient speed/voltage tomorrow
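To put the efficiency remark in numbers, hashes per watt can be computed for the figures quoted in this log: the tuned CPU at the wall, the same CPU with idle draw subtracted, and the 6x RX 580 rig mentioned earlier. This is a back-of-the-envelope sketch; subtracting idle power is an assumption about what counts as mining cost.

```python
def hashes_per_watt(hashrate, watts):
    """Efficiency in h/s per watt."""
    return hashrate / watts

cpu_wall = hashes_per_watt(9614, 156)      # whole system, measured at the wall
cpu_net = hashes_per_watt(9614, 156 - 51)  # only the power added on top of idle
gpu_rig = hashes_per_watt(2800, 400)       # 6x RX 580 figure from the video
print(f"{cpu_wall:.1f} {cpu_net:.1f} {gpu_rig:.1f} h/s per W")
```

By either CPU measure the GPU rig is several times less efficient on RandomX.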
-
tevador
energy-wise, the bitcoin network is roughly 360x more secure than Monero
-
tevador
based on the most efficient hardware
-
tevador
... botnets ... botnet mining ... more botnets ... battle against botnets ...