Forum Discussion
@Ariuc Looks like you've done your homework.
I just want to make sure that other users who are not on higher-tier hardware like yours, and who are not well versed in what loadline calibration actually does, don't rush into setting a 0 mOhm loadline calibration (LLC level 8, Ultra Extreme, etc.) for their CPUs and think "wow this is the best thing ever, why don't they all come like this?"
Raja, one of the engineers at Asus, taught me (back when their forums still allowed private messages) what the AC and DC loadlines do on auto/offset voltages. These are not to be confused with Loadline Calibration, or LLC, also known as VRM loadline or just "Vcore Loadline Calibration". He also explained the dangers of removing too much vdroop. Asus and other ODMs allowed such high levels of vdroop removal because "people kept begging them for it", so they gave the community what they wanted, even though they knew most end users had no idea how loadlines actually work. Who here on this thread understood that Loadline Calibration is in mOhms (the same resistance unit as the AC and DC loadlines), and what the actual "levels" really meant? See?
For Gigabyte Z390 boards using 8 core processors, here are the resistance values for each level of Loadline Calibration:
Standard / Auto / Normal: 1.6 mOhms
Low: 1.3 mOhms
Medium: 1.0 mOhms
High: 0.8 mOhms
Turbo: 0.4 mOhms
Extreme: 0.2 mOhms
Ultra Extreme: 0.01 mOhms.
Vdroop = Amps * Resistance. Gigabyte boards have amps monitoring via the IR 35201 voltage controller (or the Intersil controllers) in HWiNFO64 (Asus does not, sadly, and even the EVGA Z390 Dark doesn't seem to), so if you see your amps under IR 35201, multiply that by your LLC resistance in mOhms to get the drop in millivolts, and then subtract that from your BIOS set voltage in millivolts. Example:
1.30v=1300mv, turbo LLC=0.4 mOhms. 100 amps load: 0.4 * 100 = 40. 40mv drop. 1300-40=1260mv=1.260v. Your VR VOUT should read 1.260v. That's your load voltage with 100 amps. Ignore the ITE 8688E sensor (Super I/O chip) and 8792E sensor (more accurate but won't show accurate vdroop).
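The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the Vdroop = Amps * Resistance rule using the Gigabyte Z390 resistance values listed earlier; the function and dictionary names are made up for this example, and real readings will also depend on sensor accuracy.

```python
# LLC resistances in mOhms, as quoted above for Gigabyte Z390 boards with 8-core CPUs.
# The dictionary name is hypothetical, chosen for this sketch.
GIGABYTE_Z390_LLC_MOHM = {
    "Standard": 1.6,
    "Low": 1.3,
    "Medium": 1.0,
    "High": 0.8,
    "Turbo": 0.4,
    "Extreme": 0.2,
    "Ultra Extreme": 0.01,
}


def load_voltage_mv(bios_mv: float, amps: float, llc_mohm: float) -> float:
    """Estimate load voltage in mV: BIOS voltage minus vdroop (amps * mOhms = mV)."""
    vdroop_mv = amps * llc_mohm
    return bios_mv - vdroop_mv


if __name__ == "__main__":
    # Same example as in the post: 1.30v in BIOS, 100 amps of load.
    for level, mohm in GIGABYTE_Z390_LLC_MOHM.items():
        print(f"{level:>13}: {load_voltage_mv(1300, 100, mohm):.0f} mV at 100 A")
```

With Turbo (0.4 mOhms) this gives the same 1260 mV as the worked example, which is what VR VOUT should read at 100 amps.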
It's also not Asus or Gigabyte's or other ODM's fault that transient spikes and drops happen. Asus doesn't make the VRM's. The companies like DrMos, Osemi(?), International Rectifier, Renesas (Intersil), etc, make them. You have the voltage controller itself and then the actual power mosfets. But these VRM's are not spec'd to run with a 0 mOhm loadline at all. Even the official datasheets show the loadline information and none have a 0 mOhm loadline as part of the specifications. So that's something worth noting.
I assume since you're a gamer, you probably have Discord. You can add me if you wish, with this tag at the end of my exact same username: #6092
One way you can test for transient spikes is to use the hardest stress test possible, in this case, Prime95 with FMA3 or AVX (I like testing 15K in place fixed AVX or 21K or 22K fixed, or even FMA3), and find the VMIN first, which takes a lot of work (basically, to save time, the absolute minimum voltage you can set in bios, *WITH* LLC5 or LLC6 vdroop), to not crash a prime thread at load in 15 minutes.
https://mersenneforum.org/showthread.php?t=24094
This version of Prime95 allows you to enable or disable AVX and FMA3 right in the stress test without editing TXT files (I recommend that anyone who enjoys torturing their CPUs grab it). So, for example, do a Prime95 test with FMA3 enabled, 15K fixed in place FFT (custom, min and max range 15K), with, let's say, 1.30v and LLC6, and start there. If you INSTANTLY reach 100C and BSOD, disable AVX2 (FMA3) and just do AVX instead. If you pass 15 minutes, reduce voltage by 10mv, reboot and try again until it crashes. Then record the **LOAD** VOLTAGE (not BIOS voltage!) that it took to pass. Your motherboard has a good BIOS which reads CPU voltage properly at load, as do Gigabyte Z390 boards (VR VOUT sensor in HWiNFO, for anyone reading this!), and some MSI boards have VR VOUT too. For non Asus WS / Maximus Z390 boards, if you have VR VOUT access in HWiNFO64, use that for load voltage, guys.
Anyway, record this voltage. So in your case, if you set 1.30v in BIOS with LLC6, it might be 1.225v at full load on your Asus WS. If that's your stable Prime95 15K AVX FFT (29.8 build 3) test, now switch to Ultra Extreme.
And set your BIOS voltage to 1.230v, with LLC8 or Gigabyte Ultra Extreme. Yes, 1.230v. Because that was your load voltage, remember? (Controllers usually go in 5mv steps, so I'm giving a cushion for accuracy.)
Then run LLC8 + 1.230v. Do Prime95 AVX 15K in place fixed FFT (custom, min and max range 15K).
Your system won't last long. (If 1.230v were your actual load VMIN). You will either BSOD or a thread will crash in the first 5 minutes. That's because of transient drops. Even though it looks like Prime95 is putting a constant full even load on your CPU, the load actually changes very often, too fast to register, thus transient drop=crash.
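For anyone who wants the VMIN hunt from earlier in this post written down as a loop, here is a rough Python sketch. It can't actually set BIOS voltages or run Prime95, so the `passes` callback is a hypothetical stand-in for "set this voltage in BIOS, reboot, run the 15 minute test, report whether it survived". The `floor_mv` safety limit is also an assumption added for the sketch, not part of the procedure above.

```python
from typing import Callable, Optional


def find_vmin_bios_mv(
    passes: Callable[[int], bool],
    start_mv: int = 1300,
    step_mv: int = 10,
    floor_mv: int = 1000,
) -> Optional[int]:
    """Step the BIOS voltage down until the stress test fails.

    Returns the lowest BIOS setting (in mV) that still passed,
    or None if even the starting voltage failed.
    """
    lowest_pass = None
    mv = start_mv
    while mv >= floor_mv and passes(mv):
        lowest_pass = mv   # this setting survived the 15 minute run
        mv -= step_mv      # reduce by 10 mV and try again
    return lowest_pass


if __name__ == "__main__":
    # Stand-in for a real test: pretend the chip crashes below 1270 mV BIOS voltage.
    fake_stress_test = lambda mv: mv >= 1270
    print(find_vmin_bios_mv(fake_stress_test))  # 1270
```

The number this returns is still a BIOS setting. What you write down is the load voltage (VR VOUT) you saw at that setting, since that is the value you carry over to the flat loadline test.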
Feel free to message me on discord (remember, my username with #6092 at the end, Falkentyne) if you want to mess with this XD
And now that I derailed this thread talking about VRM's....back to your standard programming!
crash:
{
!!!unknown-module!!!: 00007FF7457C4823
EXCEPTION_ACCESS_VIOLATION(execute): 00007FF7457C4823
R5Apex: 00000000003147D1
}
cpu: "Intel(R) Xeon(R) CPU X5650 @ 2.67GHz"
ram: 16 // GB
callstack:
{
KERNELBASE: 000000000008667C
ntdll: 00000000000A810B
ntdll: 000000000008FD56
ntdll: 00000000000A46AF
ntdll: 0000000000004BEF
ntdll: 00000000000A341E
!!!unknown-module!!!: 00007FF7457C4823
}
registers:
{
rax = 5
rbx = 0x0000005AF140F680
rcx = 0x00007FF671F10000
rdx = 0x0000005AF140F680
rsp = 0x0000005AF140F5A8
rbp = 0x0000005AF140F6B0
rsi = 0x0000005AF140F760
rdi = 5
r8 = 1
r9 = 0
r10 = 0x00007FF7457C4823
r11 = 0x0000005AF140F760
r12 = 0x0000005AF140F7B0
r13 = 0x000001D6D1BE7220
r14 = 0x000001D75266FDE0
r15 = 0x00007FF69AFFB7C8
rip = 0x00007FF7457C4823
xmm0 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm1 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm2 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm3 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm4 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm5 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm6 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm7 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm8 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm9 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm10 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm11 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm12 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm13 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm14 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
xmm15 = [ [0, 0, 0, 0], [0, 0, 0, 0] ]
}
build_id: 1554860081
- 7 years ago
Still no ETA on patches? 😢
- OrioStorm 7 years ago
EA Staff (Retired)
@RascabuchesCH, the patch is working its way through the release pipeline. Hopefully it won't be much longer.
- 7 years ago
@OrioStorm I just got the weirdest error ever.
Check this one out (an "execute" error).
It said something like the CPU memory controller didn't want to execute one address while the CPU wanted (or didn't want) to execute another.
- OrioStorm 7 years ago
EA Staff (Retired)
@Falkentyne, that's another bug that tended to happen in the same function. The crash referenced these two memory addresses:
0x00007FF79DD561B3 -- this is the memory address of the next instruction to execute
0x00007FF78CD061B3 -- this is the memory address that it claimed it did not have permission to execute
You'll notice that those numbers are very similar, except some of the lower 32 bits that are set in the top number are clear in the bottom number. For example, look at the first place the two numbers differ. 9D in hex is 1001_1101 in binary. 8C in hex is 1000_1100 in binary. The least significant bit goes from 1 to 0 in each of those digits. Later, 5 becomes 0, which goes from 0101 to 0000 in binary. (You can put your Windows Calculator into Programmer mode and let it do the conversions between decimal / hex / binary for you if you want to see for yourself.)
So, this is another example of a CPU not behaving properly. The instruction that the CPU wants to execute is the instruction at RIP, by definition. (RIP = Instruction Pointer Register. In 16-bit days it was just IP, then in 32-bit days it became EIP for Extended Instruction Pointer, and in 64-bit days it's RIP.) But it crashed saying it didn't have permission to execute an instruction NOT at RIP, which is a memory location it shouldn't have wanted to execute in the first place.
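The bit comparison described above can also be checked with a short Python snippet instead of the Windows Calculator. This is just a sketch using the two addresses quoted in this post; the variable names are chosen for illustration.

```python
# The two addresses quoted above.
rip = 0x00007FF79DD561B3        # next instruction to execute
faulting = 0x00007FF78CD061B3   # address it claimed it could not execute

# XOR leaves a 1 in every bit position where the two addresses differ.
diff = rip ^ faulting
flipped_bits = [i for i in range(64) if (diff >> i) & 1]

print(hex(diff))       # 0x11050000
print(flipped_bits)    # [16, 18, 24, 28]

# Every differing bit is set in RIP and clear in the faulting address.
print((rip & diff) == diff and (faulting & diff) == 0)  # True
```

All four differing bits sit in the lower 32 bits, and each one went from 1 to 0, which matches the 9D to 8C and 5 to 0 changes walked through above.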
- OrioStorm 7 years ago
EA Staff (Retired)
The patch that includes the work around for the Intel CPU crashes went live at 10 AM Pacific time today. It is version 1.1.3.
I'd be very interested in learning whether this fixes the crashes for everybody, even without lowering CPU speed.
- 7 years ago
Had a crash on every game played since the release of the patch. The first was a freeze with no crash report, the second time the program closed itself after a match, and the third generated a crash report. 2 other reports enclosed, taken from today. I have re-installed countless times, re-installed Windows, and tried other workarounds. Hopefully it gets fixed soon.
- OrioStorm 7 years ago
EA Staff (Retired)
@Ruon-21, the first two crashes are from the previous patch.
The first crash is in our loading code; it is populating memory, but the OS says it doesn't have permission to write to that memory. I don't know why it would crash this one time and work every other time; I'll have to see if there are any further clues in the crash info.
The second crash from the prior patch is inside the Windows OS called from inside DX11 called from when Apex was freeing a texture that it no longer needed in memory. Unfortunately, I can't really figure out anything more from this crash since it happened so deep inside 3rd party code.
The last one is from 1.1.3, but it is in an unknown DLL. Basically, we asked Windows what the DLL name was, and it said "I don't know". I can tell that this is not a thread that Apex creates, because there are no entries in "R5apex" at the bottom between "KERNEL32" and "module@...". The "KERNEL32" line is when the OS creates the thread that crashed, and the "module@..." line is the entry point in whatever 3rd party DLL it was that crashed for you. Unfortunately, if Windows can't tell us the name of the DLL that crashed, we can't tell you how to avoid the crash.
- 7 years ago
Thank you for the feedback.
That crash I mentioned the other day that you commented on happened at 5.1 GHz @ 1.360v BIOS setting (loadline calibration = Turbo, so load voltage was probably about 1.340v).
I did another run after that crash at the same voltage that day (same Windows session) and got a "CPU Internal Parity Error" (WHEA Correctable).
Not 100% sure this is the exact timestamp of that error, but this happened *after* the crash I had posted.
A corrected hardware error has occurred.
Reported by component: Processor Core
Error Source: Corrected Machine Check
Error Type: Internal parity error
Processor APIC ID: 10
The details view of this entry contains further information.
So whatever Intel bug was being triggered was changing bits somewhere, and if it was "corrected", the game would continue playing and a WHEA would be logged. If it wasn't corrected, then you get one of those two 2F2DA breakpoints or something like what you mentioned.
With the new patch out:
I *REDUCED* the CPU core voltage at 5.1 GHz from 1.360v to 1.335v.
Been playing 1 hour 40 minutes. ZERO crashes so far.
ZERO WHEA "Internal Parity Errors" generated.
So far it's been an improvement but I've seen "long sessions" in the past without WHEA parity correctable errors, then suddenly the game crashed when I got my hopes up.
But this is an improvement. I'll keep playing and keep testing the results. But preliminary answer is: "Good so far!"
Is Intel acknowledging this bug in their firmware? I know how stubborn Intel can be...
I know how to get new microcode ahead of time (before it's put into bioses), that's easy: Just download VMWare microcode updater and the microcodes as discussed here:
https://www.win-raid.com/t3355f47-Intel-AMD-amp-VIA-CPU-Microcode-Repositories-Discussion.html
and then the DAT to bin converter over here:
https://onedrive.live.com/?authkey=!AE_9xt1wnaLT5lk&id=11F4002E1134F403!617750&cid=11F4002E1134F403
Then you can download the microcode patcher itself (that has a current DAT file conversion already) from here:
https://mega.nz/#!gZBzBIib!wNZwqhegXl1FME7h5HLhsfAT55Xk_EyTN6QNBo7l6Qo
For new microcodes that are not in "Microcode.dat", you can convert the bin files from the archives to a new DAT file, then patch it with the VMware microcode updater from Mega, after you rename your new DAT file to "microcode.dat" and copy it into the cpucodeupdater folder manually.
Anyway tl;dr: it's a definite improvement *so far*. I'll keep playing all day until I find how stable it is.
- 7 years ago
@OrioStorm thanks for the super fast reply man, kudos! I'll post any more crash reports I get. I have since repaired the game and have not had a crash since, but it's so unpredictable I'm sure more will occur.
- 7 years ago
A crash... Again.....
- OrioStorm 7 years ago
EA Staff (Retired)
@POF_Wyjakx, the CPU reported an illegal instruction at an address that has a legal instruction. This is in code that runs every single frame, where the instructions aren't changing. This shouldn't be possible if everything is working right, so the most likely causes are hardware issues, such as overclocking or memory that's slowly dying.
Are you overclocking your CPU by chance?
- 7 years ago
This is the error I get. Nothing fixes it. I have tried everything and it had gone away and now it came back again!!!