Forum Discussion

7 years ago

Apex Legends Crash no error - PC (apex_crash.txt)

So I just got a new crash from Apex Legends since reinstalling today. Thank God it produced an apex_crash.txt. Now can someone help me understand this poorly optimized crash fest of a game? Chee...

Attachment: apex_crash.txt (2 KB)

Falkentyne

7 years ago

@OrioStorm

You said that it's doing a loop, right?
Aren't loops heavily cache-intensive, since they keep pulling instructions that have already been executed?

I did a test last night.

I set my 9900K to 5.2 GHz (Hyper-Threading off) and set my CPU voltage -extremely- high: 1.385v, high enough to be Prime95 AVX stable at fixed 1344K FFTs.

(I even did a 30-minute stress test. Temps were reaching 90C with no problems, but it's not safe to test at those voltages.)

That's what you would consider stable, right?

Nope. Apex crashed to desktop with no error.

Then I set it to 1.390v. Apex crashed with the "usual" 2DFA hex error (forgot exactly, but you know it) that many people are getting.

OK, something isn't right if I'm Prime95 AVX 1344K stable but Apex is crashing.

So then I set it to 1.395v. This is getting into dangerous territory.

Guess what happened?

There were no crashes, but an "Internal Parity Error" was logged on CPU core #2 (going from cores 0 to 7).

So then I set the voltage down to 1.335v.

Got a "memory can not be read" error (attached here).

I then increased CPU PLL Overvoltage (+mV) to +160 mV. (I don't even know what this setting does; it has something to do with "clipping" the PLL so the CPU receives more than other devices that use the PLL.) I tested 1.335v again and got three Internal Parity Errors.

So if voltage isn't helping (Apex should NOT be crashing when Prime95 isn't crashing!) and you keep mentioning "loops", I realized it had to be SOMETHING else--possibly CACHE related.

Downclocking my RAM to 2133 MHz (it's a 3200 MHz CAS 14 G.Skill kit) didn't do anything.

So I increased VCCIO and VCCSA voltages to 1.25v and put the CPU voltage back at the unstable 1.335v.

And....

1 hour at 5.2 ghz HT off (1.335v): No crashes or parity errors.

So then I tried 5.1 ghz, HT on (1.340v). No errors.

So this means that the crashes are related to the L3 cache, not the L2 cache. I know that the VCCIO voltage controls both the memory controller and the shared L3 cache. Since downclocking the RAM didn't help, the memory controller isn't the issue. It isn't the cache speed either: earlier I set my system to 4.7 GHz core, 4.7 GHz cache, 1.230v (Loadline calibration=High), which would cause a clock watchdog timeout in Prime95 FMA3, but Apex ran just fine.

I'm guessing that at higher clockspeeds, the L3 cache needs to run faster because the core is running faster. So if Apex Legends is making heavy use of the cache (you mentioned looping instructions), then that means VCCIO needs to be increased, not CPU Vcore. I'm still testing this though. I think this can explain why other users are passing stress tests but Apex is crashing. As far as I know, many stress tests make HEAVY use of the L1 and L2 caches, which are directly tied to CPU Vcore. L3 cache isn't tied to CPU Vcore, but to VCCIO.

Stock VCCIO is 0.95v and stock VCCSA is 1.05v.

I think people can try increasing both VCCIO and VCCSA to 1.25v and see if their crashes stop.
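
For anyone trying to reproduce this, the runs described above can be written down as a small table. This is only a sketch: the `Trial` and `summarize` names are made up for illustration, the values are copied from the post, and it assumes that VCCIO/VCCSA had not been raised in the first five runs and that the 5.1 GHz run kept them at 1.25v, which the post implies but does not state. The RAM downclock test is left out because its voltages are not given.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    core_ghz: float
    vcore: float        # CPU core voltage in volts
    raised_io_sa: bool  # True if VCCIO/VCCSA were raised to 1.25v (partly assumed)
    clean: bool         # True if no crash and no logged error
    note: str


# Values copied from the post; anything not stated there is an assumption.
TRIALS = [
    Trial(5.2, 1.385, False, False, "crash to desktop, no error"),
    Trial(5.2, 1.390, False, False, "crash with the usual hex error"),
    Trial(5.2, 1.395, False, False, "no crash, Internal Parity Error on core #2"),
    Trial(5.2, 1.335, False, False, "memory can not be read error"),
    Trial(5.2, 1.335, False, False, "PLL overvoltage +160 mV, three Internal Parity Errors"),
    Trial(5.2, 1.335, True, True, "1 hour, HT off, no crashes or parity errors"),
    Trial(5.1, 1.340, True, True, "HT on, no errors"),
]


def summarize(trials):
    """Return {raised_io_sa: (clean_runs, total_runs)}."""
    counts = {}
    for t in trials:
        clean, total = counts.get(t.raised_io_sa, (0, 0))
        counts[t.raised_io_sa] = (clean + int(t.clean), total + 1)
    return counts


if __name__ == "__main__":
    for raised, (clean, total) in sorted(summarize(TRIALS).items()):
        label = "VCCIO/VCCSA at 1.25v" if raised else "VCCIO/VCCSA not raised"
        print(f"{label}: {clean}/{total} clean runs")
```

Seven runs on one machine is a very small sample, so this only shows the pattern in the post. It does not prove the cause.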

Attachment: apex_crash.txt (2 KB)

OrioStorm

EA Staff (Retired)

7 years ago

@Falkentyne, it is doing a loop, so the data will be coming through L3 cache. Some of the data never changes, and some is being produced in another thread.

However, the crashes are all related to instructions, not data. The instruction cache is 256 kiB on an i9-9900K, which is more than enough to hold this function. So normally the whole function will be in the instruction cache after the first iteration. This crash always occurs after a different number of iterations, but the fewest I've seen is still 137 iterations.

Now, that doesn't mean the problem CAN'T be the cache. Windows is a preempting OS, so it can switch to another program and/or migrate this thread to another core. If either of those happens, it effectively flushes the L1 instruction cache, so it has to refill, which probably will go through L3 cache.

On the other hand, if it was the cache and not related to the actual instruction sequence, I would expect this bug to show up throughout the executable. Instead, they're always in this one function. So even if L3 cache is a factor, it seems like it may just affect timing, which also appears to be a factor. Also, if it was cache related, I'd expect

OrioStorm
EA Staff (Retired)
7 years ago
Sorry, my phone scrolled to put the submit button where the keyboard used to be.
Anyway, I'd expect the function to tend to crash on the first time through the loop if it was L3 cache related, since that's when the instructions are coming through L3.
I really appreciate these experiments @Falkentyne, they're very helpful and informative!
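
The instruction-cache argument above can be illustrated with a toy model. This is a sketch, not how the CPU or the game actually behaves: `l1i_refills`, `function_lines`, and `flush_before` are hypothetical names. It assumes the whole function fits in the L1 instruction cache and that a preemption or core migration empties that cache completely, and it ignores prefetching, L2, and partial eviction.

```python
def l1i_refills(iterations, function_lines, flush_before=()):
    """Count how many cache lines must be refilled from outside L1I per iteration.

    Toy assumptions: the function fits entirely in L1I, and each iteration
    listed in flush_before starts with an empty L1I (preemption or migration).
    """
    flush_before = set(flush_before)
    cached = False
    refills = []
    for i in range(iterations):
        if i in flush_before:
            cached = False  # the thread was preempted or moved to another core
        if cached:
            refills.append(0)  # every instruction fetch hits L1I
        else:
            refills.append(function_lines)  # whole function is fetched again
            cached = True
    return refills


if __name__ == "__main__":
    # 5 iterations of a 40-line function, with a flush just before iteration 3
    print(l1i_refills(5, 40, flush_before={3}))
```

In this model only the first iteration and any iteration right after a flush fetch instructions from beyond L1I. That is why a purely L3-related fault would be expected to show up on the first pass rather than after 137 or more iterations.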
Falkentyne
7 years ago
@OrioStorm
I don't know if this helps, but if the core voltage is far too low, Apex will sometimes throw a WHEA-logged correctable "CPU TLB" ("Translation Lookaside Buffer") error. (The game doesn't crash at the same time this error happens, however.)
What would cause Apex Legends to throw out this error?
This is all I can find:
https://en.wikipedia.org/wiki/Translation_lookaside_buffer
I really wish I had learned programming. What Apex is doing is very interesting.
Everyone thought it was AVX instructions but it seems to be SSE2 or other things.