Forum Discussion
As more general info, the crash is happening inside a loop to decide what LOD to use for a bunch of models, and whether they should fade out. Looking at @Benef1cient's apex_crash.txt, I see that iteration 883 crashed processing model 96651. Looking at @TEZZ0FIN0's crash, I see that iteration 450 crashed processing model 24032. Looking at @Vaestroo's crash, he made it to iteration 3768 and model 408479.
There's no way through the loop that doesn't hit the instruction that crashes, so we've had hundreds or even thousands of iterations that didn't crash before we hit one that did.
http://oi63.tinypic.com/2wret5g.jpg
It's certainly interesting that you mention this from the crash log.
- OrioStorm 7 years ago
EA Staff (Retired)
The code I mentioned is actually LOD for the stationary models, not the characters that move around.
I think the bug in your screenshot is something else. We store lighting at various points in space. We use the closest sample point to light a character. Sometimes you're outside on a roof and the closest point is inside near a dark ceiling. We work hard to avoid this in how we automatically place the sample points, but it isn't perfect.
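The closest-sample behavior described above can be sketched roughly as follows. This is only an illustration: the sample positions, brightness values, and the `closest_sample` helper are assumptions made up for the example, not the game's data or code.

```python
import math

# Hypothetical lighting samples as (position, brightness) pairs.
# These values are illustrative assumptions, not real game data.
SAMPLES = [
    ((0.0, 0.0, 2.9), 0.1),   # indoors, just below a dark ceiling
    ((12.0, 0.0, 4.0), 1.0),  # outdoors, in open light
]


def closest_sample(position, samples):
    """Return the (sample_position, brightness) pair nearest to position."""
    return min(samples, key=lambda s: math.dist(position, s[0]))


# A character standing on the roof, directly above the indoor sample.
on_roof = (0.0, 0.0, 3.5)
_, brightness = closest_sample(on_roof, SAMPLES)
print(brightness)
```

Because the indoor sample is only 0.6 units away while the outdoor one is about 12 units away, the rooftop character picks up the dark indoor lighting.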
- 7 years ago
I want to thank you GREATLY for the time that you've spent here. It's VERY rare these days for programmers and developers to engage with gamers like you are doing, and this is very appreciated. Back during the glory days of Usenet (before the flight sim and space sim genres crashed), this was a lot more common. 3dfx devs posting in the 3dfx groups, flight sim pilots and programmers posting, and all of this before the Ultima IX Fiasco too...
In my case, all crashes are purely due to overclocking. And yes, I found out that if I ever get a random CTD (no error or log) or exception breakpoint (log), if I keep playing, eventually I will get an "Internal Parity Error" on any one of the 16 threads (8 physical cores), although it can sometimes take hours to happen. Once I increase vcore or lower clockspeed to the point where these "Internal Parity Errors" NEVER happen, the game never crashes, ever.
Is it possible that the "Anticheat" is putting extra load on the CPU? Because I've never seen a game which will cause errors like this if there is any sort of instability anywhere, not even Battlefield 5. Apex Legends is indeed the gold standard for testing CPU stability.
That being said, the people who are crashing at *STOCK* speeds concern me greatly, as that should NOT be happening. Intel validates their chips to run up to 100C at the max single core turbo boost frequency and the max 4, 6 or 8 core turbo boost for that SKU. They should never crash, even with a power virus like Prime95 small FFT FMA3.
I want to ask EVERYONE here who is crashing at STOCK SPEEDS (NOT OVERCLOCKED):
1) Are you using Windows version 1809 or newer?
2) Is your CPU a 7700K or newer?
3) Do you have SPECTRE AND MELTDOWN mitigations enabled?
4) Is your BIOS updated and your CPU microcode current?
If the answer to all four is YES, can you please do the following?
Please go here, download this:
https://www.grc.com/inspectre.htm
then disable all protections.
Then run the game at your stock CPU settings and see if you crash anymore.
Please report back. This is important.
There were reports of early mitigation protections causing internal parity errors on several Intel SKUs *even at idle*, and it may be possible you guys are running into a microcode bug which is making the game crash.
The easiest way to determine if Intel's Meltdown protections are making the game crash is to simply disable them.
This has NO effect if you are using a Windows version which is not protection aware (like 1709 or older).
tl;dr: microcode bugs can cause strange problems. Intel has *PULLED* early microcodes for Spectre protection in the past because in SOME CASES they were actually *DESTROYING* CPUs--yes--DESTROYING them. (Prema, who is a BIOS modder over on notebookreview's forums, encountered this when testing early protection microcode. That's why some of you may remember those "rolled back" Intel microcode/BIOS patches and Windows updates.)
- 7 years ago
@OrioStorm I have a huge folder of Apex crash files as I continue to crash all of the time. It's the only game that crashes on my computer. Here is a selection of them:
PC Specs: 9900K
2080ti
32GB DDR4 3200mhz
Windows 10 Pro
Installed on Samsung 970 EVO nvme SSD
- 7 years ago
@OrioStorm If you need more, I can post more.
- 7 years ago
I just wanna say that I didn't get a single crash after changing back to default and just tweaking a few settings.
For the most part I was playing with V-sync off for no input lag, or I was trying adaptive 1/2.
With V-sync off and adaptive 1/2 the game was crashing very frequently.
I always thought the crashing was very weird because the friends I'm playing with never had crashes but I did, so what's the deal..
They would ask me like "seriously, now again??", so I went back to default, which they are using too, and now it's not crashing frequently anymore.
I would play with low settings and V-sync off for high FPS and minimal input lag. I went back to default and haven't touched much with lighting or LOD in settings.
Now I can play for hours with no problem, whereas before it would crash in almost every match..
If this is a "temporary fix" I think it's a big breakthrough!
I think the key here might actually be Triple Buffered for some reason.. My friends with default settings are also ofc all using this too..
And they never had these "technical issue" crashes.
- OrioStorm 7 years ago
EA Staff (Retired)
@Lunchb0x88, thanks for those crashes. I quickly went through them. They're all a little different from each other, but they're all in the same code function, and none of them should be possible with a properly functioning CPU.
4 out of the 9 are crashing at 2F2DCA. Of those, 1 says it hit a breakpoint, and 3 say they tried to write to address FFFFFFFFFFFFFF8C. The instruction there doesn't want to do either of those things.
3 out of the 9 are crashing at 2F2E2A. This is the most plausible crash; it might be explained by stack corruption. However, I've verified that this function can't corrupt the stack itself, and it only calls one other function. If that other function corrupted the stack, how did it not crash before returning here? That function is called from dozens if not hundreds of places. If it's corrupting the stack, why does it only crash here, and not any of those other places? So even this crash doesn't make sense.
1 out of the 9 is crashing at 2F2E9E while trying to write to memory address 1. The code never wants to execute an instruction at 2F2E9E, because that's in the middle of the instruction that starts at 2F2E99. The instruction at 2F2E99 is 8 bytes long, so 2F2E9E is 5 bytes into that instruction. If it did want to execute the middle of that instruction, it would crash this way with the registers in the log. But it should never want to crash this way.
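The arithmetic behind "5 bytes into that instruction" can be checked with a small sketch. The helper name `offset_into_instruction` is an assumption for illustration; the addresses and the 8-byte length are the ones quoted above.

```python
def offset_into_instruction(crash_addr, instr_start, instr_len):
    """Return how many bytes crash_addr lies past instr_start,
    or None if crash_addr is outside the instruction."""
    delta = crash_addr - instr_start
    return delta if 0 <= delta < instr_len else None


# Values from the crash report: instruction at 2F2E99, 8 bytes long.
delta = offset_into_instruction(0x2F2E9E, 0x2F2E99, 8)
print(delta)  # 5, so the crash address is mid-instruction, not at its start
```

Any nonzero result means the reported instruction pointer does not sit on an instruction boundary.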
1 out of the 9 is crashing at offset 2F20BB, loaded at address 0x00007FF6AC3520BB, while trying to execute memory at 0x00007FF600000000. Note that the instruction it's trying to execute is not the instruction it should be trying to execute; it is almost the right address, but the last half of the address has been forced to all zero.
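As a rough check of the numbers in that last crash, the sketch below derives the module's load address from the offset and the loaded address, then tests the "last half forced to zero" observation. The intended jump target is not given in the post, so this assumes it shared its upper 32 bits with the crashing instruction's own address; the helper names are hypothetical.

```python
MODULE_OFFSET = 0x2F20BB          # offset of the crashing instruction
LOADED_ADDR = 0x00007FF6AC3520BB  # where that instruction was loaded
BAD_TARGET = 0x00007FF600000000   # address the CPU tried to execute


def image_base(loaded_addr, module_offset):
    """Base address the executable was loaded at."""
    return loaded_addr - module_offset


def clear_low_32(addr):
    """Return addr with its low 32 bits forced to zero."""
    return addr & ~0xFFFFFFFF


print(hex(image_base(LOADED_ADDR, MODULE_OFFSET)))  # 0x7ff6ac060000
# Assumption: the real target lay in the same 4 GiB region as LOADED_ADDR.
print(clear_low_32(LOADED_ADDR) == BAD_TARGET)      # True
```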
This brings up an interesting point. The crash at 2F20BB was reported from the previous patch; all the other crashes are from the most recent patch. I saw the 2F20BB crash a bunch of times on the previous patch; I haven't seen it since then. On the other hand, all the other crashes people are getting with the latest patch are new; I never saw them on the previous patch.
Now, this function did change in this patch, but in a trivial way. If you know C++, all we did was we took a function that was an inline member function inside a struct, and we turned it into an inline function at global scope. The compiler should generate equivalent code for this change, but that's not guaranteed, and I haven't yet diffed the disassembly of the two patches to see whether it did.
So, our change shouldn't alter this function's behavior at all. And it didn't stop this particular function from crashing, but the nature of the crashes completely changed. Maybe the change to crash behavior is because the function is at a different offset in memory now? Now I'll have to go look at the old patch's disassembly and compare.
- OrioStorm 7 years ago
EA Staff (Retired)
@TEZZ0FIN0, do you know whether your friends have the same CPU? This crash appears to be Intel-specific, since none of the reports I've seen have been on an AMD CPU.
- OrioStorm 7 years ago
EA Staff (Retired)
@eXe_NIBIRU, that crash is the one where the CPU decides to execute an instruction starting in the middle instead of at the beginning (like one that @Lunchb0x88 just reported). A properly functioning CPU shouldn't be able to do that.
I'm not convinced that this problem is limited to overclocking, but it might be an exacerbating factor.
Overclocking can cause faults even without overheating. If you're overclocking, you're reducing the time the CPU is allowed to take for each step of each instruction, but the transistors still switch at the same speed as they execute each stage. If the clock period is ever shorter than the time a step needs, the next step of the instruction can start before the previous step's results are ready.
As an analogy, imagine you're doing a 10-digit addition problem, and somebody else is going to use your answer. Normally, they give you enough time, and you get all 10 digits before they look at it. However, if they're in a hurry (as in overclocked), they may look at your answer when you've only written down the first 8 digits. They don't see the problem you're working on, only your answer, so they use those 8 digits as if it was the correct answer, when it's not - it's just part of the answer. So anything they do with that number is likely to be wrong.
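The timing argument can also be put in numbers. This is a minimal sketch under stated assumptions: the 220 ps stage delay is a made-up value chosen for illustration, and the two frequencies are borrowed from an overclock mentioned elsewhere in this thread. Real CPUs have many stages with different margins.

```python
def clock_period_ps(freq_ghz):
    """Length of one clock cycle in picoseconds at the given frequency."""
    return 1000.0 / freq_ghz


def stage_meets_timing(stage_delay_ps, freq_ghz):
    """True if a pipeline stage settles before the next clock edge."""
    return stage_delay_ps <= clock_period_ps(freq_ghz)


# Hypothetical worst-case settle time for one stage (assumption).
STAGE_DELAY_PS = 220.0

for freq in (4.2, 4.8):
    ok = stage_meets_timing(STAGE_DELAY_PS, freq)
    print(f"{freq} GHz: period {clock_period_ps(freq):.1f} ps, stage fits: {ok}")
```

With these assumed numbers the stage fits inside the roughly 238 ps cycle at 4.2 GHz but not inside the roughly 208 ps cycle at 4.8 GHz, which is the "looked at the answer too early" case from the analogy.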
Overclocking and overheating can both cause rare CPU errors. You can get errors from overheating without overclocking, and you can get errors from overclocking without overheating. If you're both overclocking AND overheating, that's really pressing your luck!
Again, I don't know whether this is due to overclocking or not, but I do know that none of these crash reports seem possible with a properly functioning CPU.
- 7 years ago
I absolutely need everyone who is crashing at stock CPU speeds to disable Spectre and Meltdown protection as per my above post last page, and see if that helps stability.
If there is a bug in the Intel microcode, Intel needs to hear about it. (This is only for current updated versions of windows). I'm still seeing people post crash reports and 0 people attempting to do what I suggested just in case. As I said, Intel pulled older Kaby/Coffee series microcodes before because they were actually destroying chips...(9900k and 9700K were not released yet).
- 7 years ago
@OrioStorm, actually forget everything I said about Triple Buffering. It didn't work; it just didn't crash for a long time.
However I made an interesting discovery.
I have an i7 7700K with a Noctua cooler (runs very cool) and an MSI 1080 Ti Sea Hawk; these were top shelf parts a couple of years ago.
Now I did have some overclocking on the CPU, not a big increase, from 4.2 GHz to 4.8 GHz.
When I read here about the CPU problems, I reset my BIOS settings to default so it no longer runs an overclock.
Also I disabled Intel SpeedStep technology in BIOS.
I played all night without any crash now, longer than with the Triple Buffering theory I had previously. I can even play with V-sync off and whatever settings I like.
I'm pretty sure the game is fixed for me now because usually it would crash several times every hour.
This time I played for many many hours (all night actually) without even a single crash..
So I recommend everyone here, if they can, put your CPU speed back to default and disable Intel SpeedStep.
I will keep on playing all day tomorrow to make sure this in fact had something to do with the crashes.
Like I said, with the OC before and Intel SpeedStep on it would crash almost every match for me.
Now that's all gone it seems.
- 7 years ago
Tried disabling the meltdown/spectre fixes just to test and didn't change anything. Here is another crash log. Starting to crash a lot without a crash log which is getting quite annoying. Reset my bios, tried everything I could think of. My PC is brand new and I have never had an issue in other games. I even did a fresh install of Windows 10 with the latest 1809 update. It clearly has to be triggering something on the higher end Intel processors. Mainly the 7700k and 8700k. Haven't got the typical EXCEPTION_BREAKPOINT crash yet though.
- 7 years ago
Is anyone running iCue software by any chance? If you are, can you try turning it off to see if it helps?
- 7 years ago
Thanks for the answer.
The fact is that my CPU is brand new; I installed it 3-4 weeks ago, and I'm 100% sure that it's not overheating. I never had the bug before in 300-400 hours played with the same processor overclocked; I only had it once, just when update 1.1.1 came out. But maybe the overclock is causing trouble, yeah.
I know what you mean, actually I am a system administrator. However I only experienced the crash one time. After a clean reboot I was able to play for 4 hours without any crashes. As you said it must be rare.
Is it also possible that the autoexec command "cl_forcepreload" played a role in this? Mine is set to 0 at the moment (for the last 2 days). It was 1 before I experienced this crash.
Do you recommend a value for this command?
- 7 years ago
I have an idea what is causing these crashes.
Are you using the CPU's L3 cache in a nonstandard way?
I made two changes last night which fixed all crashes on an overclocked 9900K AND removed all "Internal Parity Errors". There were no errors in two hours of testing, but I still need to test this again today (it's 3 AM right now) to be 100% sure before I jump the gun.
And no, it was NOT CPU Vcore at all, and it might also explain why some people fixed the issue by using an AVX offset (Apex Legends does not use AVX).
- OrioStorm 7 years ago
EA Staff (Retired)
@eXe_NIBIRU, the game doesn't use or even create "cl_forcepreload", so my only recommendation is to not set it.
- OrioStorm 7 years ago
EA Staff (Retired)
@TEZZ0FIN0, thanks for the experimental results. It's interesting that disabling speedstep seems to have stopped the issue for you. You shouldn't HAVE to disable a CPU feature to get the CPU to work. If the CPU says it will go up to a certain frequency as long as the temperature stays in range, then that frequency should work.
@Falkentyne, the function that is crashing is really quite boring. It's just doing some simple math in a loop. There's nothing that stands out as strange or tricky compared to any other part of the code. If there was anything remotely suspicious in this code, I'd probably change it just to see if "stirring the pot" caused the problem to go away.
Which brings up another interesting point. This function is templated with two nearly-identical versions (one for shadows and one for cameras). The crash has only ever happened in the one for cameras. Why doesn't the other one ever crash? I'll have to compare their disassembly soon...
- 7 years ago
You said that it's doing a loop, right?
Aren't loops heavily cache intensive since it's pulling instructions repeatedly that have already been executed?
I did a test last night.
I set my 9900k to 5.2 ghz (Hyperthreading off), and I set my cpu voltage -extremely- high. 1.385v. High enough to be Prime95 AVX stable 1344K fixed FFT's.
(even did a 30 minute stress test, temps were reaching 90C but no problems, but not safe to test at those voltages).
That's what you would consider stable, right?
Nope. Apex crashed to desktop with no error.
Then I set it to 1.390v. Apex crashed with the "usual" 2DFA hex error (forgot exactly, but you know it) that many people are getting.
Ok something isn't right if I'm prime AVX 1344K stable but Apex is crashing.
So then I set it to 1.395v. This is getting in dangerous territory.
Guess what happened?
There were no crashes, but an "Internal Parity Error" was logged on CPU core #2 (going from cores 0 to 7).
So then I set the voltage down to 1.335v
Got a "memory can not be read" error (attached here).
I then increased CPU PLL Overvoltage (+mv)--(i don't even know what this setting does, has something to do with "clipping" the PLL so the CPU receives more than other devices that use the PLL) to +160mv. tested 1.335v again and got three Internal Parity Errors.
So if voltage isn't helping (Apex should NOT be crashing when Prime95 isn't crashing!) and you keep mentioning "loops", I realized it had to be SOMETHING else--possibly CACHE related.
Downclocking my RAM to 2133 mhz (3200 mhz CAS 14 gskill) didn't do anything.
So I increased VCCIO and VCCSA voltages to 1.25v and put the CPU voltage back at the unstable 1.335v.
And....
1 hour at 5.2 ghz HT off (1.335v): No crashes or parity errors.
So then I tried 5.1 ghz, HT on (1.340v). No errors.
So this means that the crashes are related to the L3 cache. Not the L2 cache. I know that the VCCIO voltage controls both the memory controller and the shared L3 cache. Since downclocking the RAM didn't help, it isn't the memory controller that is the issue. It also isn't the cache speed either, as I set my system earlier to 4.7 ghz core, 4.7 ghz cache, 1.230v (Loadline calibration=High), which would cause a clock watchdog timeout in Prime95 FMA3, but Apex ran just fine.
I'm guessing that at higher clockspeeds, the L3 cache needs to run faster because the core is running faster. So if Apex Legends is making heavy use of the cache (you mentioned looping instructions), then that means VCCIO needs to be increased, not CPU Vcore. I'm still testing this though. I think this can explain why other users are passing stress tests but Apex is crashing. As far as I know, many stress tests make HEAVY use of the L1 and L2 caches, which are directly tied to CPU Vcore. L3 cache isn't tied to CPU Vcore, but to VCCIO.
Stock VCCIO is 0.95v and stock VCCSA is 1.05v.
I think people can try increasing both VCCIO and VCCSA to 1.25v and see if their crashes stop.
- OrioStorm 7 years ago
EA Staff (Retired)
I looked at the diff of the disassembly of the shadow and camera versions of this templated function. For most of the function, the only difference in the disassembly is which registers the compiler assigned to various things. However, the shadow version always uses a specific LOD, so it skips a bunch of scalar floating point math (which is done on SSE hardware).
Also, the shadow version is actually only used for the really distant shadows, which only do partial updates when things like the drop ship are moving. Many frames we don't update them at all. Plus, the nature of the incremental updates means that it processes far fewer models. So, the shadow version does a lot less work, and does it less often. That reduced workload may be enough to explain why it never seems to crash.
- OrioStorm 7 years ago
EA Staff (Retired)
@Falkentyne, it is doing a loop, so the data will be coming through L3 cache. Some of the data never changes, and some is being produced in another thread.
However, the crashes are all related to instructions, not data. The L1 instruction cache is 32 KiB per core on an i9-9900K (backed by 256 KiB of L2 per core), which is more than enough to hold this function. So normally the whole function will be in the instruction cache after the first iteration. This crash always occurs after a different number of iterations, but the fewest I've seen is still 137 iterations.
Now, that doesn't mean the problem CAN'T be the cache. Windows is a preempting OS, so it can switch to another program and/or migrate this thread to another core. If either of those happens, it effectively flushes the L1 instruction cache, so it has to refill, which probably will go through L3 cache.
On the other hand, if it was the cache and not related to the actual instruction sequence, I would expect this bug to show up throughout the executable. Instead, they're always in this one function. So even if L3 cache is a factor, it seems like it may just affect timing, which also appears to be a factor. Also, if it was cache related, I'd expect
- OrioStorm 7 years ago
EA Staff (Retired)
Sorry, my phone scrolled to put the submit button where the keyboard used to be.
Anyway, I'd expect the function to tend to crash on the first time through the loop if it was L3 cache related, since that's when the instructions are coming through L3.
I really appreciate these experiments @Falkentyne, they're very helpful and informative!
- 7 years ago
I don't know if this helps, but if the core voltage is far too low, Apex will sometimes throw a WHEA-logged correctable "CPU TLB" ("Translation Lookaside Buffer") error. (The game doesn't crash at the same time this error happens, however.)
What would cause Apex Legends to throw out this error?
This is all I can find:
https://en.wikipedia.org/wiki/Translation_lookaside_buffer
I really wish i had learned programming. What Apex is doing is very interesting.
Everyone thought it was AVX instructions but it seems to be SSE2 or other things.
- OrioStorm 7 years ago
EA Staff (Retired)
@Falkentyne, this crash is interesting. We released 1.1.1 yesterday, and this crash is with that newer executable. Looking at the disassembly, this function is identical between 1.1.0 and 1.1.1; not even the registers got shuffled, and it's even at the same offset in the executable.
However, the crash is a new crash. We haven't seen that offset before with 1.1.0.
Also, the address implicated in the crash dump is not the start of an instruction. The instruction starts one byte earlier.
The incorrect instruction at 2F2DD9 actually just skips a prefix that turns a vector SSE instruction into a scalar SSE instruction. In this case, it would be mulss if it started at 2F2DD8, but because it skipped the first byte of the instruction, it turns into mulps. So it will do a vector multiply instead of a scalar multiply. When it does that, it requires 16-byte alignment instead of 4-byte alignment for memory reads. The address it's reading has 4-byte alignment but not 16, so doing mulps will crash due to unaligned access, whereas mulss (which we wanted) would work.
For reasons I don't know, I've always seen unaligned memory accesses in SSE instructions get reported as read or write access violations of memory location FFFFFFFFFFFFFFFF. It doesn't matter what memory it tried to read or write, it always reports it at that other address.
So it looks like the instruction pointer got off-by-one in this crash, which ended up causing an unaligned memory read of a valid address, which gets reported as a memory read of an invalid address. But the hardware bug was the off-by-one in the instruction pointer register, RIP.
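The prefix-skipping effect described above can be illustrated with a tiny decoder sketch. It relies only on the documented x86 encodings (`F3 0F 59` for `mulss`, `0F 59` for `mulps`); the example byte sequence, the operand address, and the helper names are assumptions for illustration, not the bytes or addresses from the game.

```python
def decode_mul(code):
    """Classify a byte sequence as mulss or mulps and return
    (mnemonic, required_alignment_in_bytes), or None if neither."""
    if len(code) >= 3 and code[0] == 0xF3 and code[1:3] == b"\x0f\x59":
        return ("mulss", 4)   # scalar: reads a single 32-bit float
    if len(code) >= 2 and code[0:2] == b"\x0f\x59":
        return ("mulps", 16)  # packed: reads 16 bytes, must be 16-byte aligned
    return None


def is_aligned(addr, alignment):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


# Hypothetical encoding of "mulss xmm0, [rax]" (assumption, not the game's bytes).
CODE = bytes([0xF3, 0x0F, 0x59, 0x00])
# Hypothetical operand address: 4-byte aligned but not 16-byte aligned.
OPERAND_ADDR = 0x10000004

for start in (0, 1):  # start at the instruction, then one byte late
    mnemonic, alignment = decode_mul(CODE[start:])
    print(mnemonic, "ok" if is_aligned(OPERAND_ADDR, alignment) else "faults")
```

Starting one byte late drops the `F3` prefix, so the same remaining bytes decode as the packed form, and an operand that was fine for the scalar read no longer satisfies the alignment requirement.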
- 7 years ago
@OrioStorm Attached my crash log. Running an i9-9900K @5.0GHz
Voltages:
VCore - 1.35v
VCCIO - 1.328v
VCCSA - 1.264v
So I don't think voltages are the issue as @Falkentyne suspects.
Have an ASUS Maximus XI Hero motherboard, will try stock BIOS settings tonight.