Forum Discussion
Hi, I'm an engineer at Respawn. Thank you for posting this log!
This particular crash is in MSVCRT, which is Microsoft's Visual C++ runtime DLL. Microsoft's DLL crashed when Apex was trying to exit normally. This crash happened inside sapi_onecore, which is part of Microsoft's Speech API.
Unfortunately, there's not much we can do about this crash in third-party software.
It looks like it only happened when you were already exiting, so I hope that it doesn't negatively impact your experience enjoying Apex.
Hello, thank you for taking time to help out on the forum.
Right before my game crashes I see a big spike in commit charge (see image). I thought the crash had something to do with memory because of this. Or would the spike be an effect of the crash?
Also, do any of you feel that the crashes have gotten more and more frequent with the last two patches? Would this be because of the lifted FPS cap?
- OrioStorm, 7 years ago
EA Staff (Retired)
@FatSpacePanda, I don't know for sure, but I speculate that the spike is probably when the OS comes in and starts to handle the crash.
- OrioStorm, 7 years ago
EA Staff (Retired)
I want to thank everybody in this thread for their help on this issue, especially @Falkentyne, @JorPorCorTTV, @MrDakk, and @TEZZ0FIN0.
Based on all the logs and investigation we've done, I'm now convinced that this is a flaw with Intel chips. There is some sequence of instructions that causes results to be used before they're ready on Intel chips. There is one function in Apex that executes this sequence. It doesn't have to be a complex, CPU-intensive program to expose flaws like this; they could even show up in Notepad! I know that we have other functions that are heavily optimized that don't crash; this function that crashes so much is hardly optimized because it's so simple.
It seems that overclocking Intel chips can cause this to happen pretty reliably. It also seems that this can happen when Intel SpeedStep technology raises the clock speed without actually overclocking at all. In all cases, it seems that lowering your clock speed causes the crashes in this code to stop.
So, I base my conclusion that it is an Intel hardware flaw on the experimental evidence that (1) the information about the CPU state that the OS reports for these crashes is always impossible for a properly functioning CPU, (2) these crashes are only reported on Intel chips, and (3) lowering the clock speed always fixes these crashes (even if it wasn't overclocked).
These crashes are exceedingly rare from a CPU perspective. If you crash every other game, that feels like a lot, because it is. My personal goal is nobody crashing ever! However, say that crashing every other game translates into crashing about once every 20 minutes. The CPU runs the instructions that crash at least 100,000 times a second when you're playing the game. With these conservative estimates, the CPU crashes about once every 120 million times it runs this code. So, even though it truly does crash a lot, the conservative estimate is that even a malfunctioning CPU actually functions properly in this code about 99.999999% of the time.
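As a quick sanity check on the arithmetic above, here is a minimal Python sketch. The 20-minute interval and the 100,000 calls per second are the estimates from the post; the function and variable names are only for illustration.

```python
def calls_between_crashes(minutes_between_crashes, calls_per_second):
    """Number of times the code runs between two crashes."""
    return minutes_between_crashes * 60 * calls_per_second


def success_rate(calls):
    """Fraction of runs that do not crash, assuming one crash per `calls` runs."""
    return 1 - 1 / calls


# Estimates from the post: one crash every 20 minutes, 100,000 calls per second.
calls = calls_between_crashes(20, 100_000)
rate = success_rate(calls)

print(f"{calls:,} calls per crash")   # 120,000,000 calls per crash
print(f"{rate:.7%} of calls succeed")  # 99.9999992% of calls succeed
```

So "about 99.999999%" is, if anything, slightly on the low side for these inputs.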
So, what next?
Well, I tried to isolate the crashing function to make a standalone program to exhibit the CPU bug. Unfortunately, I couldn't get the compiler to generate the exact same disassembly. Since the crash appears to depend on the exact sequence of instructions the CPU runs, if the disassembly is not equivalent, I don't think it's a good test. Even if I had identical instructions, I don't know that I could reproduce a data set that would cause this function to crash. The real data set depends on where you are in Kings Canyon and which way you are looking. I can't replicate all the code and data to generate this data set in a standalone test program, so I'd have to generate random data and hope it crashes. But we've never seen this crash on any of our work machines, so I don't really have any way to verify that I've come up with a crashing data set.
All this to say, I've decided that it takes too much time to make a standalone program to try to repro this bug. The program has a low chance of succeeding, and I can't locally test whether it succeeds or not. Even if we got lucky and it happened to work, it would only help highly technical people dial in their clock speeds, and it might help Intel fix their hardware flaw.
But, it wouldn't help everybody else who has an Intel chip who keeps crashing and doesn't post here and who doesn't want to mess with their clock speeds. I want to help them too! And as a consumer, I know that when you're crashing you don't really care whether it's the CPU's fault or the video card driver's fault or the game's fault; you just want the game to work.
So, I'm going to try changing the function that has all these impossible crashes to do the same job in a slightly different way, and hope that nudges it out of the "sweet spot" that causes some Intel chips to crash every few matches. Until we verify the fix, I'll leave in the old way as a hidden option, so we can be sure that we don't accidentally make things worse with no quick way to go back to the way things were. This is still a shot in the dark, because we can't repro the crashes on any of our machines, but it's the best shot I can take based on all the evidence in this thread.
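To illustrate the general idea of shipping a rewritten function while keeping the old one as a hidden fallback, here is a minimal Python sketch. It is not the actual Apex code: the clamp functions and the `APP_USE_OLD_CLAMP` setting are hypothetical stand-ins, chosen only because they do the same job in two slightly different ways.

```python
import os


def clamp_old(x, lo, hi):
    # Original path: explicit branches.
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


def clamp_new(x, lo, hi):
    # Rewritten path: same result, reached through min/max instead.
    return max(lo, min(x, hi))


def clamp(x, lo, hi, env=os.environ):
    # Hypothetical hidden option: setting APP_USE_OLD_CLAMP=1 switches
    # back to the original path without shipping a new build.
    if env.get("APP_USE_OLD_CLAMP") == "1":
        return clamp_old(x, lo, hi)
    return clamp_new(x, lo, hi)
```

Both paths return the same values, so the toggle only changes which instructions run, which is the point of keeping a quick way back if the new path turns out worse.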
Unfortunately, I don't know when these changes will go live through our regular release schedule.
- 7 years ago
@OrioStorm Thank you very much!
I knew it had to be a bug. If you look at one of my earlier posts, I guessed it was some sort of internal bug.
I guessed this because, as you may remember, I mentioned the "internal parity error", right?
That error is the "bug" happening and then being corrected by error correction (ECC).
If you decrease the voltage even lower, you get a "Translation Lookaside Buffer" error (instead of just internal parity errors), so these bugs are happening somewhere possibly in the TLB area (all CPUs have them), in an instruction register, but NOT in the "L0" cache (L0 cache is basically some sort of register also, almost like a bridge between the cores and L1 cache).
The only way we can fix this is to contact Intel.
They will need to release YET ANOTHER microcode patch which can fix this bug.
Can you please document your findings, if possible, and see if you can either contact Intel, call their 800 number, or post on one of the Linux / Debian developer threads so that this bug can be sent to Intel?
This is NOT in any way similar to the Pentium "FDIV" bug / F00F bug, as that was an instant, 100% guaranteed crash. Maybe it's more similar to that Skylake FFT bug, where certain prime number FFT sizes would crash the processor (this was fixed in a microcode update).
I'm just a gamer so I have no access to Intel. But since you are a developer, you may be able to reach them. Reference the crash threads on these forums and hopefully this will be escalated to a "high priority" bug, since SOME users with stock CPUs are encountering this also.
These are the only links I can give you to help get this addressed by Intel
Their toll-free tech number of course (you will have to somehow reach their programming department, good luck with that)
https://www.intel.com/content/www/us/en/support/contact-support.html
(Intel):408 765-8080
(and some 800 number but there seems to be a business to business relations link only)
https://github.com/platomav/CPUMicrocodes
https://downloadcenter.intel.com/download/28727/Linux-Processor-Microcode-Data-File
*Edit* a number that works:
(916) 377-7000
- FatSpacePanda, 7 years ago (New Novice)
Okay, thank you. I first thought that the two were related since it was so much data (maybe the CPU got other instructions at the same time Apex had something that otherwise would be a micro stutter?). I don't know, I'm way out of my depth.
- 7 years ago
@OrioStorm Thanks for the detailed summary!
It may also be worth noting that you have to increase the CPU voltage (vCore) to avoid the following non-crash error:
Event ID 19
WHEA-Logger
A corrected hardware error has occurred.
Reported by component: Processor Core
Error Source: Corrected Machine Check
Error Type: Internal parity error
Processor APIC ID: 0
For me, a stable vCore was 1.28 V for the i9-9900K; with vDroop on my motherboard this gave me a consistent 1.25 V to 1.26 V vCore while gaming. If it drops to 1.21 V to 1.23 V, I get the CPU parity errors above while I play.