High GPU stress causing black screen crash

  • RufusInPhilly
    PCHF Member
    • Feb 2022
    • 15

    #1

    I’ve been trying to troubleshoot this issue on my own, but I’m at my wits’ end and hoping someone here can help. Fingers crossed!

    I built my rig about 9 months ago, and things have mostly gone smoothly … except for very occasional crashes to a black screen when running high-performance games. In the past, I’d uninstall & update drivers, physically reinstall the GPU, reboot, and the problem would mysteriously seem to go away for a while. But in the last week, it has been happening more and more, and now nothing seems to fix it.

    Most recently, I’ve removed the GPU, completely uninstalled the drivers with DDU, reinstalled the card, updated to the current Nvidia drivers, and run Windows Update.

    I can reproduce the issue by either running high-end games (Cyberpunk, Red Dead 2, AC Odyssey) or by stressing the GPU with 3DMark or FurMark. When this happens, the screen goes black, all the fans ramp up, and everything freezes until I hard reboot, at which point the system runs fine … until the graphics are stressed again.

    Other things to note:

    I’ve run HWiNFO64 to look at temps, and nothing has ever jumped off the scales; it’s a well-ventilated system. I’ve been very meticulous in building the machine; everything is connected well, it’s clean, dust-free, and every component was brand new when I put it together.

    My first thought was that this could be a power issue, but I think my 850-watt gold PSU is enough, especially since I’m actually not overclocking my i9-10850K at the moment. (Maybe not though?)
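As a rough sanity check on the PSU question, the sketch below adds up nominal power figures for the main parts. Every wattage in it is an assumption (approximate published limits plus a guess for the rest of the system), not a measurement from this machine, and it says nothing about short transient spikes or a faulty unit.

```python
# Rough steady-state power budget. All wattages are assumed, nominal figures.
ASSUMED_DRAW_WATTS = {
    "i9-10850K (short-term turbo limit, assumed)": 250,
    "RTX 3070 (reference board power, assumed)": 220,
    "motherboard, RAM, drives, fans, pump (rough guess)": 75,
}

PSU_RATED_WATTS = 850  # from the parts list in this post


def headroom(psu_watts, draws):
    """Return (total estimated draw, fraction of the PSU rating used)."""
    total = sum(draws.values())
    return total, total / psu_watts


total, load = headroom(PSU_RATED_WATTS, ASSUMED_DRAW_WATTS)
print(f"Estimated sustained draw: {total} W ({load:.0%} of {PSU_RATED_WATTS} W)")
```

Under these assumptions the rating looks sufficient on paper, which is consistent with the guess above, but it cannot rule out a unit that misbehaves under sudden load changes.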

    Looking at the Event Viewer, it seems the error “Faulting application name: dwm.exe…” occurs several times right before every crash. Googling that error message led me here to this forum. So here I am! Does anyone have any ideas? Components are listed below along with a Speccy snapshot and MiniToolBox file from right after the last crash. Thanks in advance!
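To see which programs show up most often in errors like the one quoted above, a small sketch can tally the "Faulting application name" entries in an exported log. The sample text is made up for illustration, and the exact layout of a real Event Viewer export or MiniToolBox log is an assumption here.

```python
import re
from collections import Counter

# Matches the "Faulting application name: <exe>" text quoted in the post.
FAULT_RE = re.compile(r"Faulting application name:\s*([^,\s]+)", re.IGNORECASE)


def count_faulting_apps(log_text):
    """Count how often each executable appears as the faulting application."""
    return Counter(name.lower() for name in FAULT_RE.findall(log_text))


# Made-up sample lines, for illustration only (not taken from the real log).
sample_log = """\
Description: Faulting application name: dwm.exe, version: 0.0.0.0
Description: Faulting application name: dwm.exe, version: 0.0.0.0
Description: Faulting application name: example.exe, version: 0.0.0.0
"""

counts = count_faulting_apps(sample_log)
print(counts.most_common())
```

A tally dominated by dwm.exe would only show that the desktop compositor is the process that falls over when the graphics stack fails; it would not by itself say whether the cause is software or hardware.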

    i9-10850K
    ZOTAC GeForce RTX 3070
    ASUS Prime Z490-A
    Corsair Vengeance RGB PRO - 32GB DDR4 3600
    Corsair RMX Series, 850 Watt, 80+ Gold

    Samsung 970 Evo - 500GB M.2/PCIe (for the OS)
    Mushkin Enhanced Pilot-E - 2TB 2280 M.2/PCIe
    Intel 660p M.2 2280 - 1TB PCIe 3.0 x4
    Crucial BX500 - 2TB 3D SATA

    Cooler Master MasterLiquid ML240L RGB v2
    Cooler Master SickleFlow 120mm V2 ARGB

  • phillpower2
    PCHF Administrator
    • Sep 2016
    • 15205

    #2
    As a starting point;

    You have the wrong RAM for your CPU. Intel states here that it supports up to 2933MHz, and if you have XMP enabled the RAM will get auto OCd past what the CPU can handle and the PC starts to have issues.

    Go into the BIOS, disable XMP and then manually set the RAM to run at 2933MHz and the voltage to 1.35V.
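When checking that the new setting took effect, keep in mind that some monitoring tools report the actual memory clock rather than the rated speed. DDR memory transfers data twice per clock, so the displayed figure may be half of the number set in the BIOS. A minimal sketch of the conversion (the function name is just for illustration):

```python
def dram_clock_mhz(rated_speed):
    """DDR moves data twice per clock, so the clock is half the rated speed."""
    return rated_speed / 2


for rated in (3600, 2933):
    print(f"DDR4-{rated}: about {dram_clock_mhz(rated):.1f} MHz actual clock")
```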

    Power Profile
    Active power scheme: High performance

    Change the Windows Power Plan to Balanced. Ultimate and High Performance are a form of overclocking that is known to cause stability and overheating issues; the setting should only be used for gaming-type notebooks that have a discrete GPU that needs the extra power.

    Make the above changes, fully shut down the computer, restart and test by using the computer as you normally would, then post back with an update for us.

    • RufusInPhilly
      PCHF Member
      • Feb 2022
      • 15

      #3
      phillpower2, thank you sooo much! I cannot believe I missed this somehow when I was picking the parts for my build.

      And yes, I believe I did enable XMP when I originally set things up because I wanted to get the speeds up to 3600MHz (even though that’s too fast for my mobo, apparently). I’m going to do both of these suggestions this evening and run a few more stress tests, and I’ll report back afterward. Thank you again!

      • phillpower2
        PCHF Administrator
        • Sep 2016
        • 15205

        #4
        Originally posted by RufusInPhilly
        I believe I did enable XMP when I originally set things up because I wanted to get the speeds up to 3600MHz (even though that’s too fast for my mobo, apparently)
        Not too fast for the MB, but for the CPU, and unless you closely read the MB specs before making the purchase you would not know. The board specs say it is compatible with RAM up to 4800MHz (OC) but then go on to say the below;
        • 10th Gen Intel® Core™ i9/i7 CPUs support 2933/2800/2666/2400/2133 natively. Refer to www.asus.com for the Memory QVL (Qualified Vendors Lists).

        Not quite sure how they got to the 4800MHz (OC), because as far as I am aware the fastest RAM speed that any Intel CPU can handle is 3200MHz, on an 11th gen i9.

        You are welcome btw

        • RufusInPhilly
          PCHF Member
          • Feb 2022
          • 15

          #5
          Originally posted by phillpower2
          Not too fast for the MB but the CPU and unless you closely read the MB specs before making the purchase you would not know
          Ah ok, well that would explain why I probably went ahead and got the 3600 RAM when I was collecting parts. I have a feeling I would have caught a pretty obvious problem like my mobo not being able to handle that speed, but I might not have read the super fine print.

          Anywayz…

          Unfortunately, the changes you suggested didn’t fix the black screen crash. I actually first set my BIOS to Default so there were no other possible tweaks getting in the way, then I rebooted and went back to set it to Manual, 2933MHz, and 1.35 volts. I also changed the active power scheme to Balanced. The crashes happened several times in a row when I was running a 3DMark test. What do you think I should try now? (I updated the MTB report and Speccy snapshot.) Thanks!

          • phillpower2
            PCHF Administrator
            • Sep 2016
            • 15205

            #6
            We need to leave Speccy for now.
            Originally posted by RufusInPhilly
            Unfortunately, the changes you suggested didn’t fix the black screen crash.
            Software such as Windows can crash, and when it does you get a BSOD and, when enabled, a crash dmp is generated. Programs or games can on occasion crash to the desktop, but the computer will still be 100% functional.

            Hardware problems such as a weak power supply and/or overheating are not software related. When a computer suddenly turns off, freezes, or the screen goes black, the behaviour should be described as the “computer shut down unexpectedly” or “froze”, not as having crashed, because “crashed” implies a software issue rather than an obvious hardware issue.

            Having the correct info means that helpers will not be looking for a software issue when the problem is clearly hardware related.

            What happened exactly, and how were you able to get back into Windows?

            What is suggested;

            To rule out any possible bad settings, restore the MB’s default factory settings in the BIOS, save the settings, exit the BIOS, restart the PC, then do the below for us;

            Download MiniToolBox and save the file to the Desktop.

            Close the browser and run the tool, then check the following options;

            List last 10 Event Viewer Errors
            List Installed Programs
            List Devices (Only Problems)
            List Users, Partitions and Memory size

            Click on Go.

            Post the resulting log in your next reply for us if you will.

            • RufusInPhilly
              PCHF Member
              • Feb 2022
              • 15

              #7
              Oh that’s odd. I had attached the MTB file to my first and most recent post, but for some reason it didn’t stick. Weird.

              • RufusInPhilly
                PCHF Member
                • Feb 2022
                • 15

                #8
                Sorry if this is my fault here, but it doesn’t appear that the MTB file shows up in the post even though I’m attaching it to my reply. And strangely, when I try to cut & paste the text in the message box itself, I get the message “Oops! We ran into some problems. Please try again later. More error details may be in the browser console.” Do I need privileges to either post attachments or make long posts?

                • RufusInPhilly
                  PCHF Member
                  • Feb 2022
                  • 15

                  #9
                  Well, I uploaded it to my Google Docs account, so I’m hosting it and you can see it here. Thanks!

                  • phillpower2
                    PCHF Administrator
                    • Sep 2016
                    • 15205

                    #10
                    Originally posted by phillpower2
                    What happened exactly and how were you able to get back into Windows.
                    Not sure how you missed answering the above.

                    Not sharing the doc I’m afraid.

                    • RufusInPhilly
                      PCHF Member
                      • Feb 2022
                      • 15

                      #11
                      Oh I’m sorry about that! I guess I didn’t quite understand what you were asking. Given what’s been happening with my system, I suppose I should just say the “computer shuts down unexpectedly” as opposed to “crashes” since I’m not quite sure of the culprit yet.

                      Here’s exactly what happens: When I play a high-performance game or I test the GPU using Furmark or 3DMark, the screen will go completely black and the fans spin at their maximum. It will stay like this until I hard reboot it by pressing the power button. But once it reboots, the system appears to be completely normal; it boots into Windows 11 just fine, and everything seems to work … that is, until the GPU is stressed again (with a game or GPU benchmark tool).

                      As I said in my original post, this used to happen infrequently when I built it months ago, but now it happens every time. The temps of all the components seem to be within the normal range according to HWiNFO, and the GPU never gets above ~68 C.

                      Hope that makes sense and answers your question better. Here is the MiniToolBox report, which I ran after I completely reset the BIOS and then manually changed the RAM speed and voltage as you suggested earlier.

                      Thanks in advance for the help! Apologies again for the confusion.

                      • phillpower2
                        PCHF Administrator
                        • Sep 2016
                        • 15205

                        #12
                        Classic signs of something overheating, and looking at the MTB log there is nothing to suggest otherwise, as in there are no problem devices listed and the Windows errors that we can see would not cause a black screen and sudden shutdown.

                        Have you made sure that crash dmps are enabled on the computer? See info here for configuring small memory dmps.

                        We need to check the temps and voltages;

                        Download Speedfan and install it. Once it’s installed, run the program and post here the information it shows. The information I want you to post is the stuff that is circled in the example picture I have attached but don’t worry if it does not display the same.

                        So that we have a comparison to Speedfan, download, run and grab a screenshot of HWMonitor (free).

                        To capture and post a screenshot;

                        Press the ALT key + PRT SCR key (it’s on the top row, right-hand side). Now click on Start, All Programs, Accessories, Paint. Left-click in the white area, press CTRL + V, click on File, click on Save, save it to your desktop, and name it something related to the screen you’re capturing. BE SURE TO SAVE IT AS A .JPG, otherwise it may be too big to upload. After typing in any response you have, click on Upload a File to add the screenshot.

                        Screenshot instructions are provided to assist those that may read this topic but are not yet aware of the “how to”.

                        • RufusInPhilly
                          PCHF Member
                          • Feb 2022
                          • 15

                          #13
                          Excellent! I’m going to have to get to this at the end of my day (which might be the next one for you), but I’ll do everything you suggested here and get back with a full report. Thank you again so much, Phill!

                          • phillpower2
                            PCHF Administrator
                            • Sep 2016
                            • 15205

                            #14
                            You are welcome and all in your own time (y)

                            • RufusInPhilly
                              PCHF Member
                              • Feb 2022
                              • 15

                              #15
                              Originally posted by phillpower2
                              Have you made sure that crash dmps are enabled on the computer? See info here for configuring small memory dmps.
                              It looks like these crashes aren’t actually creating dump files. Windows is configured correctly to do this, and there are a few dump files from several weeks/months ago. But none are being written as a result of the black screen shutdowns I’ve been having recently. Could this be further evidence that it’s likely a hardware problem?
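One way to double-check that no new dump was written is to list the dump files with their modification times and compare them against the time of the last black screen. The sketch below assumes the default small-dump folder (`Minidump` under the Windows directory); reading that folder may need an elevated prompt, and the function name is just for illustration.

```python
import os
from datetime import datetime
from pathlib import Path


def list_dumps(dump_dir):
    """Return (file name, modified time) for each .dmp file, newest first."""
    folder = Path(dump_dir)
    if not folder.is_dir():
        return []
    dumps = [(p.name, datetime.fromtimestamp(p.stat().st_mtime))
             for p in folder.glob("*.dmp")]
    return sorted(dumps, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumed default location for small memory dumps; adjust if it was changed.
    system_root = os.environ.get("SystemRoot", r"C:\Windows")
    for name, modified in list_dumps(Path(system_root) / "Minidump"):
        print(f"{modified:%Y-%m-%d %H:%M}  {name}")
```

If the newest entry is weeks older than the most recent shutdown, that matches the observation above that Windows never got the chance to write a dump.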
                              Originally posted by phillpower2
                              Download Speedfan and install it. Once it’s installed, run the program and post here the information it shows. The information I want you to post is the stuff that is circled in the example picture I have attached but don’t worry if it does not display the same.
                              SpeedFan seems to be missing the fan speed section and the voltage section. I’m not sure why though, especially since it looks like it’s recognizing all of the components in my system. (According to the text box right above “CPU usage”, it’s discovered all the hardware.)

                              [Attachment: Speedfan.jpg]
                              Originally posted by phillpower2
                              So that we have a comparison to Speedfan, download, run and grab a screenshot of HWMonitor (free).
                              The HWMonitor shows everything, I believe. (Here’s a text file of the exported monitoring data in case there are other #'s you need.) I’m definitely not the expert here, but I think those numbers look more or less okay, no? I wish I could get these readings when I run a GPU stress test, but it shuts down pretty quickly before I can tell how much the system is being taxed.

                              [Attachments: HWMonitor 1.jpg, HWMonitor 2.jpg]

                              I can tell you, however, that several months ago when things were working smoothly, the temp readings of the major components seemed pretty normal. I used HWiNFO64 when I was testing the stability of the system. Nothing went off the charts, the GPU got to about 70 C, and then fans would kick in and cool things down. When the system wasn’t doing much besides running a browser (basically as it is in these screenshots), the GPU would hover between ~40-50 C. But alas, that was back when it was working well and I could monitor things while pushing the performance.
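Since the on-screen readings vanish the moment the machine locks up, one option is to log samples to a file and force each row to disk as it is written, so the last line before the black screen survives the hard reboot. The sketch below is only that, a sketch: it assumes the `nvidia-smi` tool that ships with the Nvidia driver is on the PATH, and the file name and function names are made up for illustration.

```python
import csv
import os
import subprocess
import time


def read_gpu_sample():
    """Ask nvidia-smi for temperature (C), power draw (W), and GPU load (%)."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=temperature.gpu,power.draw,utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [field.strip() for field in out.splitlines()[0].split(",")]


def log_samples(path, read_sample, count, interval_s=1.0):
    """Append timestamped samples, forcing each row to disk immediately."""
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for _ in range(count):
            writer.writerow([time.strftime("%Y-%m-%d %H:%M:%S"), *read_sample()])
            handle.flush()             # push Python's buffer to the OS
            os.fsync(handle.fileno())  # ask the OS to write it to the drive
            time.sleep(interval_s)


# Example use: start this first, then launch the stress test.
# log_samples("gpu_log.csv", read_gpu_sample, count=3600)
```

After the reboot, the last few rows of the file would show the temperature and power draw just before the shutdown, which is the information that was missing from the idle screenshots.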

                              Let me know what you think!
