High GPU stress causing black screen crash

  • phillpower2
    PCHF Administrator
    • Sep 2016
    • 15205

    #16
    As per my replies #6 and #12, Rufus:
    Originally posted by phillpower2
    Software such as Windows can crash, and when it does you get a BSOD and, when enabled, a crash dump is generated. Programs or games that crash can on occasion close to the desktop, but the computer will still be 100% functional.

    Hardware failures such as a weak power supply and/or overheating are not software related. When a computer suddenly turns off, freezes or the screen goes black, the behaviour should be described as the “computer shut down unexpectedly” or “froze”, not as having crashed, because “crashed” implies a software issue when a proper description would point to an obvious hardware issue.

    Having the correct info means that helpers will not be looking for a software issue when the problem is clearly hardware related.
    Originally posted by phillpower2
    Classic signs of something overheating, and looking at the MTB log there is nothing to suggest otherwise: there are no problem devices listed, and the Windows errors that we can see would not cause a black screen and sudden shutdown.
    No worries about Speedfan, we just get what we can from it, which is why I said don't worry if it does not display the same. HWMonitor still gets updated whereas Speedfan does not, I'm afraid.

    Could be something and nothing but HWMonitor shows that the maximum speed of the CPU cooling fan is/was 663rpm, check the maximum CPU fan speed setting in the BIOS along with the thermal shutdown setting.

    Everything else in both programs looks ok.


    • RufusInPhilly
      PCHF Member
      • Feb 2022
      • 15

      #17
      Originally posted by phillpower2
      Could be something and nothing but HWMonitor shows that the maximum speed of the CPU cooling fan is/was 663rpm, check the maximum CPU fan speed setting in the BIOS along with the thermal shutdown setting.
      I think the max CPU fan was low in the last post because I had basically just powered the system up and was only using Chrome for a few minutes. But here’s a snapshot of the system after doing several more demanding tasks. I’m pretty sure the fans get going once there’s a demand.

      [ATTACH type=“full”]9060[/ATTACH]

      Anyway, so what you’re saying is that it’s probably something overheating (as opposed to a weak power supply), correct? And considering this only happens when the GPU is stressed, is it safe to say the video card is likely the problem? I didn’t mention this before, but I’ve run both CPU and RAM stress tests with Cinebench & MemTest86 several times, and there were never any obvious issues.

      Also, is there a way to drill this down further with other diagnostic tools/methods? Obviously, I’m not able to monitor the temps or voltages at the moment since the system will always shut down when the GPU is pushed. But could I, for example, under-volt the GPU and see if there’s a breaking point perhaps? (I’m probably out of my depth here, btw.)
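One way to keep readings from the moments before a shutdown, offered as a minimal sketch and not something that was tried in this thread: have `nvidia-smi` print a sample every second and force each line to disk as it arrives, so the last line in the log most likely shows the GPU temperature and power draw just before the screen went black. The file name `gpu_log.csv`, the helper names `parse_sample` and `log_samples`, and the choice of fields are assumptions made for illustration.

```python
import os
import subprocess

# Fields requested from nvidia-smi (listed by `nvidia-smi --help-query-gpu`).
FIELDS = ["timestamp", "temperature.gpu", "power.draw", "clocks.gr", "fan.speed"]


def parse_sample(line):
    """Split one CSV line from nvidia-smi into a dict keyed by field name."""
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}: {line!r}")
    return dict(zip(FIELDS, values))


def log_samples(lines, path):
    """Append each sample to `path`, forcing it to disk before reading the next."""
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            if not line.strip():
                continue  # skip blank lines
            sample = parse_sample(line)
            out.write(",".join(sample[f] for f in FIELDS) + "\n")
            out.flush()
            os.fsync(out.fileno())  # ask the OS to write now, in case power drops
            count += 1
    return count


def main():
    # Hypothetical log location; one sample per second until the process stops.
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
        "-l", "1",
    ]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        log_samples(proc.stdout, "gpu_log.csv")


if __name__ == "__main__":
    main()
```

Start the script, then launch the stress test. After the machine comes back up, the final rows of `gpu_log.csv` give the last readings that reached the disk.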

      Last thing: I have this old GTX 1660 from my previous build, and as of two years ago, it worked just fine. If I swapped that into the current system and everything ran well during 3DMark & FurMark tests, would that almost certainly mean my current RTX 3070 from Zotac is faulty? Maybe I’m just being optimistic, but I feel like defective hardware is a lot less common than people think. I dunno, perhaps I should face that possibility … even though I got it brand new directly from Zotac, and it hasn’t really gotten a ton of use.


      • phillpower2
        PCHF Administrator
        • Sep 2016
        • 15205

        #18
        CPU fan speeds ruled out, but did you also check what the thermal shutdown setting was in the BIOS?

        Swapping in the GTX 1660 is an idea but being that it draws a lot less power than the RTX 3070 any outcome would not be conclusive, try it and see how it goes in any event.
        Originally posted by RufusInPhilly
        Also, is there a way to drill this down further with other diagnostic tools/methods?
        We are nearly out of options and you are fast approaching the need for a local tech to do a proper bench test.
        Originally posted by RufusInPhilly
        Maybe I’m just being optimistic, but I feel like defective hardware is a lot less common than people think,
        Sorry but you are so far off the mark. The mass producers from the east have flooded the world not only with sub-standard goods over the years but also with out-and-out counterfeit items, and this is not just small items like PSUs and RAM but whole computers made to look exactly the same, just with no Dell or HP etc logo on them.


        • RufusInPhilly
          PCHF Member
          • Feb 2022
          • 15

          #19
          Originally posted by phillpower2
          CPU fan speeds ruled out, but did you also check what the thermal shutdown setting was in the BIOS?
          I’m not 100% sure if this is what I should be looking for (see pic below), but for my ASUS mobo it seems the “Maximum CPU Core Temperature” is set to “Auto,” as are a few other related settings on this page. There doesn’t seem to be an actual “thermal shut down” setting though, at least not that I’m seeing.

          Except for the RAM tweak I made above, all the BIOS settings are at their defaults, but it doesn’t say what that threshold currently is. Is this something I should adjust, do you think? Shouldn’t I be looking for GPU-related settings, since it seems the CPU stress tests don’t cause the shutdowns? If I do need to adjust this, what should I set it to?

          [ATTACH type=“full” alt=“bios.jpg”]9083[/ATTACH]
          Originally posted by phillpower2
          Swapping in the GTX 1660 is an idea but being that it draws a lot less power than the RTX 3070 any outcome would not be conclusive, try it and see how it goes in any event.
          Well, it seems the old GTX 1660 works just fine in this system. I ran 3DMark several times and FurMark for 15+ minutes, and although the benchmarks/scores were worse than the newer RTX 3070 of course, no shutdowns happened and everything ran smoothly. But as you said, this may not be telling since it pulls less power.
          Originally posted by phillpower2
          Sorry but you are so far off the mark. The mass producers from the east have flooded the world not only with sub-standard goods over the years but also with out-and-out counterfeit items, and this is not just small items like PSUs and RAM but whole computers made to look exactly the same, just with no Dell or HP etc logo on them.
          Sigh. I guess I was just hoping there’d be something I was overlooking … rather than have to face the possibility of needing to find a new GPU during this never-ending shortage.


          • phillpower2
            PCHF Administrator
            • Sep 2016
            • 15205

            #20
            Not it I’m afraid, you are looking for something that tells you at what temperature the MB will shut down to protect the CPU from frying, check under the Advanced tab, this check is something just to tick on the check list and restoring the MBs default settings would make sure that the CPU will shut down at a safe temperature.

            The cause of the black screen could still be either the GPU or the PSU, so you are at the stage now where you need to either get the PC tested by a local tech using their PSU and your RTX 3070, or RMA the GPU, which was not released until Oct 2020 and so should still be covered by the standard Zotac 2 year warranty.


            • RufusInPhilly
              PCHF Member
              • Feb 2022
              • 15

              #21
              Originally posted by phillpower2
              Not it I’m afraid, you are looking for something that tells you at what temperature the MB will shut down to protect the CPU from frying, check under the Advanced tab, this check is something just to tick on the check list and restoring the MBs default settings would make sure that the CPU will shut down at a safe temperature.
              I’ll keep digging around a little bit in the BIOS. This was from the Advanced tab, but there are a ton of settings and I’m not very familiar with them in general.
              Originally posted by phillpower2
              The cause of the black screen could still be either the GPU or the PSU, so you are at the stage now where you need to either get the PC tested by a local tech using their PSU and your RTX 3070.
              Just so I’m clear, IF the PSU is the problem, you’re saying it would be because it was defective or something, not because an 850 Watt 80+ Gold isn’t providing enough power for this system, correct? I used a power supply calculator and got these results, and I even bumped up the recommendations just to be safe, so I assume it’s getting enough power – in theory.


              • phillpower2
                PCHF Administrator
                • Sep 2016
                • 15205

                #22
                The brand and spec of PSU are not in question, but there is always going to be one that is defective so we can't rule it out. The GPU is the most likely of the two though.

                The CPU's thermal shutdown trigger is of no consequence now, and you have also 100% ruled out a software issue for yourself, being that you would not take my word for it.


                • RufusInPhilly
                  PCHF Member
                  • Feb 2022
                  • 15

                  #23
                  Originally posted by phillpower2
                  The brand and spec of PSU are not in question, but there is always going to be one that is defective so we can't rule it out. The GPU is the most likely of the two though.
                  Well, time will tell now because Zotac accepted my return merchandise authorization request, so hopefully in a couple of weeks they’ll either send me back a brand new RTX 3070 or tell me it was totally fine all along. Also, I swapped out my old Seasonic 750 watt gold PSU into the current system, and the same shutdown problem still happened. It’s slightly less powerful than the 850 watt but still should have been enough. So fingers crossed that Zotac rectify things.
                  Originally posted by phillpower2
                  The CPUs thermal shutdown trigger is of no consequence now + you have also 100% ruled out a software issue for yourself now being that you would not take my word for it
                  Wouldn’t take your word for it? Huh?

                  Anyway, I’ll keep you posted. Not sure if you want to keep this thread open until then, but I’ll let you know how everything turns out!


                  • phillpower2
                    PCHF Administrator
                    • Sep 2016
                    • 15205

                    #24
                    Good news on the GPU at least (y)

                    The techs will most likely stress test your card, remove the fans, clean off the thermal compound, apply a fresh amount, put new fans on and test it, and if all is ok send you the same card back.
                    Originally posted by RufusInPhilly
                    Wouldn’t take your word for it? Huh?
                    Sorry, a bit of mischief making on my behalf but you did take a bit of convincing that the behaviour would not be caused by software.

                    As an aside:
                    Originally posted by RufusInPhilly
                    I swapped out my old Seasonic 750 watt gold PSU
                    What a coincidence, I came across my first ever bad Seasonic PSU the other day. Seasonic not only replaced the PSU but sent a 650W unit in place of the original 550W PSU, and took two years off the 650W PSU's ten year warranty, giving the user eight years left. Not bad when the original PSU only had a five year warranty and was two years old.

                    We can't leave a thread hanging for weeks, so we will give it a couple of days in case you need to add anything further. If not, we will close it and you can send one of us a PM when you are ready and the thread can be reopened.

                    Good luck and thanks for keeping us up to date (y)


                    • RufusInPhilly
                      PCHF Member
                      • Feb 2022
                      • 15

                      #25
                      Thanks so much for all your wonderful help! I really appreciate all the time you’ve given this issue. No problem about needing to close the thread, of course. I’ll write/post again when I get a response from Zotac. They officially received the GPU a few days ago, so now it’s up to them, and hopefully it won’t be long. Thanks again!


                      • phillpower2
                        PCHF Administrator
                        • Sep 2016
                        • 15205

                        #26
                        You are welcome and thanks for the update (y)

                        We will leave your thread as pending until Saturday just in case you hear back from Zotac.


                        • RufusInPhilly
                          PCHF Member
                          • Feb 2022
                          • 15

                          #27
                          I just heard back from Zotac that the GPU failed their tests, so they’re sending me a replacement. It’s a bummer that it was defective and I had to go through this, but I’m so relieved that I now know what the problem is and don’t have to keep trying to solve the mystery anymore.

                          Anyhow, I guess you can go ahead and mark this thread as officially “solved.” And thank you again and again, phillpower2, for all your awesome help! You’ve been awesome!


                          • phillpower2
                            PCHF Administrator
                            • Sep 2016
                            • 15205

                            #28
                            While not ideal, at least the issue is now resolved, and while the GPU is still covered by warranty, which is an added bonus given the ongoing scarcity and expense of GPUs.

                            Fair play to Zotac as well (y)

                            You are welcome btw and thank you for the kind words

