Solved Brand new GPU with multiple strange issues

Status
Not open for further replies.

AGDeveloper

PCHF Member
Oct 22, 2023
Recently I decided to upgrade my PC, going from a GTX 980 to an RTX 4070 Ti. In doing so, I seem to have unleashed some form of ancient evil. After multiple days of trying to fix it, I'm completely stumped on what to do, and fixing it is well above my knowledge level.

The 3 big issues are:
1. Constant loss of video
2. Weird and intermittent visual glitch, almost like artifacting
3. Occasionally unable to POST with VGA issue which resolves itself for no apparent reason

My specs are:
Mobo - MSI X370 Gaming Pro Carbon (MS-7A32, BIOS ver. E7A32AMS.1L0)
CPU - Ryzen 7 1700 @ 3.0GHz, rolled back from 3.8GHz
RAM - 2x 8GB Corsair Vengeance LPX DDR4, in slot 1 and 3 (DIMMB2 & DIMMA2)
GPU - Gigabyte GeForce RTX 4070 Ti (https://www.overclockers.co.uk/giga...dr6x-pci-express-graphics-card-gx-1g0-gi.html)
PSU - Phanteks Revolt 1000w (https://www.overclockers.co.uk/phan...ar-80-plus-platinum-cable-free-ca-0cp-pt.html) + Phanteks Revolt Cable Starter Kit
Boot drive: Samsung 980 Evo M.2 NVMe, 250GB. Running Windows 10 on latest version & updates

Here's what's happened when and all the troubleshooting steps I've taken:

Before I got the new GPU and PSU, the PC ran fine for years with an MSI GTX 980 and a Novatech Power Station 750w Black Edition.

Upon installing the new components, everything was seemingly fine and I was able to get into the BIOS.

After moving the PC, I ran into issue 3: the PC refusing to POST with a VGA issue. This was confirmed by the "EZ Debug" LED, which was stuck on VGA, and a series of beeps (1 long, 2 short) which corresponds to a video card error.

I found that to get around this issue I simply needed to jiggle the card around a bit, pushing it into the PCIe slot and slowly releasing pressure, which eventually worked. This makes me believe that due to its sheer size and weight, the card isn't sitting in the PCIe slot correctly, despite it being all the way in and mounted correctly, including using the anti-sag bracket. However, this issue does seem to come back randomly every so often, and simply powering the PC down and back up magically resolves it.

Once I was properly into Windows after that nightmare, I proceeded to do a driver update via GeForce Experience. It worked fine for an hour or two before suddenly, out of nowhere while idling, both my monitors went black and all fans in the PC ramped up for about a second and a half, before returning to what I'd expect at idle. The video signal never came back, and remoting into the PC via Google Remote Desktop greeted me with a blank screen with only the cursor visible, which is how it stayed until I rebooted. The issue would then reappear 5 to 10 minutes after logging in.

To try and remedy this, I uninstalled the driver in Device Manager, to which Windows responded by downloading it again. This seemed to solve the issue for about a day, after which it returned. In response, I did the same thing, and followed it by nuking the drivers from orbit using the Display Driver Uninstaller utility found online and letting Windows fetch the driver again. This solved it for a day and a half, before it returned again.

I decided to dig deeper. In Event Viewer I did my best to track down what exactly was causing this, and found that Desktop Window Manager had crashed before every occurrence. Reliability Monitor told the same story. I also ran a check to see if Windows is in any way corrupted ("sfc /scannow"), which found no issues. I also tried safe mode, to no avail, and a clean boot (disabling all services), also with no luck. Upon confirming it wasn't changing a thing, I re-enabled all the services and rebooted.
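For anyone wanting to repeat the Event Viewer check from a script rather than by clicking through the logs, here is a minimal Python sketch. It assumes the DWM crashes are recorded in the Application log as "Application Error" events with Event ID 1000 (the usual record for a crashed program, but not confirmed from the logs in this thread), and that wevtutil's text output starts each record with a line beginning "Event[". The function names are made up for illustration.

```python
import subprocess

# XPath filter for "Application Error" crash records (Event ID 1000).
# Assumption: the DWM crashes on this machine are logged this way.
CRASH_QUERY = "*[System[Provider[@Name='Application Error'] and (EventID=1000)]]"


def build_wevtutil_command(max_events=50):
    """Return the wevtutil command listing recent application crashes."""
    return [
        "wevtutil", "qe", "Application",
        f"/q:{CRASH_QUERY}",
        f"/c:{max_events}",  # cap the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # plain-text output instead of XML
    ]


def split_events(text):
    """Split wevtutil text output into one string per event.

    Assumes every record begins with a line starting "Event[".
    """
    events, current = [], []
    for line in text.splitlines():
        if line.startswith("Event[") and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events


def crashes_for(text, process_name="dwm.exe"):
    """Keep only the events that mention the given process name."""
    needle = process_name.lower()
    return [event for event in split_events(text) if needle in event.lower()]


def recent_dwm_crashes(max_events=50):
    """Run the query (Windows only) and return the events mentioning dwm.exe."""
    result = subprocess.run(
        build_wevtutil_command(max_events),
        capture_output=True, text=True, check=True,
    )
    return crashes_for(result.stdout)
```

On the affected PC, calling `recent_dwm_crashes()` and printing each returned entry would show whether a dwm.exe crash was logged around the time of each black screen. The filtering by process name is done in Python because the event log's XPath support is limited.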

Eventually the issue happened again and on rebooting, the issue just decided not to happen... I really cannot explain this as I had done nothing different. It just decided on its own to work... Which lasted all of 2 hours at which point it got fed up and decided to have a snooze with the same issue yet again.

This time, however, it's become more severe, as I can't even get to the login screen. It POSTs, Windows loads, and right after that it fades to black and stays there. The monitor is clearly active, but nothing is being displayed.

In regards to the strange artifacting-like glitch, it was noticeable during the periods where I could use the PC before it had a mental breakdown. It only ever occurred on the second monitor, on both DP and HDMI. It's definitely not the monitor, as plugging in anything else, such as a laptop, or viewing content such as YouTube on it does not reproduce the issue, and it's only been present since the 4070 Ti was installed. There's no way to properly test whether it happens with just the monitor plugged in, as the moment my second monitor is unplugged, regardless of DP or HDMI, the same blank screen issue happens - I can only use both or the second, never the first alone, another issue that only appeared once the 4070 Ti was installed. I have attached a video showing it happening when it was about halfway down the monitor; however, it does travel from the top to the bottom, and only appears every 3 to 5 mins.

The biggest issue here is the constant loss of video, which has gotten so bad that the PC is unusable, to the point where I can't access anything. I have tried everything to get it to work, but at this point I'm left at the mercy of what mood it's in at that moment. Posting here is truly my last resort, other than calling in an army of priests to perform an exorcism on it.

I do think I've done a lot more than I've mentioned already to fix it, however I'm so mentally exhausted I can't remember what exactly, so if I've missed anything my apologies!

Thanks in advance :)
 
While troubleshooting stick with just the one screen connected to the GPU.

CPU - Ryzen 7 1700 @ 3.0GHz, rolled back from 3.8GHz

I suggest you restore the MB's default factory settings in the BIOS.

These are sometimes listed as one of the following: "factory defaults", "most stable" or, on newer boards, "optimized". Please note that if you have both the "most stable" and the "optimized" options in the BIOS, you should choose the "most stable" option, as in this instance the "optimized" settings are a form of overclocking that can cause instability.
Save the new settings, exit the BIOS, restart the computer, and test by using the computer as you normally would. Post back with an update once you have done this.
 
~trimmed~
Okay so I've reset the BIOS settings back to factory. I also forgot to mention this (knew there was something), but I have done this before as part of troubleshooting by removing the CMOS battery and waiting 30 mins. That didn't help, as the issue reoccurred not long after. However, I've done so again after resetting back to factory, to really make sure.

As shown in the attached image, it successfully cleared. The only option I changed was turning post beep back on, and I confirmed that was the only setting changed. I have been able to boot back into Windows successfully after doing so.

Usually it can take anywhere from 5 minutes to up to 24hrs before the issue reoccurs, so unless there's things that I should be trying I'll report back when it happens again, or when 24hrs has passed without issue.

Thanks.
 

That is not what was suggested, and for a couple of very good reasons: restoring the settings in the BIOS following the steps provided does not put anyone or their hardware at risk of harm, and removing the CMOS battery does not restore the MB's best settings.

The only option I changed was turning post beep back on

Good call and an idea to always have this enabled for troubleshooting purposes.

Stick with not changing anything at all for the next couple of days at least and use the PC under load as much as possible to see how you get on.
 
That is not what was suggested, and for a couple of very good reasons: restoring the settings in the BIOS following the steps provided does not put anyone or their hardware at risk of harm, and removing the CMOS battery does not restore the MB's best settings.
Ah that's my bad then. I always assumed they were one and the same. It's good to know for the future!

And will do, and I'll report back if it happens again.
 
No worries and fwiw, you are comfortable with working inside of the case, whereas someone not so savvy could take it upon themselves to do the same thing and have a mishap.
 
Yeah that's understandable. I quite like to get hands on, and I've stripped down and rebuilt a PC quite a few times over the years for things like cleaning, troubleshooting etc. While installing the components I did strip it down fully to clean all the fans, case, stuff like that, and tried to get the spare PC running with my old GPU as my partner's laptop is on its way out. Thankfully I tested it before building it back into the case, as I found the mobo had fried itself, the second one that PC has claimed, and I'm done spending more money on bringing that old dog back to life. Apart from the "basics", though, I don't know much else; I'm more comfortable with the software side of things.

Anyways, the issue has happened yet again. I wasn't at my PC when it happened, however it couldn't have been more than 30 minutes between the last check in and when I found it. Same exact issue - Black screen, monitors have signal, unresponsive to even the power button. I wasn't able to remote in to see if it's identical as Virgin Media has gone for a nap leaving me without Internet, however it looked identical to what I'm unfortunately used to by now.

The PC has been idling all day apart from a short 10 minute load from an early afternoon session of Beat Saber. The only loads it's experienced since clearing CMOS have been the same games I played during the 2 stints of it working - Cyberpunk 2077 (Raytracing Ultra preset), Final Fantasy XV (Ultra + 4K Texture DLC) and Star Citizen (Ultra). The only thing I did differently was monitor GPU temps using HWiNFO64 on the off chance that's what's causing it, but the GPU has stayed at around 46C idle and 52C on heavy loads.
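Since the crashes tend to happen while nobody is watching, one option is to log the GPU readings to a file so the last values before a black screen survive the reboot. The following is a rough Python sketch, not something used in this thread: it assumes the `nvidia-smi` tool installed with the NVIDIA driver is on the PATH, and the choice of fields, the file name `gpu_log.csv` and the 5 second interval are all arbitrary.

```python
import os
import subprocess
import time

# Fields requested from nvidia-smi; which ones to log is an assumption.
FIELDS = ["timestamp", "temperature.gpu", "power.draw", "utilization.gpu"]


def build_nvidia_smi_command():
    """Return the nvidia-smi command that prints one CSV line per GPU."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]


def to_float(value):
    """Convert a reading to float, or None when the driver reports e.g. [N/A]."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_sample(line):
    """Turn one CSV line from nvidia-smi into a dictionary of readings."""
    parts = [part.strip() for part in line.split(",")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"unexpected nvidia-smi output: {line!r}")
    timestamp, temp, power, util = parts
    return {
        "timestamp": timestamp,
        "temp_c": to_float(temp),
        "power_w": to_float(power),
        "util_pct": to_float(util),
    }


def read_sample():
    """Query the first GPU once and return the parsed readings."""
    result = subprocess.run(
        build_nvidia_smi_command(), capture_output=True, text=True, check=True
    )
    return parse_sample(result.stdout.splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=5):
    """Append a reading every few seconds, forcing each line onto the disk."""
    with open(path, "a") as log:
        while True:
            s = read_sample()
            log.write(f"{s['timestamp']},{s['temp_c']},{s['power_w']},{s['util_pct']}\n")
            log.flush()
            os.fsync(log.fileno())  # make sure the line survives a hard power-off
            time.sleep(interval_s)
```

Leaving `log_forever()` running in a console window and reading the tail of the file after the next forced shutdown would show whether temperature or power draw moved just before the screens went black.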

I've managed to boot in again just fine this time round and am monitoring it for any more occurrences. Strangely, this time nothing in Event Viewer or Reliability Monitor hints at dwm crashing. The last occurrence recorded by both was the event that led me to post here. But I'll keep an eye on it.
 
Download MiniToolBox and save the file to the Desktop.

Close the browser and run the tool, check the following options;

List last 10 Event Viewer Errors
List Installed Programs
List Devices (Only Problems)
List Users, Partitions and Memory size

Click on Go.

Post the resulting log in your next reply for us if you will.
 
Here's the report, attached to the bottom of this message.

I should stress that at the time of the log no issue has occurred since the last message, and the PC has been on all day idling to try and repeat the crash conditions from my last post (24/10/23 11:27, time now 25/10/23 04:23). The last error was also irregular - no dwm/Desktop Window Manager crash was reported in the Event/Reliability log, unlike most times.

I also have Speccy logs from the first usable boot just after my first post, and from now. I can provide them if needed.
 

Couple of problems there but will wait and see how you get on before elaborating.
 
Since the last bootup in my last reply, I haven't had the time to stress the PC in the same way due to IRL issues taking priority. However, I did manage to leave it idling all night Thursday with no issues, which led me to plug my second monitor in for further testing, and all day Friday without issues (bar one strange recurring one, I'll touch on that later). All day yesterday into today (Saturday to Sunday) I have been stressing it with a few abnormal but stressful tasks, to no avail. I cannot recreate the issue whatsoever anymore.

This is strange - I used to get it regularly, but ever since my forum post here, I cannot get it to happen fully even once. I fear I may be dealing with a case of "call tech support because the issue will always fix itself before you get connected". It's the only way I can explain this abnormal behaviour, even more so as, bar running MiniToolBox, I have not done anything different and you have not suggested anything that I had not already tried beforehand. I can almost guarantee that once this topic is closed and my support from here ends it'll reoccur, most likely out of spite more than anything.

In regards to the other issue, I have found that as of recently, the GPU is sending some form of "kill" signal to my monitor, causing it to shut down. Before, even while the issue was happening, my monitor could fully detect the signal coming from my GPU and even turn on and display properly if it detected it during sleep mode - its default status. It could also go into auto-standby mode just fine, and reawaken to have the PC go back from 1 into 2 monitor mode, and it would have signal. Now, however, it seems to shut down randomly without cause or reason, and it can only detect the signal if both the PC is powered on and the monitor is on before POST, OR the cable is unplugged and replugged. When it does go into the random "sleep", it shuts down fully, something that should not be possible as I have disabled it. It also cannot detect the signal upon its bootup unless the cable is unplugged and reinserted, and best of all, the PC acts like it has a second display despite it being completely off. I have tested it on both HDMI and DP connections, and no dice. Only my second, older and much more basic monitor seems to have luck keeping the GPU's attention, also on DP and HDMI, and with the reverse applied to both. I truly cannot explain this, and the behaviour it's exhibiting is so left field I'm still recovering from the whiplash. However, I'll throw it in with the other issues to be fixed, like the artifacting, as it's not a priority.

Attached below are the screenshots of the last dwm crash from before I made this post - from Reliability Monitor and Event Viewer.

I apologise for taking this long to solve this issue, but it's out of my hands and now it's truly proven to be well and above anything I could ever solve on my own. I hope at least the pain this computer is having me endure is causing enough curiosity out of the sheer perplexity it's causing to make this worthwhile.
 

No worries how long this takes but it is important that we stick to one thing at a time, going off topic is both unadvisable and frowned upon.

Fingers crossed but you may have resolved the original issue by restoring the MBs default factory settings, give it another couple of days and then post back.
 
Looks like I was too quick to call it done. It has happened again. To make sure it was not a fluke, after bootup I went straight back to what I was doing before and managed to get it to happen again, much faster for whatever reason. Attached are the 2 MiniToolBox logs taken right after bootup after each issue.

What I was doing during both crashes is playing Cities Skylines 2. I've been playing it for 2 days now without issue, and when these 2 crashes occurred I had only just turned the PC on again today since the end of my last session from the late evening into today's very early morning.

The issues happened just like before - screens go black, all fans spin up, and the PC is accessible via remote desktop but nothing but a blank black screen is viewable. When the first one happened, I immediately recognised it and forced a shutdown. On the second one, however, I decided to wait and see what happens if I let it try to shut down on its own (after pressing the power button). It worked, but I managed to encounter a related issue I haven't seen since making the initial post here - being unable to get to the logon screen. Every time the PC does the usual load after POST it goes blank and stays that way. I managed to resolve the issue by forcing a shutdown at the blank screen, at which point the next bootup worked just fine, so I seem to have found a way to replicate and "resolve" that at least. Both issues seem to be directly related, as this behaviour has not happened without a crash prior.

I have checked Event Viewer and Reliability Monitor after both cases. After the first nothing looked out of the ordinary. After the second however, the familiar dwm crash is present. I have included the screenshot of Reliability Monitor to the post as well.

Edit: After the first crash I also began to monitor temps to see if that may be related while trying again, however it is not. Temps looked just like they have in the times I was checking them well before the crashes.
 

What I was doing during both crashes is playing Cities Skylines 2.

How many monitors did you have connected?

Just looking at your first MTB log we can see a major problem and one that has possibly caused Windows to become corrupt, the other errors could be as a result of this so Windows needs to be addressed first.

1 Drive c: (WHOA, DUDE!) (Fixed) (Total:232.34 GB) (Free:26.88 GB) NTFS

See my canned info below;

For Windows to run efficiently and to be able to update, you need to keep between 20 and 25% of the partition or drive free on an HDD, and between 10 and 15% on an SSD, at all times. If you don't, you risk Windows becoming corrupt or not being able to update, which puts you at risk of malware attack.

Data only storage devices should not be allowed to get any lower than 10% of free storage space of the full capacity of the drive/partition on the drive, this also to avoid data corruption.

Please note that storage devices can physically fail if the amount of free storage space is allowed to drop below the required 10 or 20/25% minimum.

Uninstall as many unused programs and games, and remove as many video and music files, as you can, and get yourself another means of backing up. Post back when you have between 20 and 25% free storage space on the C: drive/partition and we can go from there.
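To put numbers on that for this particular drive, here is a small Python sketch of the arithmetic. The total and free figures are taken from the MTB log line quoted above (232.34 GB total, 26.88 GB free); the function names are made up for illustration, and `shutil.disk_usage` is only included as one way to read the live figures instead of typing them in.

```python
import shutil


def percent_free(total_gb, free_gb):
    """Free space as a percentage of the whole drive/partition."""
    return 100.0 * free_gb / total_gb


def extra_gb_needed(total_gb, free_gb, target_percent):
    """How much more must be cleared to reach the target free percentage."""
    needed = total_gb * target_percent / 100.0 - free_gb
    return max(0.0, needed)


def live_figures(path="C:\\"):
    """Read total and free space for a drive, in GiB (optional helper)."""
    usage = shutil.disk_usage(path)
    return usage.total / 1024**3, usage.free / 1024**3


# Figures from the MTB log line for drive C:
TOTAL_GB = 232.34
FREE_GB = 26.88

print(f"Currently free: {percent_free(TOTAL_GB, FREE_GB):.1f}%")  # 11.6%
for target in (20, 25):
    extra = extra_gb_needed(TOTAL_GB, FREE_GB, target)
    print(f"To reach {target}% free, clear another {extra:.1f} GB")
# 20% -> another 19.6 GB, 25% -> another 31.2 GB
```

In other words, the drive needs roughly 46.5 GB free in total to hit the 20% mark and roughly 58.1 GB to hit 25%.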
 
How many monitors did you have connected?

Two. As mentioned in my previous post on Sunday:

... However, I did manage to leave it idling all night thursday, to no issues causing me to plug my second monitor in for further testing, ...

I felt it was stable enough to try at least and see if it made a difference, which it did not and continued to run fine. Since the first crash I've returned to using the second monitor only, and haven't yet tried to properly test it as I've needed the PC to be semi-functional to write emails and access documents.

As for the storage space issue, I will try and do as you suggest and get up to 15% free as it says to do for SSDs, which is what the current install of Windows is on. After which I'll try again with the same games I played to see if it invokes the same response as it did with 2 monitors.

However, this all leads me to believe it's a software issue, and not a hardware issue? At this point I think that's the most important question, as if it's just software (i.e. Windows) then I'm not too concerned, but if it's hardware then it's a different story. I've cheekily used this issue persisting for so long to have my other half sanction a full-on upgrade, with the only missing pieces being the RAM, arriving this time next week, and the case, which is due in on Tuesday. Mobo, CPU and AIO are already here and have been waiting. With the upgrade I plan to nuke the drives anyway for a fresh start, so that should resolve it if it's just Windows being Windows, and this rig will return to using the original PSU and GPU, which ran without issues for 5 years before the new GPU caused enough grief to warrant me seeking help from you, for my partner to have.
 
Only a few short minutes after making the previous post, it occurred again. I did not have a chance to do anything whatsoever - I booted the PC up for the first time today, made the reply here, and was occupied with a phone call when it suddenly happened. MTB log is attached as always. 1 monitor was connected, and the only thing running was Chrome with only 1 tab open, being this thread.

What's strange about this one is the reoccurrence of a different issue, which also has not happened since the original post here: the VGA error beep and debug LED. I shut the PC down like I always do when this issue happens, and upon starting it again, it gave me the 1 long, 2 short beeps and the debug LED stuck at VGA. I tried again and the same thing happened. On the third try it worked, and here I am back in Windows, making a second reply. I honestly can't tell if they're related or not.

I have checked Event Viewer and Reliability Monitor as always and it's the same story - DWM is present inside a hearse.

Edit: Forgot to mention in the previous post, but I think I did mention the fact that I have checked windows for corruption already back at the start (using instructions provided by Microsoft themselves) to which it reported as not having any and everything checking out.
 

Two. As mentioned in my previous post on Sunday:

I was asking about the previous episode, because you were specifically asked to only have the one monitor connected.

While troubleshooting stick with just the one screen connected to the GPU.

As for the storage space issue, I will try and do as you suggest and get up to 15% free as it says to do for SSDs,

That is a canned speech that I wrote and, as I asked, post back when you have between 20 and 25% free storage space on the C: drive/partition and we can go from there. The 15% minimum is for a fully up to date computer that is not having issues; your computer does not presently come under either category.

Edit: Forgot to mention in the previous post, but I think I did mention the fact that I have checked windows for corruption already back at the start (using instructions provided by Microsoft themselves) to which it reported as not having any and everything checking out.

Your seven consecutive Desktop Window Manager related crashes suggest otherwise; Windows is broken.
 
Ah. I ran 1 monitor from when you told me to do so up until I said I went back to 2 in that post, and returned back to 1 after the second crash on Wednesday to make the post, which has continued to be the case to now.

And the 15% free space now makes sense. I was in a bit of a rush so got a bit confused, but it's clear now that I need 25% or more. Unfortunately, that won't be possible. The issue has progressively gotten worse and worse, as in, it's happening sooner and sooner, to the point where I don't even have time to log in to Windows before it dies in the exact same way as before. The most I managed to get up to was 40GB free.

Yeah I do realise something with it is very broken, just wanted to let you know that I've at least tried using its own tools.

I'm not sure how to proceed at this point. I can't just call it done and wipe the PC as I still have files and programs which I've not yet backed up, but at the same time it won't stay on long enough to log in, let alone continue backing up and freeing space.
 