Intel 13th and 14th Gen confirmed defective

SpyderTracks

We love you Ukraine
Steve needs a holiday after that 😖

I did notice at the end that Steve mentioned a PugetSystems article on their experiences (as a system builder), and was very interesting to me as I use PugetSystems as my multi-media benchmarking source, and if their ratios carry over to the wider population then it puts these current levels of failures into perspective vs other generations...assuming PugetSystems are not using bespoke settings for voltages, etc.

What annoyed me about the Puget Systems data was that they implicitly suggest that applying proper voltage limits reduces failures. Purely from a logical standpoint, that makes very little sense to me; what’s more likely is that it delays failures past their testing window.
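As a rough illustration of the "delays rather than reduces" argument, here is a minimal Python sketch. Every number in it is made up (the 12-month mean time to failure, the 3x slowdown from a voltage limit, the 6-month observation window), and it assumes every chip degrades eventually; none of it comes from Puget's data.

```python
import random

def observed_failure_rate(mean_life_months, window_months, n=20_000, seed=0):
    """Fraction of simulated chips that fail inside the observation window.

    Lifetimes are drawn from an exponential distribution, which is an
    assumption made purely for illustration.
    """
    rng = random.Random(seed)
    failures = sum(
        1
        for _ in range(n)
        if rng.expovariate(1.0 / mean_life_months) <= window_months
    )
    return failures / n

WINDOW = 6          # hypothetical: months of field data collected so far
STOCK_LIFE = 12     # hypothetical: mean months to failure at stock voltages
LIMITED_LIFE = 36   # hypothetical: voltage limits slow degradation 3x

stock = observed_failure_rate(STOCK_LIFE, WINDOW)
limited = observed_failure_rate(LIMITED_LIFE, WINDOW)

print(f"failures seen within {WINDOW} months, stock:   {stock:.1%}")
print(f"failures seen within {WINDOW} months, limited: {limited:.1%}")
print(f"limited, given a 10-year window: {observed_failure_rate(LIMITED_LIFE, 120):.1%}")
```

Under these assumptions the limited chips look far healthier inside the short window, yet nearly all of them still fail once the window is long enough. Whether real chips behave like this is exactly the open question.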
 

TonyCarter

VALUED CONTRIBUTOR
Part of the problem (as Steve details) is: what is 'proper' power/voltage? You'd never know based on Intel's confusing terminology and matrices. It's like "don't run it at the official baseline, as that's crap and you won't get the performance we promised, you might as well have bought a 12100F", but also "don't run it at the official performance or extreme levels, unless you want to do a benchmark run for 5 minutes, as it will cook!".

...and if Puget knew about these issues due to the number of failures they saw during builds and then just tweaked the settings to reduce them (and thus performance), then they've been conning consumers out of performance and are still seeing failures down the line.
 

SpyderTracks

We love you Ukraine
Well, I wasn’t aware of this at all!

This is the guy from Puget

IMG_0056.jpeg
 

ubuysa

The BSOD Doctor
It seems to me that gamers and high-end users are most affected, would that be true? Since the issue seems to be with a power delivery algorithm it does seem to make sense that working the CPU very hard is more likely to expose the problem.

My question to the panel then is, if an affected CPU is run at light loads pretty much constantly, then is the likelihood of a failure much lower? I have the latest motherboard BIOS version installed (the one with the eTVB mitigations and of course the new Intel 'default settings') and because I'm not a power user I'm not seeing any issues. I am running a 65W Intel CPU, it's an i5-13400 and my peak CPU use is always less than 20%.

I also think we need to take care not to run around shouting that 'the house is on fire'. Biased or not, I found this part of the Puget Systems article put things into a bit more perspective...
"You can see that in context, the Intel Core 13th and 14th Gen processors do have an elevated failure rate but not at a show-stopper level. The concern for the future reliability of those CPUs is much more the issue at hand, rather than the failure rates we are seeing today. If it is true that the 14th Gen CPUs will continue to have increasing failures over time, this could end up being a much bigger problem as time goes by and is something we will, of course, be keeping a close eye on. 14th Gen isn’t as rock solid as Intel’s 10th or 12th Gen processors, but at least for us, it isn’t yet at critical levels.

"Based on the failure rate data we currently have, it is interesting to see that 14th Gen is still nowhere near the failure rates of the Intel Core 11th Gen processors back in 2021 and also substantially lower than AMD Ryzen 5000 (both in terms of shop and field failures) or Ryzen 7000 (in terms of shop failures, if not field). We aren’t including AMD here to try to deflect from the issues Intel is currently experiencing but rather to put into context why we have not yet adjusted our Intel vs. AMD strategy in our workstations."
It's also my understanding that Intel's proposed microcode update will prevent any future damage to processors, and is in that sense a fix, although there's bound to be a performance implication of course. What it won't do is fix an already degraded CPU. Thus, gamers aside, who always want as much performance as they can get, your average Intel CPU user will soon have a permanent fix and is (IMO) less likely to have an issue in the first place.
 

TonyCarter

VALUED CONTRIBUTOR
I don't know how much influence Intel has on its Board of Advisors, or vice versa, but here's a link to a PDF about who they are and what they do.


It sounds more like a group of top distributors/resellers than Intel shills, but I could be wrong ;)

Just noticed this too...according to Puget, the AM4 and AM5 systems failed MORE OFTEN than 13/14th gen Intel...which doesn't seem to match the failures we've seen or heard about ourselves...
Puget-Systems-Intel-CPU-Failure-Totals-by-Group.webp


Although I'm thinking a lot of those AM4/AM5 failures were not the CPU, as in Intel's case, but the dodgy BIOS that was being used at the time, as a BIOS update seemed to fix those failures without damaging the CPU or motherboard. So it would be good to see a root cause analysis / failure mode of those instances.
 

SpyderTracks

We love you Ukraine
I have been getting proper hellbent on this, in line with other social issues at the moment, my perception of justice is needing proper nursing.

I'll back off from my opinions and just post any official findings.
 

Scott

Behold The Ford Mondeo
Moderator
Your CPU should be fine, as it's generally only the 1x600K and higher chips that are seeing these issues. There may be more, but I think they will be in the minority.

With regard to the failure rate: data centres are seeing 50%, and that's significant. They are also switching over to AMD as they come out of contract, simply because of how badly the failures are impacting them. Class actions and lawsuits, along with server farms being hit in the pocket, do indeed suggest to me that the house is on fire, not to mention the share price.

The gamers and enthusiasts are going to be almost glossed over, given how bad the issue is in other sectors. I'm unsure which data Puget are referring to though.
 

Scott

Behold The Ford Mondeo
Moderator
I think the below graph is the most telling of the issue. I'm not sure what issues they had with 11th Gen, I certainly don't remember enthusiast issues..... but with that said, was that not a DOA launch where most people didn't bother upgrading?

It's worth noting that Puget seem to have used some common sense when setting up their systems. They have mitigated most of the power issues that they could by using default settings from the beginning (proper default settings) yet they are still seeing an expedited spike in failures.

It reads like wordplay to me as the below shows an issue with 14th gen for sure.

Puget-Systems-Intel-Core-CPU-Failures-Per-Month-and-Generation-1024x481.png
 

SpyderTracks

We love you Ukraine
Puget are very good, there’s no questioning that they put a lot of effort into their systems and will optimise them exceptionally well.

It’s just that there’s not enough info yet on this latest data they’ve posted. When asked on Twitter what BIOS settings they’d applied, Jon’s response was as below:




IMG_0057.png

I am very interested to see what BIOS settings they’ve applied, and also the timeline of testing with those settings, as in the main it’s taken roughly 3-6 months for deterioration on a healthy processor to reach instability.
 

TonyCarter

VALUED CONTRIBUTOR
Wonder if Puget will start to see increasing customer failures as the systems become 3/6/12/24 months older, and what they will do about it, considering downtime will be directly costing their customers money :rolleyes:
 

Scott

Behold The Ford Mondeo
Moderator
Are the benchmarking tests done with the same parameters as the systems that are sent out? If they are then my guess is this microcode update won't have quite the same impact in their testing as it will with others going with the on-board Intel settings during the initial launch.

If they aren't done with the same parameters and are more in line with what Intel recommended prior to their stability tweaks, then I guess that should be (or has been) communicated as such?
 

SpyderTracks

We love you Ukraine
This is just through as well, interview with the Alderon Games CEO who was the first to go public.

Skip to 38:10, it’s worth watching, he goes into quite good detail on it

 

SpyderTracks

We love you Ukraine
There have been 2 beta BIOSes released, the latest this morning by MSI.

The previous one was with Microcode 0x125
The new one is Microcode 0x129

They're both beta BIOSes, so neither is what would actually be released. On testing, there's very little performance impact and it barely adjusts voltages; I don't think by any means this is what would hit consumers.

PCS I don't think would ever accept flashing a Beta BIOS, just too risky.
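
For anyone who does flash one of these and wants to confirm which microcode revision is actually loaded, here is a minimal sketch for Linux. It assumes the x86 `/proc/cpuinfo` layout, where each logical CPU has a `microcode : 0x...` line. The function names are just for illustration, and the 0x129 threshold is only meaningful on the affected 13th/14th Gen desktop parts, since microcode revision numbers are specific to each CPU family.

```python
import re

def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions (as ints) found in cpuinfo text."""
    pattern = re.compile(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", re.MULTILINE)
    return {int(m.group(1), 16) for m in pattern.finditer(cpuinfo_text)}

def has_at_least(cpuinfo_text, minimum=0x129):
    """True if every reported revision is >= minimum (False if none found)."""
    revs = microcode_revisions(cpuinfo_text)
    return bool(revs) and min(revs) >= minimum

def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file; returns an empty string if it is unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""

if __name__ == "__main__":
    text = read_cpuinfo()
    revs = sorted(microcode_revisions(text))
    if not revs:
        print("No microcode line found (not Linux/x86, or restricted access).")
    else:
        print("Loaded microcode:", ", ".join(hex(r) for r in revs))
        print("At least 0x129:", has_at_least(text))
```

On Windows the revision is usually read from tools such as HWiNFO instead; this sketch does not cover that.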
 

Ekans2011

VALUED CONTRIBUTOR
Buildzoid provides a good test of microcode 0x129.

The video's main point may be summarised in one sentence (@22:30): the whole issue with this is like the end user is not supposed to have to fix this kind of thing.

 

SpyderTracks

We love you Ukraine
der8auer's microcode test (@04:08)

Skip to 8:10 as well: very interesting about Intel's officially supported RAM speeds. It makes zero sense; basically, higher-end boards with 4 DIMM slots officially support lower RAM frequencies than lower-end dual-slot boards!

Well done Intel, making a lot of sense as always!
 