PS3 Frankenstein PHAT PS3: CECHA with 40nm RSX

The CELL doesn't appear to be prone to BGA defects and wasn't affected by Bumpgate. IBM seemed to know what they were doing. That's not to say it can't go bad, it can. It's just not defective like the 90nm RSX is.

Of course the CPU does overheat because of old thermal paste between the die and heat spreader, so that does need to be changed. This needs to be done with a good tool to avoid killing the CPU. And the IHS needs to be glued back on (with high temp silicone, not super glue). Otherwise you would be creating a defective CPU that would eventually get a BGA defect.

The 90nm RSX will eventually go bad. There is no way to tell how long it'll be. But it's not as simple as replacing a defective RSX with a good one. The heat needed puts everything on the board at risk! It should not be looked at as an "upgrade." It is a "repair." You should only consider it to resurrect a dead console. That's why we call it the Frankenstein mod.
But after delidding I just apply the thermal paste between the IHS and CELL and put it back carefully so that it does not slip, same for the RSX.
Is this method wrong or harmful? Is it necessary to stick it back?
 
@squeept (anyone really)...

I have an E01 frankie (40nm) that works great, except for one annoying thing. It errors with 3 beeps and a red flashing light on every shutdown. Error 80 1001 or 90 1001. Got any ideas?

Not overheating. Temps are good. Super stable in game. Been testing it fairly extensively, developing custom fan curves. So I'm confident in the RSX install. Clock batt is fine. AV works. HDMI is good. WiFi, Bluetooth, PS1/2/3 games...all good. It just 3-beeps on every shutdown! Otherwise a great working frankie. Frustrating!
 
Well, the whole point of delidding is to reduce temp swings, which equate to smaller expansion and contraction stress. However, without the IHS to stiffen the interposer, it can flex more with each mini thermal cycle. So yes, the IHS should be adhered to the processor package using appropriate adhesives. This serves an important function as a brace and will extend the service life of the solder connections (BGA).

You could argue the smaller temp delta might outweigh the gains of having the IHS adhered in place. But there's no way to know without modeling it in CAD and simulating it. And even then it's hypothetical. Real world always differs some.

So is it hurting? Probably. Maybe not. But why take the chance? Just put the IHS back on the CELL. Don't delid the RSX unless it really needs it (which IMO is rare). If you do, then choosing the right adhesive is an unknown ATM.

We don't know what Nvidia used and are really limited by what's available for that use today. Which is probably going to have a death grip on the VRAM. So it's a 1 time deal. That's why I was thinking of using a graphite thermal pad. It's not as good as decent paste, but it never degrades. So once the IHS is in place, there is never a need to delid again.
 
The last one I remember doing that was a few years ago... I think it eventually crapped out back to YLOD during stress testing. This was before syscon codes and such and the memory is fuzzy, so... that's all I got.

I believe the fix is to remove the speaker... that's definitely straight from the service manual.
 
Other than the RSX, could a bad or stuck sensor in the disc drive cause that? Try an entirely different drive, PCB, and cables, then see if it still happens?

edit: forgot to @ you @RIP-Felix
 
Yes, something weird has been going on with the 90nm RSXs from the start. But this is nothing new, come on.
They obviously fail more than they should for whatever reason. But calling them all "defective" can be misinterpreted very fast.

First of all because some people may read that and start to feel bad about their perfectly working machine when there's no need.

Second because even if there were something wrong about their manufacturing, how can we know how many of them were actually affected by this?
And even if ALL of them actually had this "defect", what does this really tell us on a case by case basis?
Not that much, because this "defect" is just yet another point of possible failure. Just because they have it, it still doesn't mean that every chip will meet the same fate. (And before something else breaks first)

That's without even trying to solve the mystery of what's actually going on inside the chips, which is almost impossible to know.

I also don't know what the real ratio of the different kinds of failures is, and I wish I knew. But I'm pretty sure that number is not close to 100% at all. I'm not even sure if it is over 50%.

In a similar way, the 40nm chips would be touted as "non-defective", when in reality they still are a notorious point of failure in their respective boards. Perhaps they may no longer have "the defect", but they still share many of the possible points of failure that the old ones had.

Did we already forget about the original post of this thread for example? The official 40nm frankenstein from SONY Japan that still developed RSX related issues.
 
Quite interesting postings. Thank you. :)

Though, since the IHS is some kind of metal, I guess it may expand due to the heat more than the board/interposer of the RSX itself...
Maybe this can cause BGA damage (too), just because of its glue: it sticks together and introduces a lot more stress to the board and the ball grid?
Could it be that Sony/Nvidia realized this too late, and only ditched the IHS with the 40nm chip series?

Still wondering if a 'free floating' (non-glued) IHS (sticking to the heatsink) would be a bad thing for the 90/65nm ones. ;)
 
  • VDD_MEM = 16.0 Ω
  • BE_VDDC = 1.6 Ω
  • BE_PLL = 1.685 MΩ
  • BE_MC2_VDDIO = 17.25 kΩ
  • YC_RC_VDDIO = 12.6 Ω
  • RSX_VDDR = 349 Ω
  • RSX_VDDC = 1.9 Ω
  • RSX_PLL = 325 kΩ
  • RSX_FBVDDQ = 229 Ω
  • RSX_VDDIO = 96.4 Ω
  • VDDA = 3.8 Ω
  • 1.7V_MISC = 16.5 kΩ
  • 3.3V_MISC = 3.632 kΩ
  • 5V_MISC = fluctuates, rising and falling between 54 kΩ and 4 kΩ
  • 5V_HDD = 1 kΩ
  • 5_USB = 1 kΩ
  • 5V_BD = ~4 MΩ, not stable; it starts falling as soon as it's probed
  • 12V_BD = 4-5 MΩ, fluctuates

Hi RIP-Felix, thanks for all your tutorials. I learned a lot from them! I am new to Ohm testing like this, is there a tutorial or post so that I can learn how to test these values?

I know RSX_VDDR and RSX_VDDC correspond to some solder balls on the RSX, but how do I test them when the RSX is still soldered on the motherboard?
https://onedrive.live.com/view.aspx?resid=A207CDBFBD6AE582!883750&ithint=file,xlsx&authkey=!APj6-klc3FE6INg
 
Don't overcomplicate it. You're falling victim to the complexity trap: the idea that because a system is inherently complex, it can't be broken into its constituent parts and understood.

I'll make this simple for you.

All 90nm RSXs have defective bump underfill chemistry. The issue was called Bumpgate, and all Nvidia chipsets from 2006-8 were affected by it. Technically the 65nm RSXs were made during the Bumpgate window too; they just produce less heat and are thus more reliable. The underfill was improved soon thereafter. The 40nm doesn't appear to be affected. Whether or not this defect manifested was greatly dependent on the thermal environment of the chip. If you won the RSX chip lottery and got one that runs cool, it would last much longer than one on the hotter end of the spectrum. The PS3 thermal design was optimized for sound, not thermals, so the chips are allowed to run hot. And if you don't dust or reapply paste every 5-7 years, it can hasten the YLOD. It's about thermal cycles to failure. The greater the temperature difference in each thermal cycle, the faster the defect kills the RSX. It's not just BGA cracks, it's also defective bump underfill.

You can reball all you want and that problem will still be there. It needs to be replaced! The best time would be when you first get the YLOD (3034), so you don't put the board through needless reflow cycles, vainly reballing a defective chip. Better to just put a 40nm on and "hopefully" never need a reball again.

That's what I mean by all 90nm RSXs being defective. They run too hot for their thermal design and underfill chemistry. Thermal expansion leads to inherent unreliability that may crop up under hot conditions. However, you can extend the mean time to failure by reducing the temperature, which we already know.
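
For anyone who wants to put rough numbers on the thermal-cycle point, here is a minimal sketch using a simplified Coffin-Manson relation, where cycles to failure scale as ΔT^(-n). This is a generic solder fatigue model, not something measured on the RSX. The exponent of 2.0 and the temperature swings are assumptions picked purely for illustration, and `relative_cycle_life` is a hypothetical helper name.

```python
def relative_cycle_life(delta_t_ref: float, delta_t_new: float,
                        exponent: float = 2.0) -> float:
    """Return N_new / N_ref under a simplified Coffin-Manson model.

    Cycles to failure are taken as proportional to delta_T ** (-exponent).
    The default exponent is an assumed, illustrative value.
    """
    if delta_t_ref <= 0 or delta_t_new <= 0:
        raise ValueError("temperature swings must be positive")
    return (delta_t_ref / delta_t_new) ** exponent


if __name__ == "__main__":
    # Hypothetical swings in kelvin: before and after cleaning and repasting.
    before, after = 50.0, 35.0
    ratio = relative_cycle_life(before, after)
    print(f"Estimated cycle-life multiplier: {ratio:.2f}x")
```

The takeaway is only qualitative: under this kind of model, shrinking the swing per cycle buys a more than proportional gain in cycles to failure, which fits the advice to keep temperatures down.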
 
No. The CTE (coefficient of thermal expansion) of FR4 fiberglass (PCB and interposer material) is specifically formulated to match copper for this reason. All other materials chosen in the processor sandwich are formulated to match as closely as material science allows, from solder bump chemistry to underfill chemistry and even the thermal epoxy.

That's actually what's wrong with the 90nm RSX's underfill chemistry. Its CTE didn't match closely enough, which led to premature failures with thermal cycling.
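
To make the CTE argument concrete, here is a rough sketch based on the linear expansion formula ΔL = α · L · ΔT. The CTE figures are typical handbook values, not measurements from PS3 parts, and the span and temperature swing are made up for illustration. `CTE_PPM_PER_K` and `mismatch_um` are hypothetical names.

```python
# Typical handbook CTE values in ppm per kelvin (assumed, not measured on a PS3).
CTE_PPM_PER_K = {
    "copper": 17.0,
    "fr4_in_plane": 15.0,
    "silicon": 2.6,
}


def mismatch_um(material_a: str, material_b: str,
                length_mm: float, delta_t_k: float) -> float:
    """Difference in free thermal expansion between two materials, in micrometres.

    Uses delta_L = alpha * L * delta_T for each material and returns the
    absolute difference over the same span and temperature swing.
    """
    delta_alpha = abs(CTE_PPM_PER_K[material_a] - CTE_PPM_PER_K[material_b]) * 1e-6
    return delta_alpha * length_mm * 1000.0 * delta_t_k


if __name__ == "__main__":
    span_mm, swing_k = 40.0, 50.0  # hypothetical span and temperature swing
    for pair in (("copper", "fr4_in_plane"), ("silicon", "fr4_in_plane")):
        print(pair, f"{mismatch_um(*pair, span_mm, swing_k):.1f} um")
```

With these assumed numbers the copper-to-FR4 mismatch comes out several times smaller than the silicon-to-FR4 one, which is the general reason the materials between die and substrate matter so much.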
 
This is the post you're looking for. Check the links in my signature for much more.
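
If you end up logging readings like the ones in the list quoted earlier in the thread, a small script can help compare boards. This is only a sketch: treating those listed values as a known-good reference is an assumption (the thread doesn't say what state that board was in), the 20% tolerance is arbitrary, the "measured" values are made up, and `parse_ohms` and `within_tolerance` are hypothetical helper names.

```python
import re

# SI prefixes used in the thread's readings.
_MULTIPLIERS = {"": 1.0, "k": 1e3, "M": 1e6}


def parse_ohms(text: str) -> float:
    """Convert a reading such as '17.25kΩ' or '1.685 MΩ' to ohms."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([kM]?)\s*Ω\s*", text)
    if match is None:
        raise ValueError(f"unrecognised reading: {text!r}")
    value, prefix = match.groups()
    return float(value) * _MULTIPLIERS[prefix]


def within_tolerance(measured: float, reference: float,
                     tolerance: float = 0.2) -> bool:
    """True if measured is within +/- tolerance (a fraction) of reference."""
    return abs(measured - reference) <= tolerance * reference


# A few rails copied from the list in this thread, assumed here to be a reference.
REFERENCE = {"BE_VDDC": "1.6 Ω", "RSX_VDDC": "1.9Ω", "RSX_PLL": "325kΩ"}

if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    measured = {"BE_VDDC": "1.5 Ω", "RSX_VDDC": "0.2 Ω", "RSX_PLL": "330kΩ"}
    for rail, ref_text in REFERENCE.items():
        ok = within_tolerance(parse_ohms(measured[rail]), parse_ohms(ref_text))
        print(f"{rail}: {'ok' if ok else 'CHECK'}")
```

Rails that the list describes as fluctuating (5V_MISC, 5V_BD, 12V_BD) don't fit a single-number comparison like this, so they are left out.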
 
Hehehe, I see you ended up joining the dark side and making people feel bad about their perfectly working stuff.

I wonder when the "switch" happened because that's not the Felix I met.
I was away for a while. Was it based on some kind of scientific finding or revelation? Or is it now just the more convenient and fitting thing to say? Maybe I just missed something...

Yeah, you mention the "Bumpgate" and "underfill" stories as if they were some kind of breakthrough, but that story is nothing new. I'm sure you also knew about it before (mainly because I remember suggesting something like this to you and other people when simple BGA defects were being touted as the single predominant bad guy).
A quick search of my old posts containing the word "under" turned up stuff like this:
Screenshot_20220414-151206_1.png
(And at least I didn't really change my mind since then.)
Yes, it may be a thing. But why now, precisely?

I not only knew about that post a long time ago, I also know the guy who wrote it and why he did it. Piernov is a great guy, smarter than you and me put together, but none of that invalidates the stuff I wrote before.
Even if it directly applied to our RSX.
Which is normally not even that hot at all. Not necessarily. That's a myth, even in standard 90nm machines without maintenance.
Edit:
IMG_20210605_195658---80_1.jpg
This is just an old picture of a regular old machine that was never modified, as an example. And this is what normally happens. The RSX is not that hot at all, among other reasons because the hot CPU needs a delid and is pushing the fan higher. The RSX doesn't, so it is indirectly being cooled more than normal because of the design.

But sure, now that we have the new means to change those chips out, might as well turn them into a scapegoat and blame them more than is fair. Make the victory bigger by making the enemy seem worse after the fact. I'm getting Tokin replacement vibes.
 
That's without trying to get inside the chips and trying to know what's actually going on inside them.
The "Bumpgate" and "underfill" stories may sound fitting now, but that may be a lazy generalization too. It doesn't explain many of the common cases, like where the chip gets an internal and intermittent short, often between the data lines or to ground. Or when people clearly break some of the weak contacts between RSX and CPU by prying it open to delid, triggering a similar error. Or the screen artifacts that respond to board flex, often even appearing after physical damage. Or VRAM failures. Problems that we still see in the 40nm.

Just because some known weakness may be "a thing" doesn't tell you what's actually going on inside your particular chip.

And I didn't even mention reballing yet, but that's another important point, because only a confident reballer can find out if the problem is actually inside the particular chip or not, and get a real feeling for the percentage of the different kinds of failures.
Not by reading about something general and letting others decide if your chip may be OK or not, but precisely by reballing it and seeing.
(Or if it's extra genius technician like Victor, making RSX socket hehehe)

For example I wonder what @squeept may have to say about this because he is somebody that can say something with a long professional track record.
Maybe now his preferred course of action as a profit driven technician may have changed, but that doesn't change what the underlying problem may be. Also, what about the past?

The 90nm chips may not have been the best, and there surely may have been failures. Some immediate, others after a while. But what about the happy customers? Weren't there many too?
Do these "new discoveries" make them scammed somehow because they still have a "defective" chip? No. Many have surely enjoyed a working machine for a long time and they can still keep enjoying it too.

If all or most chips were actually failing anyway because of the "defect", then they'd be out of business very fast, not being able to give warranty no matter what. And I may be wrong, but I don't think this was completely the case.
 
Anyone offering an install for these in the states? I am fully ok with being a test dummy and should something happen I will not be upset. I have a few BC units so if it dies enjoy a parts system.
 
For everyone who has a CECHA and faces PS2 issues: can you reflow the CXD 9208, and if that doesn't do it, replace it, to make sure it is the cause? The chip responsible for broken PS2 playback is the one in the yellow circle, but I'm sure you already know that.
 

Attachment: cecha.jpg

We are offering the service, please join our discord and message us to speak about it.

https://discord.gg/prBfTpCZNr

Wonder if we can start a new thread for people offering the service?? Is anyone else offering the service besides me? Please tell me there is lol
 

So while I would love to start a topic about services, the admin will not allow advertising or sales here. It is more for everyone's safety, due to so many scammers. Not saying you are one; it's just that others ruin it for the majority. I will join the discord. I already have the chip and I have two BC models I would like done.
 
I can't really answer anything here in a straightforward manner. My experience is extremely biased because I already start with a subset of defective consoles. We have no way of knowing how many consoles are out there still in regular use that are working normally. Extremely long product life cycles are not normal these days, and we're waaaayyyyy out of the window that they would have considered when designing these old things anyway. The ones that died in 2007 were defective. The chips dying in 2022 after a lifetime of use are not.

Is there still a lot of guesswork and black magic in deciding whether a chip is healthy? Yes. Am I far more confident providing lengthy warranties with 65nm and 40nm swaps? Yes.

I don't know if I'd say that the 90nm RSX is outright defective, but let's put it this way: I'm 6 boards in since being able to do the swaps, and I haven't pulled a 90nm off a board yet that I felt confident putting back on. But if I pull a 90nm that has clear evidence of a BGA defect and the ohms look great, I will still reball it and slap a full year warranty on it without hesitation. Again, though, I must emphasize that using my results and experience to infer anything about overall reliability is a population fallacy.

Or to make my reply even more anecdotal and scatterbrained, many of my customers grill me with questions before purchase, and a common thread in those inquisitions is them mentioning how many backwards compatible systems they've churned through over the years.

edit: maybe instead of defective, we call it.... "not suitable for extended use beyond originally planned product life cycle"
 