PS3 Fault finding YLOD with the SYSCON - First steps and Error reporting

Amazing, thanks Bazylski!

I have the DIAG wire already soldered, I'll go on with the patching and let you know.

With my limited understanding of the Syscon, the way it's connected to the sensors is through IC1101 which is basically an AD converter which then talks to Syscon via SMBus - but the sensors are sending an analogue signal out of the CELL. If the actual sensors are faulty, I guess the sensor would be available, just reporting an incorrect temperature. But I may be totally mistaken here.

I'm very curious to see what the sensors say. Is there a way to have a real time reading while powering up? Unless I get a silly temperature when the system is idle...

Not sure I understand the last part of your message, would you like me to check those pads for you so you can compare?

Thanks again!
 
Hey, a bit late to the party maybe but after learning of the NEC/Tokin stuff and syscon developments I decided to get a CECHA01 off of eBay. Warranty seal was gone but it didn't look like it had been worked on (clean board, no warping.) Syscon reported 3034 and 2120 (I had not tried to boot it up at this point.)

Code:
>$ errlog
errlog
ofst[120]:err_code:0xffffffff, clock:0x295cdc55  2021/12/27 20:19:33
ofst[124]:err_code:0xa0403034, clock:0x295cdc6c  2021/12/27 20:19:56
ofst[  0]:err_code:0xa0403034, clock:0x295cdc73  2021/12/27 20:20:03
ofst[  4]:err_code:0xa0403034, clock:0x295cdc97  2021/12/27 20:20:39
ofst[  8]:err_code:0xa0403034, clock:0x295ceb0e  2021/12/27 21:22:22
ofst[ 12]:err_code:0xa0403034, clock:0x295ceb6b  2021/12/27 21:23:55
ofst[ 16]:err_code:0xa0403034, clock:0x29786dfa  2022/01/17 18:12:10
ofst[ 20]:err_code:0xa0403034, clock:0x29786e2b  2022/01/17 18:12:59
ofst[ 24]:err_code:0xa0403034, clock:0x2978731d  2022/01/17 18:34:05
ofst[ 28]:err_code:0xa0403034, clock:0x29787327  2022/01/17 18:34:15
ofst[ 32]:err_code:0xa0403034, clock:0x297874a9  2022/01/17 18:40:41
ofst[ 36]:err_code:0xa0403034, clock:0x29b72320  2022/03/06 07:45:36
ofst[ 40]:err_code:0xa0902120, clock:0x29b72320  2022/03/06 07:45:36
ofst[ 44]:err_code:0xa0403034, clock:0x29b72399  2022/03/06 07:47:37
ofst[ 48]:err_code:0xa0902120, clock:0x29b72399  2022/03/06 07:47:37
ofst[ 52]:err_code:0xa0403034, clock:0xffffffff
ofst[ 56]:err_code:0xa0902120, clock:0xffffffff
ofst[ 60]:err_code:0xa0403034, clock:0xffffffff
ofst[ 64]:err_code:0xa0902120, clock:0xffffffff
ofst[ 68]:err_code:0xa0403034, clock:0xffffffff
ofst[ 72]:err_code:0xa0902120, clock:0xffffffff
ofst[ 76]:err_code:0xa0403034, clock:0xffffffff
ofst[ 80]:err_code:0xa0403034, clock:0xffffffff
ofst[ 84]:err_code:0xa0403034, clock:0xffffffff
ofst[ 88]:err_code:0xa0902120, clock:0xffffffff
ofst[ 92]:err_code:0xa0403034, clock:0xffffffff
ofst[ 96]:err_code:0xa0902120, clock:0xffffffff
ofst[100]:err_code:0xa0403034, clock:0xffffffff
ofst[104]:err_code:0xa0902120, clock:0xffffffff
ofst[108]:err_code:0xa0403034, clock:0xffffffff
ofst[112]:err_code:0xa0403034, clock:0xffffffff
ofst[116]:err_code:0xa0902120, clock:0xffffffff
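
For anyone who wants to sort or filter a dump like the one above, here is a minimal Python sketch that parses `errlog` lines. It rests on two assumptions that are not confirmed anywhere in this thread: that the `clock` field counts seconds since 2000-01-01 00:00:00 (this reproduces the timestamps printed in the log), and that an `err_code` of `0xffffffff` marks an empty slot. The short code (3034, 2120) is simply the low 16 bits of `err_code`. The function name `parse_errlog` is made up for this example.

```python
import re
from datetime import datetime, timedelta

# Assumption: clock = seconds since 2000-01-01 00:00:00.
# This matches the dates the SYSCON prints next to each entry above.
EPOCH = datetime(2000, 1, 1)

LINE_RE = re.compile(
    r"ofst\[\s*(\d+)\]:err_code:0x([0-9a-fA-F]{8}), clock:0x([0-9a-fA-F]{8})"
)

def parse_errlog(text):
    """Return a list of dicts, one per non-empty errlog entry."""
    entries = []
    for m in LINE_RE.finditer(text):
        offset = int(m.group(1))
        code = int(m.group(2), 16)
        clock = int(m.group(3), 16)
        if code == 0xFFFFFFFF:
            continue  # assumption: all-ones error code means an unused slot
        # An all-ones clock has no date printed in the log, so report None.
        when = None if clock == 0xFFFFFFFF else EPOCH + timedelta(seconds=clock)
        entries.append({
            "offset": offset,
            "code": code,
            "short": f"{code & 0xFFFF:04x}",  # e.g. "3034" or "2120"
            "when": when,
        })
    return entries

sample = """\
ofst[120]:err_code:0xffffffff, clock:0x295cdc55  2021/12/27 20:19:33
ofst[124]:err_code:0xa0403034, clock:0x295cdc6c  2021/12/27 20:19:56
ofst[ 40]:err_code:0xa0902120, clock:0x29b72320  2022/03/06 07:45:36
ofst[ 52]:err_code:0xa0403034, clock:0xffffffff
"""

entries = parse_errlog(sample)
for e in entries:
    print(e["offset"], e["short"], e["when"])
```

Entries whose clock is `0xffffffff` come back with `when` set to `None`, so they can be kept apart from the dated ones.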

To test the theory that 2120 is reported when there's an RSX failure and HDMI is in use, I tried AV cables on my first bringup. Syscon only reported 3034.

Code:
>$ bringup
bringup
[SSM] state: 0000 -> 0101
Bringup Mode #0 (0xFF)
[SSM] ssmCb_OnStartingBePowOn() called.
[SSM] Bringup mode : syspm_stat=00000000/00000000
[POWSEQ] PowerSeq_Setup called.
[SSM] state: 0101 -> 0201
[POWSEQ] AV Backend Setup
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0202
[SSM] state: 0202 -> 0103
[SSM] state: 0103 -> 0203
[SSM] ssmCb_BeforeBeOn() called.
[SSM] state: 0203 -> 0104
Psbd_SbTransMode_Half:0x20e2
>$ shutdown
[POWERSEQ] Error : BitTraining RSX:RRAC:RX3:GLOBAL1:RX_STATUS
[SSM] state: 0104 -> 0304
[SSM] ssmCb_AfterBeOn2() called.
[SSM] PowSeq Fail : Detected !
[SSM] state: 0304 -> 0700
[POWSEQ] AV Backend Letup
[SSM] Shutdown mode : syspm_stat=00000000/00000000
[ERROR]: 0xa0403034
[POWSEQ] PowerSeq_Letup called.
[SSM] state: 0700 -> 0600
(PowerOff State) (Fatal)
shutdown
[SSM] state: 0600 -> 0000
[SSM] Error state is cleared.
(PowerOff State)
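
A rough Python sketch for pulling the interesting parts out of a captured `bringup` log like the one above: the `[SSM]` state transitions, any line containing `Error :`, and the final `[ERROR]` code. The patterns are based only on the output shown in this post, so other firmware versions or SYSCON models may print things differently. `summarize_bringup` is a made-up name for this example.

```python
import re

STATE_RE = re.compile(r"\[SSM\] state: ([0-9a-fA-F]{4}) -> ([0-9a-fA-F]{4})")
ERROR_RE = re.compile(r"\[ERROR\]: 0x([0-9a-fA-F]{8})")

def summarize_bringup(log):
    """Return (state transitions, error messages, last error code or None)."""
    states, messages, code = [], [], None
    for line in log.splitlines():
        m = STATE_RE.search(line)
        if m:
            states.append((m.group(1), m.group(2)))
            continue
        m = ERROR_RE.search(line)
        if m:
            code = int(m.group(1), 16)
            continue
        if "Error :" in line:
            # e.g. the [POWERSEQ] BitTraining line in the log above
            messages.append(line.strip())
    return states, messages, code

sample = """\
[SSM] state: 0203 -> 0104
Psbd_SbTransMode_Half:0x20e2
[POWERSEQ] Error : BitTraining RSX:RRAC:RX3:GLOBAL1:RX_STATUS
[SSM] state: 0104 -> 0304
[SSM] PowSeq Fail : Detected !
[SSM] state: 0304 -> 0700
[ERROR]: 0xa0403034
[SSM] state: 0700 -> 0600
[SSM] Error state is cleared.
"""

states, messages, code = summarize_bringup(sample)
print(states)
print(messages)
print(hex(code) if code is not None else None)
```

The transition just before the first error message (here `0203 -> 0104`) shows how far the power sequence got before it failed.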

So 3034 means either BGA or bumps... Problem is I don't have a BGA rework station and was totally hoping this console wouldn't need a new RSX. Tried "pressure testing" to no avail. Got to reading about bumpgate and all that, and saw Louis Rossmann's video where he says to heat the chip to 120C-150C for 5 mins and it'll probably work because it's usually the bumps and not the BGA.

No rework station, but I've got a 3D printer. So I took the bed off, insulated it, and used the bed to preheat the board (90C on the bed), attached the print nozzle thermal probe to the RSX with some kapton tape, and after 30 mins or so of preheating (topside temp was about 42C) I took my hot air station gun and heated the RSX to 120C-135C for 5 minutes. I then let it cool down for about 20 minutes.

To my amazement, the console booted right up! No errors, 'bringup' and 'shutdown' commands are clean. https://i.bcow.xyz/cg62MJT.jpg

Bad news is this is obviously not a fix... I'm going to build/buy a proper rework station and try a frankenstein mod now I think, since this RSX chip is obviously on its way out. This console has almost 1k hours of playtime! I have a few dead boards (non-PS3) to practice with first.

Code:
>$ becount
becount
Bringup : 1188 times
Shutdown: 673 times
Power-on: 40day 16hour 25min 58sec
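
As a quick sanity check on the "almost 1k hours" figure, the `Power-on` line from `becount` can be converted to hours. This is only a small Python sketch; the regular expression assumes the exact `40day 16hour 25min 58sec` layout shown above, and `power_on_hours` is a made-up helper name.

```python
import re

def power_on_hours(text):
    """Convert a becount 'Power-on:' line to total hours as a float."""
    m = re.search(r"Power-on:\s*(\d+)day\s+(\d+)hour\s+(\d+)min\s+(\d+)sec", text)
    if m is None:
        raise ValueError("no Power-on line found")
    days, hours, minutes, seconds = map(int, m.groups())
    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return total_seconds / 3600

hours = power_on_hours("Power-on: 40day 16hour 25min 58sec")
print(round(hours, 1))  # 976.4
```

Note that this is power-on time as counted by the SYSCON, which is not necessarily the same thing as time spent in games.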

Thanks to everyone who contributed to this thread thus far, it's been invaluable.
 
Work for days - that's what I like! :biggrin: I never actually thought about the TOKINs - they're shorted! A bit of a backstory on this console - I got it non-working (YLOD) from a person online, it had a massive flux puddle under the CELL/BE and none under the RSX, yet it threw these errors: So I thought, if whoever was in there previously reflowed the CELL and it's still doing the 4421 YLOD, that has to be the RSX. I reflowed the RSX with the tools I described earlier ITT, it worked for approximately 2 months and then it failed on my birthday, in a game of GT6, with minor artifacting prior to the fail. Knowing how it had RSX issues previously, I decided to ask a friend of a friend to replace the RSX with a 40 nm one, presumably eliminating the failed component. I should've done a SYSCON diag instead of guessing. The TOKINs are intact but I guess it's time for them to go, at least on one side. I have enough dead Slims to salvage some tantalums, not to mention the training DYN-001.

Does a shorted TOKIN always mean a dead TOKIN? I'd rather keep them in, but if it's absolutely necessary, they're going out.
A couple of things.
  1. Your description of the original shape this console was in is disturbing! 3034/4xxx errors can be either the CPU or RSX. Normally it's the RSX. Like 99% of the time. However, the fact there was a 1200 (cpu overheat) associated with each event in your log makes me think CPU. And so does the puddle of flux underneath it.
    • Were those errors generated before or after the CPU was reflowed? Because if they were before and the previous owner reflowed the CELL, there's a chance the reflow was successful. But if it was after, then I suspect the CPU reflow was unsuccessful, and the only reason your RSX reflow worked is a thermomechanical reconnection due to warping stress, which relaxed, so the problem will come (or did come) back.
  2. Just because you read a short at the Tokins doesn't mean the short is caused by the Tokins. Once you remove them the short may still be present. That means the Tokins were fine. From there the short is either in/under the RSX, or on the VRM side leading up to the Tokins. The way to know which it is, is by probing the processor-side + rail (output side) or the VRM-side + rail (input side) when all the Tokins are removed. Then you'll know where the short is.
 
Totally inexperienced here, but my incredibly limited experience with a PS3 phat is that some resistance measurements will read as "short" (particularly if you test in continuity mode), as I believe some parts normally read around 4 ohms, which will make your multimeter beep when probed.
Just my 2p contribution. I totally thought my PS3 was shorted too, then I googled and found evidence that those very low values were totally normal.
 
...the previous owner made a large hole on the bottom case to add a computer fan. PSU was also dead (probably unrelated, I guess the owner just got rid of a few faulty parts!).

Actually, the PSU will overheat and die if the airflow pattern within the case is disrupted. You cannot simply cut a hole in the bottom of the case to reduce CPU/RSX temps. Otherwise the PSU, RF shield and components connected to it via thermal pads will all lose the airflow that cools them.

About your CELL overheat issue. Don't get ahead of yourself! Delid the CELL and repaste to rule that out before you get too far down the troubleshooting path. I can tell you from experience that the CPU can overheat in about 3s without a HS attached. And without good contact from paste on the die it's possible for it to overheat that quickly! So do not be too quick to dismiss the simplest explanation just because you assume it would take longer to overheat!
 
It was indeed a silly thing to do!

I'm not dismissing the IHS but I'd like to take a scientific approach. If the sensor is not available or if it reads 200 degrees when the PS3 is switched off, then my problem lies elsewhere!

re. The PSU frying because of temperature, it's a possibility. I just feel it's unlikely that the PS3 developed a faulty PSU and a thermal issue at the same time - that's why I suspect the seller swapped the PSU with a faulty one before giving it to me. I don't feel scammed, for £5 it's what I was expecting and I purchased this PS3 to have fun repairing it.

but thanks for your input anyways, I'll keep that in mind for sure.
 
... Got to reading about bumpgate and all that, and saw Louis Rossmann's video where he says to heat the chip to 120C-150C for 5 mins and it'll probably work because it's usually the bumps and not the BGA.
That's called a "heat test," which works by thermomechanical reconnection of the solder joint by warping stress. It's just pent-up strain that pushes the broken connection back together temporarily. It relaxes and breaks again soon after. If it works, it means either the BGA or the bumps are bad. It doesn't get hot enough to reflow either the balls or the bumps, which were SAC (tin-silver-copper) lead-free chemistries requiring temps of 210-218C to reflow. It doesn't selectively rule out the BGA like he implied.
 
I have left you a message on your YouTube post. I have had this exact issue happen to me; in my case it was my own fault, as I did a CPU de-lid with liquid metal and didn't add any to the underside of the heat spreader. This meant the CPU die made no contact with the heat spreader and it overheated immediately (just like yours did in your YouTube video). So it's feasible that the CPU die has lost contact with the thermal paste and heat spreader, which causes this instant overheat. When I added LM to the underside of the HS and reassembled, it booted fine. So you may need a CPU de-lid and a new application of thermal paste to the CPU die, which could potentially fix your issue.

Sent from my SM-G988B using Tapatalk
 
A couple of things.
  1. Your description of the original shape this console was in is disturbing! 3034/4xxx errors can be either the CPU or RSX. Normally it's the RSX. Like 99% of the time. However, the fact there was a 1200 (cpu overheat) associated with each event in your log makes me think CPU. And so does the puddle of flux underneath it.
    • Were those errors generated before or after the CPU was reflowed?

The errors were before the CPU was reflowed. The fact that there is a short on the RSX makes me believe that the CPU is fine now. The resistance on its TOKIN, CPU side is about 2,5 Ohm. I will resolder the coils and shunts, remove the TOKINs on side A of the RSX part, solder in second-hand tantalums (I don't feel like buying fresh ones because I have a strong gut feeling that this board won't boot ever again), or even better - the store I got the RSX from offers a warranty as long as you send them the whole board for X-ray verification of installation quality. I will remove the TOKINs and if the short on RSX side stays there, I'll send it to the shop for a warranty replacement of the RSX.
If this one fails I'll move all of my data to my still-working-but-having-occasional-red-lines CECHG, and if that fails - I'm giving up on PS3s... I wasn't expecting them to have such a high failure rate after extensive preventative maintenance (delids and all that). Emulation feels like a viable option nowadays.
 
Definitely worth a check. Obviously a CPU de-lid is not easy, so if you haven't done one before, get as much advice as you can before attempting it.

 
Absolutely, I've watched many videos already. I won't use a kitchen knife!
Awww, I thought the Cell was the easy one to delid but no, it's the one with the silicone all around.

For that to be honest I was thinking of using floss string. Should be safer?

I'll gather as much info I can first - then I'll decide what to do.
 
Hi, I tried with fine wire and it wasn't for me. I was fortunate enough to have some motherboards I was using for parts so I practiced on these. It's all about feel and getting used to the resistance of the silicone.

I used a painter's knife and I used heat to warm up the heat spreader. In the recording below you'll notice I am applying pressure with the hand I'm not using to delid with.


 
Thanks both.
I did see NSC tool and technique and also videos about using a painting knife.
I'll keep those in mind. thanks for your help! I'll keep you posted with the outcome (actually I'm making a video for my channel with the whole process so hopefully I can share it with you at the end!)
 
That's great, post a link to it when you're done. It will be interesting to watch, just remember to go slow and take your time. The floss method takes a long time to get through the silicone, but it's probably the safest option if you can't practice on some junk boards.

 
Yes I suspect floss is not the most effective way but, as you say, maybe the safest way for someone without practice.

I'm very curious to see what the temperature sensors read. I found the list of available syscon commands and I'll have fun with them.

Is there a way to see something like the maximum temperature reported or a real time temperature reading? I assume that if the sensors are faulty, the temp reading will be off when in standby as well but it may be useful to see a real time temp reading when the PS3 is attempting to power up.
Thx
 
Is there a way to see something like the maximum temperature reported or a real time temperature reading? I assume that if the sensors are faulty, the temp reading will be off when in standby as well but it may be useful to see a real time temp reading when the PS3 is attempting to power up.
Thx
The value is real-time at the time of issuing the command. As for real-time temperature logging, I don't think there is such a thing in the SYSCON. Your best bet would be typing it really fast as soon as you issue "bringup". The best gauge of the maximum temperature the console has experienced would be the HDD SMART data, which does contain information on the maximum temperature the HDD has sustained.

Edit as not to doublepost: I've removed 3 out of 4 TOKINs on my COK-002, the + and GND rails on TOKIN pads still show 0,4 Ohm resistance, smells like fried RSX to me. I've contacted the shop and they have a 100% testing before shipping policy, I guess I could try with another RSX a bit later, till then I'll switch over to emulation and copy my saves over. I might try booting the console up in this state to see if it throws a different error - if not, I'll have to put it away :/
 
If the value is real-time then it's ok thanks.

I believe I can use the up arrow to recall commands while running the python script so it would take a moment to recall the temperature command. Even by hand it shouldn't be a big deal.
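
Instead of recalling the command by hand, the repetition could be scripted. Below is a minimal Python sketch of a polling loop, not tied to any particular SYSCON script: `send` stands for whatever function your script already uses to send one command and return the reply (hypothetical here), and the command string is left for you to fill in, since the exact temperature command is not named in this thread. The demo uses a fake `send` so it runs without a console attached.

```python
import time
from datetime import datetime

def poll_command(send, command, count=10, interval=0.5, sleep=time.sleep):
    """Send `command` `count` times and return a list of (timestamp, reply)."""
    readings = []
    for i in range(count):
        readings.append((datetime.now(), send(command)))
        if i < count - 1:
            sleep(interval)  # wait between polls, but not after the last one
    return readings

# Demo with a stand-in for the real serial link (hypothetical).
def fake_send(command):
    return "reply to " + command

readings = poll_command(fake_send, "<temperature command>", count=3, interval=0)
for stamp, reply in readings:
    print(stamp.strftime("%H:%M:%S.%f"), reply)
```

Started right after issuing `bringup`, a loop like this would give a timestamped series of readings during the power-up attempt, assuming the SYSCON still answers commands at that point.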

good idea about checking what that poor HDD had to go through
 
