PS3 Fault finding YLOD with the SYSCON - First steps and Error reporting

VER-001

console starts and shuts down immediately.
Fan spinning like crazy. YLOD/RLOD.
Possibly overheating.

Errlog:
# A0802124 FFFFFFFF
# A0802124 FFFFFFFF
# A0801002 FFFFFFFF

CPU + RSX delid
New TC under IHS

Retest...
Fan is normalized.
Still instant YLOD/RLOD.

Errlog:
# A0802124 FFFFFFFF
# A0802124 FFFFFFFF
# A0801002 FFFFFFFF

HDMI IC removal...

Reading syscon errlog while HDMI IC is removed from mobo.

Errlog:
# A0802024 FFFFFFFF
# A0802024 FFFFFFFF
# A0802124 FFFFFFFF
# A0802024 FFFFFFFF
# A0112124 FFFFFFFF
# A0112124 FFFFFFFF
# A0112124 FFFFFFFF
[...]

Soldered in another HDMI IC from a donor board and read the syscon errlog again.

The console stayed on (GLOD) for a long time before shutting down.
Time from bringup to poweroff -> 30 sec

Errlog:
# A0802124 FFFFFFFF
# A0802124 FFFFFFFF
# A0801002 FFFFFFFF

Assumption -> FAULTY HDMI IC is blocking startup procedure and may cause instant YLOD/RLOD.


Then I waited some time to get a decent oscilloscope for checking the NEC/Tokin waveform.

I finally got my hands on a RIGOL oscilloscope.
I measured the CPU VDDC during the bringup sequence.
The ripple amplitude is 170mV... which is way above the 50mV limit.
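
As a rough illustration of the check above, peak-to-peak amplitude can be computed from exported scope samples and compared against the 50mV figure. This is only a sketch: the function names and sample values are made up for illustration, and only the 170mV reading and the 50mV limit come from the measurement above.

```python
RIPPLE_LIMIT_MV = 50.0  # limit quoted in the post above


def peak_to_peak_mv(samples_v):
    """Return the peak-to-peak amplitude in millivolts for samples in volts."""
    if not samples_v:
        raise ValueError("no samples")
    return (max(samples_v) - min(samples_v)) * 1000.0


def ripple_ok(samples_v, limit_mv=RIPPLE_LIMIT_MV):
    """True when the peak-to-peak amplitude does not exceed the limit."""
    return peak_to_peak_mv(samples_v) <= limit_mv


if __name__ == "__main__":
    # Hypothetical samples chosen to give a 170 mV swing (not real capture data).
    bad_rail = [1.115, 1.200, 1.030, 1.180, 1.060]
    print(round(peak_to_peak_mv(bad_rail)), "mV, ok:", ripple_ok(bad_rail))
```

The same helper can be pointed at a CSV export from the scope, as long as the samples are converted to volts first.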

RSX tokins had a bad waveform (page 109)

View attachment 35739

Some measurements:

Probes resistance: 0.3 Ohm
CPU NEC-GND resistance: 5 Ohm
RSX NEC-GND resistance: 3.5 Ohm



I replaced one RSX NEC/Tokin with 4x 470uF tantalum caps.
View attachment 35741


RSX Waveform was normalized (page 110)
View attachment 35740


Console stays on green light and does not shutdown.


Errlog:
clean from errors

No output on HDMI port.
There is output on Component port.

Attached HDD and BD logic board.
Installed OFW 4.86

Console now works, but only on component port.

View attachment 35742

I replaced the hdmi port, as it was internally damaged.
Still no luck with hdmi output.
5V is present on pin 1.

HDMI filters are all checked and ok.

hdmi vbs command returns 0000000 (I assume that it is a checksum of all checks, and it means no errors).


I replaced the HDMI IC back to original one...

>$ errlog
00000000
# CODE CLOCK
# A0112024 FFFFFFFF
# A0112024 FFFFFFFF
# A0112024 FFFFFFFF

So with the original Panasonic IC
it is a constant GLOD (stays on, does not shut down, does not go into the service menu, and gives no output on either HDMI or component).

So the original HDMI IC is 100% faulty.
What's the difference between A080 2024 and A011 2024?
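
To compare logs like the ones above more easily, a small parser can split each entry the same way the question does (A080 / 2024). This is a minimal sketch: the field names `prefix`, `error` and `clock` are my own labels, the 4+4 split just mirrors how the codes are written in the question, and what the prefix actually means is not established in this thread.

```python
import re
from collections import Counter

# One errlog entry: optional "#", an 8-hex-digit code, an 8-hex-digit clock value.
ENTRY_RE = re.compile(r"^#?\s*([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s*$")


def parse_errlog(text):
    """Return a list of dicts for every line that looks like 'CODE CLOCK'."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if not match:
            continue  # skips headers such as "# CODE CLOCK" and the bare "00000000"
        code, clock = match.group(1).upper(), match.group(2).upper()
        entries.append({"prefix": code[:4], "error": code[4:], "clock": clock})
    return entries


def summarize(entries):
    """Count how often each (prefix, error) pair appears."""
    return Counter((e["prefix"], e["error"]) for e in entries)


if __name__ == "__main__":
    log = """>$ errlog
00000000
# CODE CLOCK
# A0112024 FFFFFFFF
# A0112024 FFFFFFFF
# A0112024 FFFFFFFF"""
    for (prefix, error), count in summarize(parse_errlog(log)).items():
        print(prefix, error, "x", count)
```

Grouping by the last four digits makes it easy to see that A0802024 and A0112024 share the same error part and differ only in the prefix.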

I swapped the HDMI IC for the one taken from the donor board.
-> Errlog clear from errors. Console starts.



Things to do:
1. I ordered the HDMI port tester to rule out connectivity issues, and I will do that before attempting the reballing process.
View attachment 35743

2. I have a simple endoscope camera, but it is not good enough to inspect the BGA under the RSX.
The endoscope is 5.5mm...
In any case, it would be this corner and this side that I would need to inspect, right?
View attachment 35744

Attempting BGA inspection with microscope...

3. Any other ideas what else can be checked before reballing?

#-------------------------------------
Reballing...
I will need to remove the tantalum caps before reballing, as they will most likely fall off the board during the process.

@botakompong seemed to think that a 2024/2124 like this can be "damage to CPU/RSX/SB connection on the Redwood Rambus."
@victor, it seems I made a mistake in the analysis. Errors 2024 and 2124 are damage in the Cell BE connection area with the RSX or the connection area with the sub CXD9963 (final rule3), not related to the Cell BE RAM. I corrected the damage by replacing 1 set of Cell BE, and now the PS3 is normal. Unfortunately I didn't record the clk pulse; it's rare to find a PS3 with such damage. Now I can say RSX RAM and Cell BE RAM are not included in the syscon error.
Someone can correct me if I'm wrong.
He replaced the married chipset (CPU, NOR, SYSCON, etc.) in a console with 2024/2124, proving that the CPU BGA was at fault, not the HDMI chips. It's that special GLOD BS victor and I have been going over. Seems to be common in Slim/SS models. Not sure if it is in mullion models. They do have 2021/2121 errors that could be the same thing. Still working on this theory.

So it's possible that if the RSX reball fails, the CPU is at fault. You might want to reball both, just to rule out bad joints. If it persists, then it could be a bad RSX (replace). If that's not it, bad CPU (end of the line). Theory and supposition here. Take it with a grain of salt. @vyktormvmpay25 might want to chime in too, since he has much more experience with this.

EDIT: actually based on his responses, I think he's been trying to articulate this all along and I'm just now realizing what he meant. Difficult to describe this problem and there's a nasty language barrier.
 
Yes, if 2024 isn't sorted by exchanging the AV/HDMI IC, then the problem moves to the RSX or, worst case, to the Cell. In my SS 2024 case I jumped straight to exchanging the AV and HDMI ICs and the unit started. Before that it was like a 2-second YLOD. I've only had one of those.
 
I think this was correct all along:
Your method can distinguish minor differences in the bootup sequence that the SYSCON error codes don't necessarily capture. Mainly because it doesn't generate a code for GLOD situations.

However, if you use the bringup command to start the console, it will show the power sequence in the log. If an error occurs before the bootloader starts, it should show in that log. This works in mullion SYSCONs with internal access, at least. Here is what a COK-001 power sequence should normally look like:
Code:
[SSM] state: 0000 -> 0101
Bringup Mode #0 (0xFF)
[SSM] ssmCb_OnStartingBePowOn() called.
[SSM] First Boot.
[SSM] Bringup mode : syspm_stat=00000000/00000000
[POWSEQ] PowerSeq_Setup called.
[SSM] state: 0101 -> 0201
[POWSEQ] AV Backend Setup
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0202
[SSM] state: 0202 -> 0103
[SSM] state: 0103 -> 0203
[SSM] ssmCb_BeforeBeOn() called.
[SSM] state: 0203 -> 0104
Psbd_SbTransMode_Half:0x20e2
[SSM] state: 0104 -> 0204
[SSM] state: 0204 -> 0105
[SSM] state: 0105 -> 0400
(PowerOn State)
[SERV NVS] READ CMD
If there is an error, it should generate a YLOD and say where in the PowerSeq it failed, with an error code. If everything is normal, the PowerSeq only takes about half a second to complete. If the tokins are going out or there is some kind of issue with the power, it can take longer, which can clue you in to a possible error. However, in the case of a GLOD it often stalls out in the bootloader, which attempts to load just after the PowerSeq. I've seen a bunch of errors and retries in the log.
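
If you capture the bringup output to a text file, the state transitions can be checked automatically. The sketch below only relies on the `[SSM] state: XXXX -> YYYY` lines and the final 0400 state shown in the log above; the function names are hypothetical, and treating 0400 as the only successful end state is an assumption based on that single COK-001 example.

```python
import re

STATE_RE = re.compile(r"\[SSM\] state: ([0-9A-Fa-f]{4}) -> ([0-9A-Fa-f]{4})")
POWER_ON_STATE = "0400"  # followed by "(PowerOn State)" in the example log


def ssm_transitions(log_text):
    """Return a list of (from_state, to_state) tuples in log order."""
    return STATE_RE.findall(log_text)


def check_power_sequence(log_text):
    """Return (ok, message) describing how far the power sequence got."""
    steps = ssm_transitions(log_text)
    if not steps:
        return False, "no SSM state transitions found"
    # Each transition should start from the state the previous one ended in.
    for (_, prev_to), (next_from, _) in zip(steps, steps[1:]):
        if prev_to != next_from:
            return False, f"gap between {prev_to} and {next_from}"
    last_state = steps[-1][1]
    if last_state != POWER_ON_STATE:
        return False, f"sequence stopped at state {last_state}"
    return True, "reached PowerOn state"


if __name__ == "__main__":
    sample = "[SSM] state: 0000 -> 0101\n[SSM] state: 0101 -> 0201\n"
    print(check_power_sequence(sample))
```

A log that stops short of 0400 at least tells you which transition was the last one completed, which is useful when comparing a good board against a faulty one.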

I also want to circle back to something I noticed before that hasn't seemed to gain traction: the Bittraining error that can be seen in the lasterrlog. Here's an example: View attachment 33782
This was PS3#4, a COK-001 that only had a 1.5s YLOD and a single 3034 error. I traded this board to @squeept for a motherboard that had bad tokins. He reballed the RSX, but it changed to a GLOD with syscon errors 1001. He attempted the first frankenstein mod on it.

The reason I bring it up is that if you look at the bittraining error, it says the problem is in the "BE:RRAC:RX2:GLOBAL1:RS_STATUS." BE is the Cell processor, and RRAC I recognize from the "RRAC VDDIO Bypassing" section of the schematic. I came across it last night when I was trying to figure out which BE voltage @botakompong was referring to as C3. It's the +1.2v_YC_RC_VDDIO. If you search the CELL pinout wiki for RX2 you can find the pin coordinates for the RC_VDDIO (AD37-41 & AC34-36). It's literally telling you where the signal degraded! If you measure the voltage at C3 and it's good, then reball the CPU to fix it! I totally missed that. View attachment 33786
This explains @squeept's failure to fix this console. He was focusing on the RSX when it was a Cell reball that was needed! And any subsequent frankenstein mod was doomed unless this problem was fixed first!

At the time I thought this analysis was wrong, because the RSX can cause FlexIO errors that register as BE bittraining errors. But now we know that Bittraining errors don't always occur with damage to the FlexIO interface. Perhaps they only appear when they affect the SPI data line between RSX/CELL. If they affect the SB/CELL or RSX/VDDIO they manifest as a GLOD.
  1. Normal GLOD (RSX <--> VDDIO) = BGA/Bump defects that don't register a YLOD or syscon error during the Power On Sequence, but can register 2020/2120 or 2022/2122 errors (2024/2124 in Slim/SS models). First step is to rule out HDMI/DVE failures. Check TH2510/TH2401 and related SMDs. Verify that the voltages are good. If so, reball the RSX to rule out BGA. If that doesn't work, replace it with a known good chip to rule out Bumps. At this point the RSX side is good.
  2. Special GLOD (CPU <--> SB) = BGA/Bump defects that don't register a YLOD or syscon error during the Power On Sequence, but can register 2020/2120 or 2022/2122 (2024/2124 in Slim/SS models) errors. Double check CPU voltages and nearby SMDs. Flash NAND/NOR to rule out FW corruption. If that fails, reball the CPU to rule out BGA. If that fails, GAME OVER. Well...you can swap the entire chipset from a donor board if you feel up to it. You'd need a board that's borked in some way that doesn't affect CPU, SYSCON, NAND/NOR, etc. to harvest the married components. Or I suppose you could try to marry a CPU.
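
The two procedures above can be written down as ordered checklists, which makes it harder to skip a step when working through several boards. This is just a sketch of the list as I read it: the step wording is paraphrased from the two items, and the names `CHECKLISTS` and `next_step` are made up for illustration.

```python
# Ordered troubleshooting steps, paraphrased from the two GLOD cases above.
CHECKLISTS = {
    "normal": [  # RSX side
        "Rule out HDMI/DVE failures (check TH2510/TH2401 and related SMDs)",
        "Verify voltages",
        "Reball RSX to rule out BGA",
        "Replace RSX with a known good chip to rule out bumps",
    ],
    "special": [  # CPU <--> SB side
        "Double check CPU voltages and nearby SMDs",
        "Flash NAND/NOR to rule out FW corruption",
        "Reball CPU to rule out BGA",
        "Swap the married chipset from a donor board",
    ],
}


def next_step(kind, steps_done):
    """Return the next step for the given GLOD kind, or None when the list is exhausted."""
    steps = CHECKLISTS[kind]
    if steps_done < 0:
        raise ValueError("steps_done must be >= 0")
    return steps[steps_done] if steps_done < len(steps) else None


if __name__ == "__main__":
    print(next_step("normal", 0))
    print(next_step("special", 2))
```

Once the "normal" list is exhausted without a fix, the RSX side is considered good and the "special" list is the place to continue.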
 
I'd call it a normal GLOD when you simply don't get an image but, at some point, you know what the cause is. At least, that's what I understand everyone meant by a green light on startup with no image. There are cases where both the A/V and HDMI chips are damaged and the console still runs, the HDD still reads, the BD drive still reads discs, etc., but there are cases where none of this happens. I don't want to claim more than I know, but I'd like to share a couple of symptoms I've found during my journey with these awesome consoles. I'm new to the Syscon method at the moment; I really don't have the time or energy to start reading, page by page, all the awesome content you've shared. It's a lot of material and I don't want to read it in pieces. So I can't share any of that data with you, but...

Well, something that I mentioned to @vyktormvmpay25 is a couple of GLODs I faced.

* GLOD with a shorted MLC cap next to the Panasonic chip (the chip was damaged), in an SS. This GLOD prevented the CELL from booting; it was cold as winter. In this case the GLOD was infinite. As a note, the A/V chip was damaged too.
* GLOD lasting 39 seconds (the time depends on the case/mobo, I guess) until the console turns off. This is the one I haven't been able to fix yet. I asked Vyk if it could be the SB; the NOR is good and the dump looks good too. But I have seen some semi-bricks doing the same thing, so I'm not sure yet.
* GLOD similar to the first one. In this case I can't confirm whether the CELL warms up or not. It happens when you cut some traces that don't trigger a YLOD. Obviously this could be the same as a CELL with bad BGA contact, or maybe not.

But as you state, a real GLOD has to do with the CELL, unless something is damaged and scrambling the initial boot, as in the case of the HDMI scaler.
 
OK, finished porting all the parts. Same issue on a quick test; I'm going to record a short video.
Yes, it's because of the Cell. No fatal errors through all this board crossing from sur001 to jsd.
Up to now everything was fine with only an RSX exchange.
I wanted to install Linux, but that needs to wait.
http://s.go.ro/bk2a9o2x
 
The NEW HDMI IC just arrived, and I soldered it in.
Same issue.

errlog is clear from errors.
No output on hdmi port.
Console works on Component port.


NEXT.... RSX reballing... doh...
 
I received two packages today from forum users. We will see what we can get sorted, and later there will be more Frankenstein units and hopefully more gaming tests.
Edit
@RIP-Felix got something nice to test.
73bc132eb81d9e6b3f37d279742b29d7.jpg
 
You could ohm test the RSX to be sure it's good. VDDC to GND should be 2-4 ohms. You should check the resistance of all the other voltage rails too, but IDK what good separation is. Probably anything over 2 ohms. IDK about the high range though, so you might miss an open fault. And of course, any one pad could have dead diodes internally, and you can't be expected to check every one of them. So it's more of a qualitative test than a quantitative one.
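
As a quick sanity check when logging several boards, the 2-4 ohm rule of thumb above can be turned into a tiny helper. This is only a sketch: the function name and labels are made up, the optional probe-lead subtraction is my own assumption (lead resistance was measured separately earlier in the thread), and a reading above 4 ohms is reported as "above range" rather than bad, since the upper limit is not known.

```python
def classify_vddc(measured_ohm, probe_ohm=0.0):
    """Classify a VDDC-to-GND reading against the 2-4 ohm rule of thumb.

    probe_ohm is the resistance of the meter leads, subtracted from the
    reading (assumption: the leads were measured shorted together).
    """
    resistance = measured_ohm - probe_ohm
    if resistance < 0:
        raise ValueError("probe resistance larger than the reading")
    if resistance < 2.0:
        return "low"
    if resistance <= 4.0:
        return "in range"
    return "above range"


if __name__ == "__main__":
    for reading in (0.3, 1.8, 3.5, 5.0):
        print(reading, "ohm ->", classify_vddc(reading))
```

It does not replace judgement: as noted above, this is a qualitative test, and a single pad with a dead diode can still hide behind a healthy-looking reading.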
 
Yes, the VDD line. Good that more people are getting into reballing.
3.5 ohms for 65nm is the best working condition, and 5 ohms is fine as well. I will add some pics of what to check before the reball to understand if everything went fine. I'm pretty sure you will get it working, as 65nm is easy, unlike 90nm, most of which are dead.
Edit
I usually pick one point on each colour of power line, note each resistance, and use it as a reference for 65nm if the unit starts. Take note of the RAM type; I can't remember if they used 2 different models of RAM or Samsung only.
e4995914c310ef56ad80d8e0a127af00.jpg
 

AQUAMARINE 25 Ohm
BLUE 0355 MOhm
RED 2.5 Ohm
HARLEQUIN 0.63 MOhm
GREEN 0.53 kOhm
YELLOW 0.264 MOhm
YELLOW/// OL
BROWN 0.815 kOhm
PURPLE 1.44 kOhm
 
Seems within reason, RED = VDDC is a little low, so RSX is worn, but not dead yet. If it were below 2 I would be worried it could die from the reball.
 
Well, I compared with other RSXes that are dead.

All of them had 0.3 Ohm on VDDC.
Aaaand one RSX with a blown RAM chip also has 0.3 Ohm.


A spare one (from the donor board) has 3 Ohm on VDDC (same nm architecture).
 
Mine or his? He probably didn't make it. Mine got something new, similar to the slim one but in a COK-001, this time with a 40nm RSX and an Orbis modchip. I need a few more days to come back with all the tests.
 
I've been trying to find somewhere to post this, sorry if it is off topic. I have been able to fix GLOD with extreme pressure on the CPU. I have tried a reflow on the RSX and CPU, and this may work for a little while but would always eventually fail. So the CPU pressure trick used to lower CPU temps can also fix GLOD simply by adding pressure. I have now achieved this on 3 CECHC03 consoles, none of them showing any signs of failure after days of usage. Before, the screen would freeze and show graphical glitches; now it runs flawlessly, no glitches, nothing. I use thermal pads to create the mod. If it boots but fails, I simply add more pads until it runs without failure.
 
@RIP-Felix on the last test DeadEnd was right: this special GLOD was the RSX all along. Two different board models, the same RSX, killed by myself during the delid process. Earlier special GLOD cases were mostly untouched boards. All of them had low resistance, 1.8 ohms on VDD, and since they were on the board I just desoldered and exchanged them. The RSX I have now was tricky, as its VDD line is still a perfect 3 ohms and FBVDD is 220 ohms.
At least we now know that if we can see SB debugging start in the syscon UART, there is no need to go to the SB UART.
I didn't know this until today, so as a quick test reference: if the recovery beeps work but there is no image on any port, don't lose time, find another RSX.
After that it should usually work straight away, no doubt about it anymore.
 
Trying to understand the above message makes me realise I have a lot to learn. Reading syscon is something I definitely need to get into. So it looks like some GLODs are due to the connections under the CPU substrate. On all 3 CECHC03 I did the pressure mod as a last resort. So the board had been heated to 160 degrees and a reflow of the CPU and GPU had been done. This made it go from a black screen with no signal to at least trying to start. When pressure was applied to the CPU, it booted. On every occasion I have had to add additional pressure, as on the first attempt it would freeze with graphical errors. With more pressure, the second attempt fixed the issue. Over 40 hours to date on these consoles and still no signs of failure. So what's going on, why does this work?
 
