PS3 Fault finding YLOD with the SYSCON - First steps and Error reporting

Felix just asked on the tokin thread: "I do about 20 hours total with at least 10 full heat cycles and one overnighter. About an hour hands on games, then the rest is rubberband on the thumbstick spinning in circles outside in whatever FPS I have laying around."

Yeah, they occasionally crap out. I stress test anything Xbox 360 / PS3 or newer like that regardless of what the problem was. It's too common for BGA issues to be hiding while the console develops other issues. So I replace a laser and then it dies during stress testing and goes back to the workbench for a reball.
 
Adding to his question, maybe he was also talking about the instances where a reball apparently "solves" the problem but they fail again during the stress test. Is that when you begin suspecting an issue with the chip rather than just under it?

And in those cases, what do you do next?
I see, for example, that you sometimes get other unrepairable boards with a failed CELL delid or crap like that. You don't think the chance is high enough to attempt an RSX transplant from one of these questionable boards?

And by the way, I don't see much data about systems that don't exactly have the YLOD but instead show graphical artifacting or the GLOD.
Is it simply because they are a bit less common, or because you suspect the chance of a bad chip is very high and therefore prefer to go for the YLOD ones first?

I mean, if oxidized pads and such are so common, maybe some of the random "dead" or even heatgunned (god forbid) boards could stand at least a meager chance of containing a not-dead RSX?


I'm just curious, seeing your spreadsheet, how you arrive at the 'bad chip' dead end on a couple of boards, but maybe you have an answer even for those in your trash pile? Admittedly it's probably wise to leave them for the end and keep working on the more promising, not-so-ugly cases first.

Cheers
 
Hi everyone, I'm currently trying to fix my CECHA01 console, managed to use the script just fine

C:\Users\myuser\desktop\ps3syscon>ps3_syscon_uart_script.py com8 CXRF
>$ auth
Auth successful
>$ bringup
bringup
[SSM] state: 0000 -> 0101
Bringup Mode #0 (0xFF)
[SSM] ssmCb_OnStartingBePowOn() called.
[SSM] First Boot.
[SSM] Bringup mode : syspm_stat=00000000/00000000
[POWSEQ] PowerSeq_Setup called.
[SSM] state: 0101 -> 0201
[POWSEQ] AV Backend Setup
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0302
[SSM] PowSeq Fail : Detected !
[SSM] state: 0302 -> 0700
[POWSEQ] AV Backend Letup
[SSM] Shutdown mode : syspm_stat=00000000/00000000
[ERROR]: 0xa0202120
[ERROR]: 0xa0202120
[ERROR]: 0xa0202120
[ERROR]: 0xa0202120
[ERROR]: 0xa0202120
>$ errlog get
[ERROR]: 0xa0202120
[ERROR]: 0xa0202120
[ERROR]: 0xa0202120
[ERROR]: 0xa0202120
[ERROR]: 0xa0202120
[ERROR]: 0xa0213013
[POWSEQ] PowerSeq_Letup called.
[SSM] state: 0700 -> 0600
(PowerOff State) (Fatal)
errlog get
ofst[ 32]:err_code:0xffffffff, clock:0xffffffff
ofst[ 36]:err_code:0xa0202120, clock:0xffffffff
ofst[ 40]:err_code:0xa0202120, clock:0xffffffff
ofst[ 44]:err_code:0xa0202120, clock:0xffffffff
ofst[ 48]:err_code:0xa0202120, clock:0xffffffff
ofst[ 52]:err_code:0xa0202120, clock:0xffffffff
ofst[ 56]:err_code:0xa0202120, clock:0xffffffff
ofst[ 60]:err_code:0xa0202120, clock:0xffffffff
ofst[ 64]:err_code:0xa0202120, clock:0xffffffff
ofst[ 68]:err_code:0xa0213013, clock:0xffffffff
ofst[ 72]:err_code:0xa0202120, clock:0xffffffff
ofst[ 76]:err_code:0xa0202120, clock:0xffffffff
ofst[ 80]:err_code:0xa0202120, clock:0xffffffff
ofst[ 84]:err_code:0xa0202120, clock:0xffffffff
ofst[ 88]:err_code:0xa0202120, clock:0xffffffff
ofst[ 92]:err_code:0xa0202120, clock:0xffffffff
ofst[ 96]:err_code:0xa0202120, clock:0xffffffff
ofst[100]:err_code:0xa0202120, clock:0xffffffff
ofst[104]:err_code:0xa0202120, clock:0xffffffff
ofst[108]:err_code:0xa0202120, clock:0xffffffff
ofst[112]:err_code:0xa0213013, clock:0xffffffff
ofst[116]:err_code:0xa0202120, clock:0xffffffff
ofst[120]:err_code:0xa0202120, clock:0xffffffff
ofst[124]:err_code:0xa0202120, clock:0xffffffff
ofst[ 0]:err_code:0xa0202120, clock:0xffffffff
ofst[ 4]:err_code:0xa0202120, clock:0xffffffff
ofst[ 8]:err_code:0xa0202120, clock:0xffffffff
ofst[ 12]:err_code:0xa0202120, clock:0xffffffff
ofst[ 16]:err_code:0xa0202120, clock:0xffffffff
ofst[ 20]:err_code:0xa0202120, clock:0xffffffff
ofst[ 24]:err_code:0xa0202120, clock:0xffffffff
ofst[ 28]:err_code:0xa0213013, clock:0xffffffff

I get that the HDMI IC is dead, and I have already purchased a new one, but what about the A0213013 error code? It's sporadic. Measuring the CELL Tokins gives me 2.8 ohms, and the lowest value I got from the surrounding caps was 1.1 ohms. Is it dead? @db260179, do you know what it means? It's my favorite console and I don't want it to just die and collect dust. Thanks in advance.
 
Adding to his question, maybe he was also talking about the instances where a reball apparently "solves" the problem but they fail again during the stress test. Is that when you begin suspecting an issue with the chip rather than just under it?

And in those cases, what do you do next?

Depending on what the original fault was and then how it died, I just revisit every check really quick and poke around a little. If I reballed RSX to fix and it starts artifacting and then goes back to the same YLOD and checks, then yeah, I'm gonna call it.

Since I don't actually repair these FOR people most of the time, I have the luxury of giving up. I'm not gonna spend 20 hours tracking down a fault, swapping out 20 different chips, and checking 1000 components and traces, especially when it's probably just a dead RSX. I sell the parts and move on to better odds. This isn't the system my parents bought me for my wedding that has the only pictures of my dead great grandma on it. I don't have any emotional attachment to it.

I see, for example, that you sometimes get other unrepairable boards with a failed CELL delid or crap like that. You don't think the chance is high enough to attempt an RSX transplant from one of these questionable boards?

That last board with the gouged CELL was heatgunned all over. Low odds, not gonna bother. Since I'm logging my results with so many new and insightful tests now, I have started to keep my eyes open for G and H systems with bad disc drives so I can have some known working 90nm RSX to swap around and notch some good results.

And by the way, I don't see much data about systems that don't exactly have the YLOD but instead show graphical artifacting or the GLOD.
Is it simply because they are a bit less common, or because you suspect the chance of a bad chip is very high and therefore prefer to go for the YLOD ones first?

I buy anything when the price is right, and that includes a lot of eBay listings where the entire description is "doesn't work." I do, but very rarely, buy the inventory from a failed repair business full of heatgunned trash. I'd say the results in the spreadsheet are only slightly off from the real, natural average of "guy on the street whose PS3 is broken."

I mean, if oxidized pads and such are so common, maybe some of the random "dead" or even heatgunned (god forbid) boards could stand at least a meager chance of containing a not-dead RSX?

There's not a dead board in my trash pile that doesn't have a reballed RSX (or one someone else heatgunned into ash). I would never take the odds of swapping a heatgunned RSX. Even if it wasn't initially destroyed, each rework cycle lowers the odds.

I'm just curious, seeing your spreadsheet, how you arrive at the 'bad chip' dead end on a couple of boards, but maybe you have an answer even for those in your trash pile? Admittedly it's probably wise to leave them for the end and keep working on the more promising, not-so-ugly cases first.

Cheers

At this point, a lot of my decisions on the odds are driven by memory from hundreds of systems. It pretty much comes down to not finding anything else obviously wrong ever after certain checks, and GPU chips fuggin' die these days. No need to continue looking. Hopefully, if I can stow some known good RSX away from G/H systems, it might get a little clearer.
 
I get that the HDMI IC is dead, and I have already purchased a new one, but what about the A0213013 error code? It's sporadic. Measuring the CELL Tokins gives me 2.8 ohms, and the lowest value I got from the surrounding caps was 1.1 ohms. Is it dead? @db260179, do you know what it means? It's my favorite console and I don't want it to just die and collect dust. Thanks in advance.

The errors indicate the HDMI IC has failed. The other error means the CELL chip is having communication issues; it's most likely related.

Best to start with resistance checks and checks for shorts. Sometimes these issues come from caps or resistors failing or shorting, or even a blown fuse. Check the voltage from the HDMI IC.
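
To see how sporadic the second code really is, the `errlog get` dump can be tallied. This is only a minimal Python sketch, assuming the dump was saved as text in the `ofst[..]:err_code:..., clock:...` format shown in the post above. `tally_errlog` is a hypothetical helper and not part of the UART script, and treating 0xffffffff as an empty slot is an assumption.

```python
import re
from collections import Counter

# Matches lines such as: ofst[ 36]:err_code:0xa0202120, clock:0xffffffff
ENTRY = re.compile(r"ofst\[\s*(\d+)\]:err_code:(0x[0-9a-fA-F]{8})")

def tally_errlog(text):
    """Count how often each error code appears in an errlog dump."""
    counts = Counter()
    for match in ENTRY.finditer(text):
        code = match.group(2).lower()
        # Assumption: 0xffffffff marks a slot that was never written.
        if code != "0xffffffff":
            counts[code] += 1
    return counts

# A few lines copied from the dump above; paste the full dump to tally it all.
sample = """\
ofst[ 32]:err_code:0xffffffff, clock:0xffffffff
ofst[ 36]:err_code:0xa0202120, clock:0xffffffff
ofst[ 40]:err_code:0xa0202120, clock:0xffffffff
ofst[ 68]:err_code:0xa0213013, clock:0xffffffff
"""

for code, n in tally_errlog(sample).most_common():
    print(code, n)
```

On the full dump posted above, the same count comes to 28 entries of 0xa0202120 against 3 of 0xa0213013.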
 
Hey man, thanks for replying. So, C4004 and C4001 are exactly the same specs according to the service manual. But while C4004 gives me around 1.1 ohm, C4001 is giving me over 6k ohm. Fuses are good. Dead C4001 then?
 
So remove C4001, then do another resistance or capacitance test on it and on its solder points. There could be a short further up. You will get to the culprit eventually.

Also, remove main components only as a last resort; you will usually find that a simple cap or resistor is the issue.
 
I cannot thank you guys enough for all the help. Getting my hopes up again on fixing this one!
 
Got a reballed SEM-001:
bringup
[SSM] state: 0000 -> 0101
Bringup Mode # 0 (0xFF)
[SSM] ssmCb_OnStartingBePowOn() called.
[SSM] Bringup mode: syspm_stat = 00000000/00000000
[POWSEQ] PowerSeq_Setup called.
[SSM] state: 0101 -> 0201
[POWSEQ] AV Backend Setup
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0202
[SSM] state: 0202 -> 0103
[SSM] state: 0103 -> 0203
[SSM] ssmCb_BeforeBeOn() called.
[SSM] state: 0203 -> 0104
Psbd_SbTransMode_Half:0x20e7

eepcsum
Addr:0x000032fe should be 0x1596
Addr:0x000034fe should be 0x86d6
Addr:0x000039fe should be 0x7360
Addr:0x00003dfe should be 0x00ff
Addr:0x00003ffe should be 0x00ff
lasterrlog
Last Error Code:0xa0403034, Time:0xffffffff
Delidded, reballed both, did not remove any Tokin. As a last resort I will eventually remove them and add a group of caps from slims. Any suggestions?
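
For comparing how far two bringup attempts get, the `[SSM] state: X -> Y` lines can be pulled out of a saved log. A rough Python sketch follows, with hypothetical helper names (`state_path`, `common_prefix`); it only extracts and compares the state numbers and assumes nothing about what each state means.

```python
import re

# Matches lines such as: [SSM] state: 0102 -> 0302
TRANSITION = re.compile(r"\[SSM\] state: ([0-9a-fA-F]{4}) -> ([0-9a-fA-F]{4})")

def state_path(log_text):
    """Return the SSM states visited, in the order they were logged."""
    path = []
    for old, new in TRANSITION.findall(log_text):
        if not path:
            path.append(old)  # starting state of the first transition
        path.append(new)
    return path

def common_prefix(a, b):
    """Return the leading states that two paths share."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared

# Transitions copied from the CECHA01 log earlier in the thread.
failing = """\
[SSM] state: 0000 -> 0101
[SSM] state: 0101 -> 0201
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0302
[SSM] PowSeq Fail : Detected !
[SSM] state: 0302 -> 0700
"""

# Transitions copied from the SEM-001 log in this post.
progressing = """\
[SSM] state: 0000 -> 0101
[SSM] state: 0101 -> 0201
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0202
[SSM] state: 0202 -> 0103
[SSM] state: 0103 -> 0203
[SSM] state: 0203 -> 0104
"""

print(common_prefix(state_path(failing), state_path(progressing)))
```

Both logs share the path up to state 0102; the CECHA01 log then goes to 0302 and reports the PowSeq failure, while this one continues to 0202.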
 
You need to fix your syscon checksum.

So do a write command in CXRF mode:
w 0x34fe d6 86
w 0x39fe 60 73
Then check with the command 'eepcsum'.
It's in my PDF guide.
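
Note the byte order in those writes: `eepcsum` reports 0x86d6 for 0x34fe, and the command writes d6 first, then 86. As an illustration only, here is a small Python sketch that turns one `eepcsum` line into a write command the same way. `write_command` is a hypothetical helper, not something from the UART script or the guide, and only the two addresses above are confirmed in this thread, so this is no reason to blindly write every address `eepcsum` lists.

```python
import re

# Matches lines such as: Addr:0x000034fe should be 0x86d6
CSUM = re.compile(r"Addr:0x([0-9a-fA-F]+) should be 0x([0-9a-fA-F]{4})")

def write_command(eepcsum_line):
    """Turn one eepcsum line into a 'w' command with the low byte first."""
    match = CSUM.search(eepcsum_line)
    if match is None:
        raise ValueError("not an eepcsum line: " + eepcsum_line)
    addr = int(match.group(1), 16)
    value = int(match.group(2), 16)
    low, high = value & 0xFF, value >> 8  # split the 16-bit checksum
    return "w 0x{:x} {:02x} {:02x}".format(addr, low, high)

print(write_command("Addr:0x000034fe should be 0x86d6"))  # w 0x34fe d6 86
print(write_command("Addr:0x000039fe should be 0x7360"))  # w 0x39fe 60 73
```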
 
Hi all,
I need some help.
I have repaired dozens of PS3s now using the syscon logs.
I did the usual CXR step and set the EEPROM 3961 01 00.
I got into CXRF and, instead of using the command w 39fe 38 00, I stupidly sent w 3961 38 00.
I turned it off and on, and no LED showed.
I went back into CXR and set the EEPROM to 00 again, restarted, and got the 3 beeps and a flashing LED.
I tried CXRF, and now AUTH does not work and the syscon does not seem to be communicating with me anymore.
Any ideas how to fix it?
AUTH is not working in CXR or CXRF now, and the RX LED does not flash the way it used to with the syscon.
Help!
 
Motherboard model?
 
You need to fix your syscon checksum.

So do a write command in CXRF mode:
w 0x34fe d6 86
w 0x39fe 60 73
Then check with the command 'eepcsum'.
It's in my PDF guide.
Done, I wrote 34fe d6 86 as well; 39fe had already been corrected in the first place. It started for 5 seconds, then YLOD. Something is wrong with this CPU. I swapped the RSX twice with the same result, 3 seconds. I added twelve 470 µF caps from a slim on each chip; I think that was not necessary. I saw 1.28 V on the RSX and 1.12 V on the CPU, and 4 ohms on the CPU both on the board and off it; each RSX measures fine on the board and off it. Time to leave it aside and move on to other motherboards. On phats the chance of a fix seems very low; it is not worth doing more than a delid and HDMI IC errors. This tool is good enough to tell me when to stop, if I get errors pointing directly at the CPU rather than at something else. I will not go any further on phats than a delid and diagnosis over UART.
 
Hi all,
I need some help.
I have repaired dozens of PS3s now using the syscon logs.
I did the usual CXR step and set the EEPROM 3961 01 00.
I got into CXRF and, instead of using the command w 39fe 38 00, I stupidly sent w 3961 38 00.
I turned it off and on, and no LED showed.
I went back into CXR and set the EEPROM to 00 again, restarted, and got the 3 beeps and a flashing LED.
I tried CXRF, and now AUTH does not work and the syscon does not seem to be communicating with me anymore.
Any ideas how to fix it?
AUTH is not working in CXR or CXRF now, and the RX LED does not flash the way it used to with the syscon.
Help!

OK, so you have overwritten the 3961 area with an incorrect setting.
So we need to know which PS3 motherboard this is.

Make sure to disconnect the diag pin lead when going back to CXR mode; you can't auth while the diag pin lead is connected.
Try auth again in CXR mode without the diag pin lead. Once we know which PS3 board it is, we can correct the checksum; they usually have the same values.
 
Done, I wrote 34fe d6 86 as well; 39fe had already been corrected in the first place. It started for 5 seconds, then YLOD. Something is wrong with this CPU. I swapped the RSX twice with the same result, 3 seconds. I added twelve 470 µF caps from a slim on each chip; I think that was not necessary. I saw 1.28 V on the RSX and 1.12 V on the CPU, and 4 ohms on the CPU both on the board and off it; each RSX measures fine on the board and off it. Time to leave it aside and move on to other motherboards. On phats the chance of a fix seems very low; it is not worth doing more than a delid and HDMI IC errors. This tool is good enough to tell me when to stop, if I get errors pointing directly at the CPU rather than at something else. I will not go any further on phats than a delid and diagnosis over UART.

0xa0403034 is a data error coming from the RSX, so it's possible the data lines are broken, damaged, or worn away. The only way to know is to take the RSX off again and use a microscope to look for broken lines.
 
I took an untouched board, just delidded it, and swapped its RSX for the one from the SEM board.
It started on the first try.
http://s.go.ro/sb13pvd1
What I wanted to find out was whether that board has something wrong with the CPU, or a bump problem inside the board that I cannot trace any further.
Now I am confident that my reball process is not at fault. Faulty mobos will be set aside when they cause more than 2 days of headache.
 