It actually seems like the power eject board was a lucky coincidence. The thermal 2030-2033 errors were related to resistors that weren't making proper contact (R4030, R4031, R4083, R4084), since the path to the thermal ICs runs through them.
But now I've got another unrecoverable error: 3003. It seems to indicate a power failure, with no more info given. The board has been through too much abuse. I installed the original syscon back, but with the wrong temperature profile, so some components around it didn't like it. I have checked them and replaced some of them, but this time the error won't go away. I tracked it down to IC6103 (CPU VRM controller), where pin 7 is at 0v (no power good) and there is no Enable signal at pin 29. Pin 22 (VCC) gets 12v.
At this point I got a bit lost so maybe this board can be set aside for spares. But ofc,
@RIP-Felix , since you've got the knack for studying schematics, feel free to give some pointers...
Check 3.3v_MISC. Start with PS6001 and the components around Q6001 to ensure 12v --> 3.3v. I doubt there is an issue there, because I have another hypothesis that seems to make sense given where 3003 sits on the SYSCON error code list. So far we know 3004 is an RSX VDDC power failure. I'm thinking 3003 is a BE VDDC power failure. I triggered an RSX VDDC failure while removing tokins from the RSX. I never triggered a power failure when removing the tokins from the CELL side. However, had I removed more of them, I might have. Engineers like to cluster related errors. Since the CPU is always enumerated before the RSX, I think it stands to reason the error code just before the RSX VDDC power failure would be the same error for the CPU!
So just be sure 3.3v_MISC is good at its source. Then trace 3.3v to the CPU's VRM controller (IC6103). It needs 3.3v on Enable (pin 29) and Power good (pin 7), along with VID pins 0-5, otherwise the controller cannot enable the switching (IOR) VRM that provides VDDC to the CELL_BE.
First thing I notice from looking at that controller's datasheet is a thermal rating of 60s maximum above 183 deg C. Exposure to lead-free reflow temps of 230-260C for longer than 60s can reduce its reliability. Given the numerous rework cycles, that can be an issue. You could try replacing it if all else fails.
Otherwise, check R6103 against a good board. It's supposed to be 1500 (+/-)7.5 Ohms. There are more of these resistors along that voltage rail and they could easily be cooked out of spec by reflow temps. Since R6103 is attached to the Power good pin, if it's out of spec, that could cause the voltage to fall below the PG threshold (whatever it's set to). Scaling the voltage in proportion to the resistance, at 1500 Ohms the voltage is 3.3v and at 1400 Ohms it'd fall to 3.08v. I suspect they don't want it to fall below 3.25v. That would mean the resistance on this board cannot be off from a good board by more than about 22 Ohms (for example only). The true number depends on what they set the power good threshold to. You'd have to inject 3.3v into the voltage rail on a good board, then slowly lower it until it triggers an error. Then we'd know what the threshold is.
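If you want to play with those numbers, here's a rough Python sketch of the same back-of-the-envelope math. It assumes the voltage at the pin scales in direct proportion to the resistance, which is only the simplification used above, not something taken from the schematic or the datasheet. The 3.25v threshold is a guess, and the function names are made up for illustration.

```python
# Rough estimate only: assumes pin voltage scales linearly with resistance,
# which is the simplification used in the post, not a datasheet formula.
NOMINAL_OHMS = 1500.0   # expected value of R6103
NOMINAL_VOLTS = 3.3     # expected voltage on the 3.3v rail


def estimated_pg_voltage(measured_ohms):
    """Estimate the Power good voltage for a measured resistance."""
    return NOMINAL_VOLTS * measured_ohms / NOMINAL_OHMS


def max_resistance_drop(threshold_volts):
    """How far the resistance can fall before the estimate hits the threshold."""
    return NOMINAL_OHMS * (1 - threshold_volts / NOMINAL_VOLTS)


print(round(estimated_pg_voltage(1400), 2))  # 3.08
print(round(max_resistance_drop(3.25), 1))   # 22.7 (3.25v is a guessed threshold)
```

Swap in whatever threshold you actually measure on a good board and the allowed drift changes with it.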
Check IC6107 (CMOS) and the immediate area for damage. Compare voltages/resistance against a good board. Also check C6119 and R6119. If either of those is shorting, it'll pull the enable pin low. It needs to be pulled high to enable the CMOS Out_Y, which enables the IOR switching VRM (for CPU VDDC). Check C6113. If it's shorting, it'll probably blow the 3.3v fuse protecting the power supply (PS6001). Just double check you are getting 3.3v on one side of it and GND on the other. All of these 1500 Ohm resistors, including the ones on VID pins 0-5, are close to the CPU region and could easily be damaged by reflow/reball/tokin repair attempts.
Speaking of tokins, if they are totally shot, this error could be due to any SMD in the VDDC circuit. However, that's all downstream from the controller. Since you're not getting 3.3v at the controller that initiates everything downstream, the filter is probably fine. I'm just bringing it up to be thorough. Another part of the filter no one checks is the first stage RC filter (R6155/C6137, R6154/C6136, & R6153/C6135). These values are very important for the "tuning" of the second stage RC filter (tokins). If the first stage is significantly off, the noise could be so bad it might trigger a power failure outright, skipping the 1002 errors altogether. You can compare their resistance values to those of a good board.
Long story short, it's complicated! You could start by opening the schematic and listing all the SMDs immediately surrounding IC6003, IC6107 (unless 3.3v_MISC is fine) and Q6001. Then print a close-up of the schematic so you know which is which when probing. Next, record the resistance values of everything on the list for both a known good board and your bad board, and compare them to see what's different. You're looking for values that are significantly off. This will give you clues to what "might" be wrong, or at least where to look and start voltage testing/injecting. Or it might narrow it down to the IC itself, which will give you more certainty before replacing it.
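If you end up with a long list, something like this small Python sketch could do the comparison for you. Everything in it is hypothetical: the readings are placeholder numbers, not real measurements from any board, and the 5% cutoff for "significantly off" is an arbitrary starting point you should adjust.

```python
def compare_boards(good, bad, tolerance=0.05):
    """Return parts whose bad-board reading deviates from the good board
    by more than `tolerance` (relative), worst offenders first."""
    suspects = []
    for part, good_ohms in good.items():
        bad_ohms = bad.get(part)
        if bad_ohms is None or good_ohms == 0:
            continue  # not measured on the bad board, or no usable reference
        deviation = abs(bad_ohms - good_ohms) / good_ohms
        if deviation > tolerance:
            suspects.append((part, good_ohms, bad_ohms, round(deviation, 3)))
    return sorted(suspects, key=lambda row: row[3], reverse=True)


# Placeholder readings in Ohms, for illustration only (NOT real measurements).
good_board = {"R6103": 1500.0, "R6119": 1500.0, "R6155": 10.0}
bad_board = {"R6103": 1402.0, "R6119": 1498.0, "R6155": 10.1}

for part, good_ohms, bad_ohms, deviation in compare_boards(good_board, bad_board):
    print(f"{part}: good={good_ohms} bad={bad_ohms} off by {deviation:.1%}")
```

With these made-up numbers only R6103 gets flagged. In-circuit readings depend on whatever else is connected to the part, so treat a flag as a place to look closer rather than proof that the part is bad.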