WIP: fizz: soc/intel/pmclib: Fix boot issue after power loss when using USB-C PD#25
movr4x wants to merge 1 commit into
FIZZ boards are unable to boot after power loss if USB-C PD is used to initially power them on when CFR `Restore AC power after loss` (`power_on_after_fail`) is set to:
- `Power off (S5)` - Always.
- `Previous state` - Only after a clean shutdown.

Both options result in Intel chipset after G3 being set by coreboot to keep the device off when power is restored (`Previous state` only after a clean shutdown).

The issue is likely related to some sort of miscommunication between chipset and ChromeEC, perhaps due to USB-C PD induced delays (barrel jack not affected), where ChromeEC fails to bring the system up and gives up. Since this affects EC RO there is no safe way of mitigating this issue from EC side.

The solution is to always make sure that chipset after G3 is set to `auto on`, and let EC handle the rest if EC manages after G3.

Signed-off-by: Lukasz Kutyla <luk.kutyla@gmail.com>
To move this forward and mark it as ready, I will need your input, so please take a look when you find some spare time.
So this doesn't make sense to me. If the problem were on the EC side, then the SoC register wouldn't matter, since we wouldn't get to S0. I think we need to figure out why the SoC is deciding not to power up the unit, rather than working around it.
Well, I have tried to replicate this using my own USB-C PD power supply, which I ordered recently, and it works just fine for me with the Intel chipset set to keep the device off. Confirmed with cbmem:

I went for a reputable brand, Chicony. Model

My SION always boots fine when I press the power button after a forced G3. It does take about 5-6 seconds for the device to bring itself up the first time, though.

I am beginning to wonder if the issue is related to some sort of botched PD implementation on the power supply side for the cases where it fails (which still points to the EC as the problem).
The #1 reason the EC is the suspect here is that the barrel jack works fine. By fine I mean that it renders the SoC after G3 register configuration irrelevant, which is what you would expect when the EC is designed to force the chipset to do what the EC wants, which is the case for ChromeEC.

First let's start with how this Intel register is designed. For Skylake/Kabylake (newer gens might use a different location, but the rest should be similar):

So coreboot can either:

And something like this is always being set up by coreboot in Intel/AMD SoC code.

If you set CFR to

In ChromeEC you can find stuff like:

This works as expected for both barrel jack and USB-C: no auto power up at all, regardless of chipset after G3 configuration. USB-C also requires PD negotiation to power the chipset/CPU, so without a green flag from the EC side, which does the heavy lifting here, the chipset/CPU cannot start on its own.

If you set CFR to

File

When image is executed it initializes:

During the initialization stage a hook ensures that the initial power button state can be updated:

What happens here is:

Then state machine, if

Functions related to emitting power button press to PCH:

This allows ChromeEC to force the device to power up even with the chipset register set to keep the device off: the power button acts as a valid wake event.

When you reconnect power to the device, and it auto powers on, you can see from cbmem for the Intel PM section:

Specifically:

Informs that the device was powered on because of a power button wake event, even if you did not press it (EC did).

The way I see it:
So when this happens the system very likely does not reach S0 at all and is stuck between "fake G3" and S5. While it is possible that the chipset is doing something weird here, I would say everything points to an EC fault, since the EC has all the tools it needs to force the chipset to power the device to S0.

Are you able to replicate this issue on your TEEMO, like the other two users?
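To make the hypothesis above easier to reason about, here is a toy model of who ends up waking the system when power returns. It is only a sketch under my assumptions, not ChromeEC or coreboot code: `wake_source`, `power_restore` and `resolve_wake()` are made-up names, and the "EC gave up" case is modeled simply as the power button press never reaching the PCH.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the hypothesis; none of these names exist in coreboot or ChromeEC. */
enum wake_source {
	WAKE_NONE,           /* system stays off until someone presses the button */
	WAKE_CHIPSET_POLICY, /* chipset after G3 policy powers the system on */
	WAKE_POWER_BUTTON,   /* EC-emitted power button press wakes the system */
};

struct power_restore {
	bool chipset_keeps_off;  /* chipset after G3 set to keep the device off */
	bool ec_wants_power_on;  /* EC after G3 policy asks for auto power on */
	bool ec_press_delivered; /* EC power button press actually reached the PCH */
};

static enum wake_source resolve_wake(const struct power_restore *p)
{
	assert(p != NULL);

	/* Assumption: without a green flag from the EC nothing powers up. */
	if (!p->ec_wants_power_on)
		return WAKE_NONE;

	/* Normal case: the press is a valid wake event, chipset policy does not matter. */
	if (p->ec_press_delivered)
		return WAKE_POWER_BUTTON;

	/* EC gave up: only a chipset set to auto on can still bring the system up. */
	return p->chipset_keeps_off ? WAKE_NONE : WAKE_CHIPSET_POLICY;
}
```

In this model the barrel jack case is `ec_press_delivered = true`, the failing USB-C PD case is `chipset_keeps_off = true` with `ec_press_delivered = false`, and the proposed fix corresponds to forcing `chipset_keeps_off = false`.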
FIZZ boards are unable to boot after power loss if USB-C PD is used to initially power them on when CFR `Restore AC power after loss` (`power_on_after_fail`) is set to:
- `Power off (S5)` - Always.
- `Previous state` - Only after a clean shutdown.

Both options result in Intel chipset after G3 being set by coreboot to keep the device off when power is restored (`Previous state` only after a clean shutdown).

The issue is likely related to some sort of miscommunication between chipset and ChromeEC, perhaps due to USB-C PD induced delays (barrel jack not affected), where ChromeEC fails to force the chipset to bring the system up and gives up. Since this affects EC RO there is no safe way of mitigating this issue from EC side.
This was reported by users:
MrChromebox/firmware#595 (comment)
The solution is to always make sure that chipset after G3 is set to `auto on`, and let EC handle the rest if EC manages after G3.

The usual flow for configuring chipset after G3 in coreboot for Intel is as follows:

Intel common PMC Kconfig (`src/soc/intel/common/block/pmc/Kconfig`) selects:

It also sets:

`HAVE_POWER_STATE_AFTER_FAILURE` causes mainboard Kconfig (`src/mainboard/Kconfig`) to define `MAINBOARD_POWER_FAILURE_STATE`, either by a default pathway, or by custom selection:

`MAINBOARD_POWER_FAILURE_STATE` is then used as a default for configuring chipset after G3, which in the case of Intel can then be additionally changed via CFR `power_on_after_fail` (`src/soc/intel/common/block/pmc/pmclib.c`):

AMD appears to follow a similar pattern, except it always directly uses `MAINBOARD_POWER_FAILURE_STATE` (no CFR) (`src/soc/amd/common/block/pm/pmlib.c`):

Some of the possible solutions:
1) Keep EC after G3 configuration coupled to chipset after G3 configuration and make Intel chipset after G3 value be condition based - this is the starting point for this PR.

We define some sort of helper to indicate that EC handles after G3, for example in mainboard Kconfig (`src/mainboard/Kconfig`):

`HAVE_EC_POWER_STATE_AFTER_FAILURE` can then be selected by:
- `config EC_GOOGLE_CHROMEEC` - global for all boards using specific EC, where EC sits in front of managing after G3.
- `config EC_GOOGLE_CHROMEEC_AFTER_G3_STATE` (or other custom after G3 implementation) - local for boards which explicitly enable custom way of altering EC after G3.

Intel (and optionally AMD) PM code is changed to always set chipset after G3 to `auto on` if EC handles after G3:

Since at the moment the issue appears to affect only Intel, only this platform is targeted, as a starting point. Also, `HAVE_EC_POWER_STATE_AFTER_FAILURE` is only enabled for boards which use After G3 State implementation to keep this fix local. This can be changed of course if you find it necessary.

Pros:
- `auto on` if EC handles after G3, regardless of CFR/Kconfig choice.
- `power_on_after_fail` can be re-used by custom implementations that allow overriding EC after G3 - one CFR/Kconfig for both chipset and EC after G3.

Cons:
- `System Power State after Failure`, and optionally CFR, which could be problematic, as changing it will have no effect (EC will override this) and could even cause trouble like the one this PR attempts to fix - the latter could be mitigated by making `HAVE_EC_POWER_STATE_AFTER_FAILURE` global to all boards where EC manages after G3 (like ChromeEC).

2) Complete decoupling of EC after G3 configuration from chipset after G3 configuration.
We define stuff that is used to configure EC after G3 (including separate prompt), for example in mainboard Kconfig (`src/mainboard/Kconfig`):

In the same file we make the `System Power State after Failure` prompt be only shown if there is no `HAVE_EC_POWER_STATE_AFTER_FAILURE`:

In the same file we also make sure `MAINBOARD_POWER_FAILURE_STATE` is always set to `auto on` if `HAVE_EC_POWER_STATE_AFTER_FAILURE` is set:

In something like `src/ec/pmlib.h` (equivalent for pmclib/pmlib in Intel/AMD) we define enum:

In something like `src/ec/cfr.h` we define EC specific after G3 CFR:

In EC Kconfig, if EC controls after G3, like ChromeEC, we select:

In custom implementations, where we have the ability to change EC after G3, we select values that are supported, and optionally the default value (ChromeEC, by default, officially uses `previous state`):

Then for boards which use EC to control after G3 (like ChromeEC) we replace CFR `power_on_after_fail` with `ec_power_on_after_fail`.

Pros:
- `auto on` if EC handles after G3.

Cons:
- `power_on_after_fail` reference, which can still be altering chipset after G3 (this could be mitigated by forcing global Intel/AMD PM code to always use `auto on` like in the first solution).
- `default` value.

NOTE
In both cases we still end up with PM code emitting `printk(BIOS_INFO, "Set power on after power failure.\n")`, which can be confusing to some users, but is technically valid, as chipset after G3 is set to `auto on`.
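
For reference, the decision logic behind the first solution boils down to something like the sketch below. It is a standalone model under stated assumptions, not the actual `pmclib.c` change: `after_g3_config`, `requested_state()`, `chipset_state()` and `powers_on_after_loss()` are hypothetical names, the struct fields only stand in for `HAVE_EC_POWER_STATE_AFTER_FAILURE`, `MAINBOARD_POWER_FAILURE_STATE` and the `power_on_after_fail` CFR, and reading the option is reduced to a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real coreboot symbols and types may differ. */
enum after_g3_state { AFTER_G3_OFF, AFTER_G3_ON, AFTER_G3_PREVIOUS };

struct after_g3_config {
	bool ec_handles_after_g3;            /* models HAVE_EC_POWER_STATE_AFTER_FAILURE */
	enum after_g3_state kconfig_default; /* models MAINBOARD_POWER_FAILURE_STATE */
	bool cfr_is_set;                     /* true if the user changed power_on_after_fail */
	enum after_g3_state cfr_value;       /* models the power_on_after_fail value */
};

/* What the user asked for: the CFR value if set, else the Kconfig default. */
static enum after_g3_state requested_state(const struct after_g3_config *cfg)
{
	assert(cfg != NULL);
	return cfg->cfr_is_set ? cfg->cfr_value : cfg->kconfig_default;
}

/* What the chipset gets: always auto on when the EC manages after G3. */
static enum after_g3_state chipset_state(const struct after_g3_config *cfg)
{
	assert(cfg != NULL);
	if (cfg->ec_handles_after_g3)
		return AFTER_G3_ON;
	return requested_state(cfg);
}

/* Whether the chipset alone would power the system on when power returns. */
static bool powers_on_after_loss(enum after_g3_state state, bool clean_shutdown)
{
	switch (state) {
	case AFTER_G3_ON:
		return true;
	case AFTER_G3_PREVIOUS:
		/* Previous state keeps the device off only after a clean shutdown. */
		return !clean_shutdown;
	case AFTER_G3_OFF:
	default:
		return false;
	}
}
```

With `ec_handles_after_g3` set, `chipset_state()` returns `AFTER_G3_ON` no matter what the CFR or Kconfig says, which is also why the "Set power on after power failure." message would show up in both solutions.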