In Gimlet the host flash driver manages an analog mux to determine whether we are booting from QSPI 0 or QSPI 1. This is the `HostFlash.set_dev` RPC, as opposed to the `HostFlash.set_mux` RPC. When we perform an update of the host flash we are generally (but not necessarily) writing to the slot opposite the one we're booting from. This causes a device switch that remains in effect until another switch happens or the SP restarts and we read from the in-memory state.
Consider the following sequence:
1. control-plane-agent switches the dev from QSPI 0 to QSPI 1
2. control-plane-agent is being sent an image to write to the SPI flash in chunks
3. The host OS reboots intentionally or unintentionally (e.g. panic, MAPO, etc.)
At this point the host will end up reading an incomplete image from QSPI 1, because the device is still switched to the slot being written. We probably want to keep track of two logical bits in the host flash driver:
- What is the image I intend to boot from?
- What is the image that I intend to write to?
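As a rough illustration of the split, here is a minimal sketch that tracks the boot slot and the write slot as separate fields. All of the names (`Slot`, `SlotState`, `slot_for_host_boot`, and so on) are hypothetical and are not the existing driver's API; defaulting the write slot to the opposite of the boot slot is also just an assumption based on the common case described above.

```rust
/// Hypothetical sketch only: these types are NOT the real host flash
/// driver API. They just illustrate keeping two independent selections.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Slot {
    Qspi0,
    Qspi1,
}

impl Slot {
    /// Returns the opposite slot.
    pub fn other(self) -> Slot {
        match self {
            Slot::Qspi0 => Slot::Qspi1,
            Slot::Qspi1 => Slot::Qspi0,
        }
    }
}

/// The two logical pieces of state: where the host should boot from and
/// where an in-progress update is being written.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SlotState {
    boot: Slot,
    write: Slot,
}

impl SlotState {
    /// Assumption: by default we write to the slot opposite the boot slot.
    pub fn new(boot: Slot) -> Self {
        SlotState { boot, write: boot.other() }
    }

    /// Selecting a slot to write to does not touch the boot selection.
    pub fn set_write_slot(&mut self, slot: Slot) {
        self.write = slot;
    }

    /// Changing the boot slot is a separate, explicit decision.
    pub fn set_boot_slot(&mut self, slot: Slot) {
        self.boot = slot;
    }

    /// Slot that update writes should target.
    pub fn write_slot(&self) -> Slot {
        self.write
    }

    /// Slot to select when the flash is handed to the host, e.g. on a
    /// host reboot in the middle of an update.
    pub fn slot_for_host_boot(&self) -> Slot {
        self.boot
    }
}
```

With state shaped like this, a host reboot in the middle of the sequence above would still pick QSPI 0, because only the write selection moved to QSPI 1.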
Generally, once the host successfully boots to the point of issuing an IPCC command, we will mux the QSPI flash back to the SP, at which point something like this update could continue. My assumption is that during an online update of the control plane we will want to write and stage this update while the host is active. That means we may also want a clear error message saying that the flash is muxed to the host and how long it has been in that state, so the control plane can deal with that. I expect parts of that already exist.
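To make the error idea concrete, here is a small sketch of a write check that reports both that the flash is muxed to the host and for how long. Everything here is hypothetical (`MuxOwner`, `MuxTracker`, `WriteError::MuxedToHost`, the millisecond timestamps passed in by the caller, and the SP owning the mux at startup); parts of this may well already exist in some other form.

```rust
/// Hypothetical sketch only: not the existing driver's types or errors.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MuxOwner {
    Sp,
    Host,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum WriteError {
    /// The flash is muxed to the host; `elapsed_ms` is how long it has
    /// been that way, so the caller can decide whether to wait or retry.
    MuxedToHost { elapsed_ms: u64 },
}

/// Tracks who owns the flash mux and when ownership last changed.
pub struct MuxTracker {
    owner: MuxOwner,
    changed_at_ms: u64,
}

impl MuxTracker {
    /// Assumption for this sketch: the SP owns the mux at startup.
    pub fn new(now_ms: u64) -> Self {
        MuxTracker { owner: MuxOwner::Sp, changed_at_ms: now_ms }
    }

    /// Records a mux change along with the time it happened.
    pub fn set_owner(&mut self, owner: MuxOwner, now_ms: u64) {
        self.owner = owner;
        self.changed_at_ms = now_ms;
    }

    /// Returns Ok if the SP can write to flash right now; otherwise an
    /// error carrying the time since the mux was handed to the host.
    pub fn check_writable(&self, now_ms: u64) -> Result<(), WriteError> {
        match self.owner {
            MuxOwner::Sp => Ok(()),
            MuxOwner::Host => Err(WriteError::MuxedToHost {
                // saturating_sub guards against a caller passing an
                // earlier timestamp than the recorded change.
                elapsed_ms: now_ms.saturating_sub(self.changed_at_ms),
            }),
        }
    }
}
```

The elapsed time is what lets the control plane tell a host that is still booting from one that has been running for a long time and will not hand the flash back on its own.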
rmustacc changed the title from "Need to separate out active slot for writing from active slot for booting" to "Need to separate out host flash active slot for writing from active slot for booting" on Dec 16, 2024