Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pandad: reset safety mode on exit #32103

Draft
wants to merge 22 commits into
base: master
Choose a base branch
from
Draft

pandad: reset safety mode on exit #32103

wants to merge 22 commits into from

Conversation

adeebshihadeh
Copy link
Contributor

allows for restarting the device while onroad without faulting the car

selfdrive/boardd/boardd.cc Outdated Show resolved Hide resolved
@adeebshihadeh adeebshihadeh added this to the 0.9.7 milestone Apr 5, 2024
Copy link
Contributor

github-actions bot commented May 6, 2024

This PR has had no activity for 30 days. It will be automatically closed in 7 days if there is no activity.

@github-actions github-actions bot added the stale label May 6, 2024
Copy link
Contributor

This PR has been automatically closed due to inactivity. Feel free to re-open once activity resumes.

@github-actions github-actions bot closed this May 13, 2024
@adeebshihadeh adeebshihadeh reopened this May 13, 2024
@github-actions github-actions bot removed the stale label May 14, 2024
@adeebshihadeh adeebshihadeh marked this pull request as draft May 22, 2024 00:25
@sshane sshane force-pushed the boardd-no-fault branch from fce03c2 to 01b598e Compare May 30, 2024 09:11
@sshane sshane changed the title boardd: reset safety mode on exit pandad: reset safety mode on exit Jun 6, 2024
count = 0
first_run = True
params = Params()
no_internal_panda_count = 0

while True:
while not do_exit:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior matches pandad.cc's of looping until it gets a SIGTERM signal and then gracefully exiting.

Comment on lines 66 to 77
# override manager's immediate SystemExit to allow pandad to exit gracefully
def signal_handler(signum, frame):
nonlocal do_exit
do_exit = True
cloudlog.warning(f"caught signal {signum}, exiting")

do_exit = False
signal.signal(signal.SIGTERM, signal_handler)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When creating a new Python process with multiprocessing.Process, it inherits the parent's signal handler. For manager, that is to cause a SystemExit immediately, which SIGKILLs pandad.cc before it has a chance to react to the SIGTERM and close the relay.

# SystemExit on sigterm
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(1))

sshane added 3 commits June 6, 2024 03:58
old-commit-hash: 5094960
old-commit-hash: 5f24167
old-commit-hash: 1664113
@sshane sshane marked this pull request as ready for review June 6, 2024 11:09
@sshane
Copy link
Contributor

sshane commented Jun 6, 2024

@adeebshihadeh we have two options:

  • have manager send SIGTERM to all processes and wait a predetermined amount of time before issuing reboot (pandad.py would need to forward the signal to pandad.cc)
  • or change systemd service KillSignal to SIGTERM, followed up with a SIGKILL later on. pandad.cc is also seeing a SIGHUP despite the docs saying the default is to not send it, so we also need to handle that (just add one line in util.h)

@sshane sshane modified the milestones: 0.9.7, 0.9.8 Jun 6, 2024
@sshane sshane force-pushed the boardd-no-fault branch from f3fa8e8 to d07aeaa Compare June 8, 2024 04:13
@adeebshihadeh adeebshihadeh marked this pull request as draft June 12, 2024 17:21
Copy link
Contributor

This PR has had no activity for 14 days. It will be automatically closed in 3 days if there is no activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants