recovery: "mdb -k" should be at least basically functional #182

Open: 3 commits to be merged into base: master

citrus-it
Contributor

@citrus-it citrus-it commented Dec 17, 2024

This costs us around 3MiB, so it seems well worth it.

@bcantrill
Contributor

This looks great! Given the presence of the shared objects, it feels like the p tools + gcore are also reasonable to add?

A crude list:

gcore
pargs
pauxv
pcred
penv
pfiles
pflags
pgrep
pkill
pldd
plimit
pmap
ppriv
prctl
preap
prstat
prun
psig
pstack
pstop
ptime
ptree
pwdx

These add up to less than 1MiB, and it feels like it would give us some pretty rich tooling. If libdtrace is already there (somewhat unlikely?) then it's probably also worth throwing dtrace into the mix (but not if not).

@citrus-it
Contributor Author

We're still at 110MiB (was 107MiB before this PR) with those additional tools.
I haven't tested this yet, but will do that tomorrow.

@citrus-it
Contributor Author

With these changes, mdb -k passes a smoke test:

EVT22200006 # mdb -k
Loading modules: [ unix genunix specfs dtrace mac cpu.generic apix cpc mm zfs sata ip hook neti sockfs scsi_vhci ]
> ::stacks -m zfs
THREAD           STATE    SOBJ                COUNT
fffff7880a2a2c20 SLEEP    CV                      5
                 swtch+0x139
                 cv_wait+0x70
                 zthr_procedure+0x57
                 thread_start+0xb

fffff7880a187c20 SLEEP    CV                      2
                 swtch+0x139
                 cv_timedwait_hires+0xd7
                 cv_timedwait+0x52
                 txg_thread_wait+0x48
                 txg_sync_thread+0xf7
                 thread_start+0xb

gcore also works:

EVT22200006 # gcore $$
gcore: core.635 dumped
EVT22200006 # ls
core.635

@citrus-it citrus-it requested a review from bcantrill December 18, 2024 11:44
@papertigers

papertigers commented Dec 18, 2024

Does adding nvmeadm (libnvme) or maybe diskinfo instead add much more on? I know from yesterday's debugging call we did not have a good way to list the disks on the stuck system.

Edit:
Or maybe the right mdb modules to at least list those devices if it's cheaper.

@citrus-it
Contributor Author

> Does adding nvmeadm (libnvme) or maybe diskinfo instead add much more on? I know from yesterday's debugging call we did not have a good way to list the disks on the stuck system.

I've added both nvmeadm and diskinfo and confirmed they work. The final total cost of this PR is:

--rw-r--r--   1 andy     staff    113157199 Jan  8 17:04 /home/andy/installinator/zfs.img
+-rw-r--r--   1 andy     staff    116934599 Jan  8 16:58 /home/andy/installinator/zfs.img

which, at just under 4MiB, seems very reasonable for the additional tooling we get in case of problems: just under 30s of extra transfer time via IPCC.
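
For anyone wanting to re-check the arithmetic, the size delta can be recomputed from the two `ls` lines above. This is only a rough sketch: the byte counts are copied from the listing, the 30 s figure is the estimate quoted in this comment, and the implied throughput is derived from those two numbers rather than measured.

```python
# Byte counts copied from the two ls lines in the comment above.
before = 113157199
after = 116934599

# Growth of zfs.img caused by this PR.
delta_bytes = after - before
delta_mib = delta_bytes / (1024 * 1024)

# Assumption: 30 s is the upper bound quoted above for the extra transfer
# time, so the throughput derived here is a lower bound, not a measurement.
quoted_seconds = 30
implied_kib_per_s = delta_bytes / 1024 / quoted_seconds

print(f"delta: {delta_bytes} bytes ({delta_mib:.2f} MiB)")
print(f"implied IPCC throughput: at least {implied_kib_per_s:.0f} KiB/s")
```

The same calculation can be rerun with new byte counts if more tools are added to the image later.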

@citrus-it citrus-it requested review from jgallagher and leftwo January 8, 2025 17:10