gmake setup / helios-build doesn't rebuild pilot (in some cases?) #110

Open
jgallagher opened this issue Jul 27, 2023 · 1 comment

@jgallagher
Contributor

jgallagher commented Jul 27, 2023

I built an OS image on atrium (which is on heliosv2) today, using the omicron shell script, which wraps running gmake setup and then helios-build experiment-image .... After installing the OS on madrid, it had a hostname of unknown; the logs for the compliance/hostname service revealed the cause:

++ pilot gimlet info -i
ld.so.1: pilot: fatal: libssl.so.1.1: open failed: No such file or directory

Looking in my helios working directory, the pilot binary is quite old (~2 weeks) and predates the heliosv2 upgrade:

john@atrium ~/helios $ ls -l projects/pilot/target/release/pilot
-rwxr-xr-x   2 john     staff    47684680 Jul 12 19:06 projects/pilot/target/release/pilot
john@atrium ~/helios $ elfdump -d ./projects/pilot/target/release/pilot | grep ssl
       [1]  NEEDED            0x6c6dd4            libssl.so.1.1

I expect I can work around this by manually cleaning projects/pilot (or probably safer: blowing away my helios and starting fresh?).
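
A check along these lines could catch the problem before an image is installed. This is only a rough sketch: `needed_libs` and `missing_libs` are hypothetical helper names, the library directory in the usage comment is a placeholder, and looking in a single directory is a simplification of how the runtime linker actually resolves libraries (it ignores runpaths and the default search path).

```shell
#!/bin/sh
# Sketch only: helper names and the example path are assumptions, not part of helios.

# Print the library names from the NEEDED entries of `elfdump -d` output on stdin.
# Lines look like: "[1]  NEEDED  0x6c6dd4  libssl.so.1.1"
needed_libs() {
    awk '$2 == "NEEDED" { print $4 }'
}

# Read `elfdump -d` output on stdin and print each NEEDED library that
# does not exist in the directory given as $1.
missing_libs() {
    libdir=$1
    needed_libs | while read -r lib; do
        [ -e "$libdir/$lib" ] || echo "$lib"
    done
}

# Example (placeholder directory for the image's 64-bit libraries):
#   elfdump -d projects/pilot/target/release/pilot | missing_libs /path/to/image/root/usr/lib/64
```

Any output from `missing_libs` means the binary names a library that the target directory does not provide, which is the situation the `ld.so.1` error above reports at runtime.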

@jclulow
Collaborator

jclulow commented Jul 27, 2023

Yeah this is a somewhat unfortunate artefact of the way the OpenSSL dependency is determined, for at least two reasons:

  • we're building pilot today during gmake setup, which is appropriate for build tools that run on the build host like the image construction stuff, but not always appropriate for things that we ship to run on the target system; more concretely:
    • bad: if the shipped binary uses a private interface
    • bad: if the shipped binary uses a public, but unstable, interface, as is effectively what happened here with OpenSSL: we did a hard break from 1.1 to 3.0 and we dropped the old library to avoid accidentally continuing to use it, etc
    • ok: if the shipped binary uses only public interfaces, and the built image is using the same OS bits as the build machine, or will be using newer bits, then things are generally OK because we make strong backwards compatibility guarantees in the OS (it is hard to know at a glance if you are only using public interfaces, of course)
  • even if we were rebuilding pilot against the assembled ramdisk root (or at least a sysroot with analogous packages, including the headers and compilation links that we chuck out of the ramdisk itself, etc), I'm not sure cargo build would have noticed that the OpenSSL version changed, because I think it gets cached by the build.rs business in the crate with the bindings. This implies we would always have to cargo clean, which is pretty unfortunate as it would significantly increase the time taken to create an image
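
For illustration only, one way to avoid an unconditional cargo clean is to clean only when a recorded fingerprint changes. This is not what the thread goes on to propose. It is a sketch in which `needs_clean`, `clean_if_openssl_changed`, and the stamp file location are all hypothetical, and `openssl version` is only a stand-in fingerprint that may not match the library the crate actually links against.

```shell
#!/bin/sh
# Sketch only: function names and the stamp path are assumptions.

# Succeed (exit 0) if the stamp file $1 is missing or its contents differ from $2.
needs_clean() {
    stamp_file=$1
    current_value=$2
    [ ! -f "$stamp_file" ] || [ "$(cat "$stamp_file")" != "$current_value" ]
}

# Run `cargo clean` only when the recorded OpenSSL version has changed,
# then record the new value. Defined here but not called.
clean_if_openssl_changed() {
    stamp=target/.openssl-version    # hypothetical stamp location
    current=$(openssl version)       # stand-in fingerprint, see note above
    if needs_clean "$stamp" "$current"; then
        cargo clean
        mkdir -p target              # cargo clean removes target/, so recreate it
        printf '%s\n' "$current" > "$stamp"
    fi
}
```

This keeps the common case fast, but it only helps if the fingerprint really tracks what the bindings crate links against.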

Fortunately, I believe there is another way! For files that are built during the OS build, or for any number of other third party components (e.g., PostgreSQL libraries) we enumerate and record dependencies when packaging up the resultant files. This enables us to decouple parts of the build: we can build pilot binaries once, against the correct OpenSSL packages, and then publish them into the repository. When installed in the ramdisk, we'll pull in the correct OpenSSL, or fail to assemble the image because it cannot be pulled in. Another benefit is that gmake setup will take less time because we don't need to rebuild all the things all the time.
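
As a toy model of that "pull in or fail" behaviour (not how the packaging tools work internally): given the dependencies recorded for a package and a list of what the repository offers, assembly should stop when any dependency cannot be resolved. `check_deps` and the package names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Toy model only: check_deps and all package names are assumptions.

# Read required package names on stdin, one per line. $1 is a file listing
# available package names, one per line. Report each unresolved dependency
# on stderr and return non-zero if there were any.
check_deps() {
    available=$1
    status=0
    while read -r dep; do
        if ! grep -qxF "$dep" "$available"; then
            echo "unresolved dependency: $dep" >&2
            status=1
        fi
    done
    return $status
}

# Example (hypothetical names):
#   printf 'example/openssl-3\n' | check_deps available-packages.txt || echo "image assembly would fail"
```

The point is that a mismatch surfaces when the image is assembled, rather than as a runtime linker error on the installed system.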

I will look at moving the pilot build into something we can shove into the package repository and adjusting the process here. In the meantime, if you blow away your entire helios and start fresh, that will definitely get you into a better place after the Helios 2.0 switch, yes.

Apologies for the mess!
