-
-
Notifications
You must be signed in to change notification settings - Fork 320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
regex_matches baseline check fails due to rev-parse changes in git 2.47.0 #1622
Comments
Per actions/runner-images#10636, the `ubuntu-latest` GitHub Actions hosted runner is being migrated from `ubuntu-22.04` to `ubuntu-24.04`. The `ci.yml` workflow, including the full `test` job, use `ubuntu-latest`, and recently has switched over to using Ubuntu 24.04 runner. It makes sense to use this newer version, but that runner apparently does not preinstall the headers required to build with `-llzma`. That causes `just ci-test` to fail at `cargo nextest run -p gix-testtools --features xz` with `/usr/bin/ld: cannot find -llzma: No such file or directory`. This installs the `liblzma-dev` package that provides the required headers, so we can use Ubuntu 24.04 LTS while continuing to test the `xz` feature of `gix-testtools`. The failure this addresses was first observed in GitoxideLabs#1618 at 64e0f78: https://github.com/GitoxideLabs/gitoxide/actions/runs/11304216949/job/31442302018?pr=1618 But it is not caused by any of the changes there, and it now occurs whenever the `test` job is run, including if it is re-run at the tip of the main branch (or any other branch), such as in: https://github.com/EliahKagan/gitoxide/actions/runs/11317510844/job/31471040361 In both cases, expanding Set up job > Operating system shows Ubuntu 24.04.1 LTS. (The failure this addresses is also not related to GitoxideLabs#1622, which documents an unrelated failure that is possible to observe locally and that, if no change is made related to it, can be expected to occur on CI in the `test` starting sometime soon, but that is not yet seen on CI.)
Per actions/runner-images#10636, the `ubuntu-latest` GitHub Actions hosted runner label is being migrated from aliasing `ubuntu-22.04` to aliasing `ubuntu-24.04`. Our `ci.yml` workflow, which includes the full `test` job, uses `ubuntu-latest`, and recently has switched automatically to using an Ubuntu 24.04 runner. It makes sense to use this newer version, but that runner apparently does not preinstall the headers required to build with `-llzma`. That causes `just ci-test` to fail at `cargo nextest run -p gix-testtools --features xz` with `/usr/bin/ld: cannot find -llzma: No such file or directory`. This commit installs the `liblzma-dev` package that provides the required headers, so we can use Ubuntu 24.04 LTS while continuing to test the `xz` feature of `gix-testtools`. The failure this addresses was first observed in GitoxideLabs#1618 at 64e0f78: https://github.com/GitoxideLabs/gitoxide/actions/runs/11304216949/job/31442302018?pr=1618 But it is not caused by any of the changes there, and it now occurs whenever the `test` job is run, including if it is re-run at the tip of the main branch (or any other branch), such as in: https://github.com/EliahKagan/gitoxide/actions/runs/11317510844/job/31471040361 In both cases, expanding *Set up job* > *Operating system* shows Ubuntu 24.04.1 LTS. (The failure this addresses is also not related to GitoxideLabs#1622, which documents an unrelated failure that is possible to observe locally and that, if no change is made related to it, can be expected to occur on CI in the `test` starting sometime soon, but that is not yet seen on CI.)
Thanks a lot for the detailed report! It is my hope that once all the Windows tests are working reliably, the final step to mostly deterministic CI runs is to proactively test for changes on CI with Looking at the graph itself, it's notable that Git v2.46.2 find …but v2.47.0 finds This probably has to do with a change in the underlying traversal order, and I vaguely remember that traversing the first parents first can be done as performance optimisation. I think I have seen that in the commit-graph traversal employed by merge-base searches. On the other hand, I also wouldn't be surprised if this is an unintended side-effect of another change to the way Git performs traversals. Regarding the failing tests, it's obviously very dependent on the traversal order of the commit-graph even though that isn't what's under test. So I'd think that could be fixed by adjusting the commit messages to be unique enough to only allow one possible match. |
Isn't part of the purpose of these particular tests to check these kind of situations where traversal order matters? That is my impression because it seems like the comment here is documenting intent:
Or is the idea that we should no longer attempt to test that? |
That's indeed great evidence that this was intended all along. Thinking about it more, it does make sense as It's a bit strange to see this was changed so fundamentally in Then I don't know what to do, as I do find the previous behaviour more intuitive. Maybe we just keep this open and wait, maybe it disappears as sudden as it appeared. |
Do you also want macOS runs on CI with
I agree that we should not remove or weaken the tests and that waiting might lead to a further update to Git that reverses the behavior. It may be useful to prepare for CI breakage by making it so that the test suite automatically skips the affected test case when running on CI with git >=2.47.0. How best to go about that--and also when--may depend on the above macOS question. (I encountered this issue when working on something else in gitoxide that is conceptually unrelated, and I'm in no hurry to do anything potentially unnecessary on this issue, so if the best way forward right now is just to wait, that is fine by me.) |
Doing something depending on the version of Git could be interesting to It's a bit sad that once it's skipped, there is no way to easily learn if it still works - so maybe the skip should be implemented so that it will also test for the opposite case - "it's skipped, but the test actually succeeds, fix it". |
If you mean having gitoxide's own search behavior vary based on what version of Git is installed, I recommend against doing that by default, because:
These points relate indirectly (and in the case of the For the question of whether the gitoxide behavior related to this issue should match that of git 2.46.2 or git 2.47.0, I think that, unless this change goes away in a git 2.47.* patch release, we should eventually figure out what has caused the change, and whether it is intentional:
I can eventually look further into how this change has arisen in Git. If I still cannot figure it out, then I could inquire on the Git mailing list. This specific research is not my immediate main priority for gitoxide-related research, but it is something I am willing to do eventually, if the old behavior is not restored in the mean time by a patch release.
Yes, I was thinking of doing something like that soon. I think the subexpression that fails in this way--and the others that would fail if it didn't--can be extracted and modified so that they do the check without a panic. If not, then the baseline check implementation--which is also part of the test suite--could be modified to allow this. Then if the baseline checks triggered by the However:
|
No, that's (fortunately) not what I meant. The idea was to use this to selectively alter the expected outcome of the failing test, while providing this feature to
Thank you! I agree that there are more important things to do and that this regression can be taken care of later if it doesn't resolve itself.
I think so, too.
Let's wait, I think when this is implemented, if at all, decent solutions will be straightforward enough, and they can take different shapes independently of the first ideas shared here even. |
Sounds good. If I notice that CI is immediately about to fail for this reason, then I'll make note of that. Otherwise, I'll wait until I observe it does actually fail. (I suggest keeping this issue open until either there is a lasting change in gitoxide that fixes it or the changed Git behavior has gone away.) |
This temporarily suppresses baseline comparisons for two patterns where the baseline checks fail if the `complex_graph` fixture script was run in Git 2.47.0. For now, they are unconditionally suppressed, regardless of the Git version or whether generated archives are used. This change is specific to the `find_youngest_matching_commit::regex_matches` test. This works around, but does not fix, issue GitoxideLabs#1622. The affected test is still run, but the two directly affected baseline checks are skipped.
On the CI linux workflow, archived baseline aren't used - instead fixtures will be re-evaluated. As of Git 2.47, its behaviour changed which makes the following assertion fail. We decided to just ignore it until it's clear that this isn't a bug - obviously the traversal order changed. ---- This temporarily suppresses baseline comparisons for two patterns where the baseline checks fail if the `complex_graph` fixture script was run in Git 2.47.0. For now, they are unconditionally suppressed, regardless of the Git version or whether generated archives are used. This change is specific to the `find_youngest_matching_commit::regex_matches` test. This works around, but does not fix, issue #1622. The affected test is still run, but the two directly affected baseline checks are skipped. Co-authored-by: Eliah Kagan <[email protected]>
On the CI linux workflow, archived baseline aren't used - instead fixtures will be re-evaluated. As of Git 2.47, its behaviour changed which makes the following assertion fail. We decided to just ignore it until it's clear that this isn't a bug - obviously the traversal order changed. ---- This temporarily suppresses baseline comparisons for two patterns where the baseline checks fail if the `complex_graph` fixture script was run in Git 2.47.0. For now, they are unconditionally suppressed, regardless of the Git version or whether generated archives are used. This change is specific to the `find_youngest_matching_commit::regex_matches` test. This works around, but does not fix, issue #1622. The affected test is still run, but the two directly affected baseline checks are skipped. Co-authored-by: Eliah Kagan <[email protected]>
On the CI linux workflow, archived baseline aren't used - instead fixtures will be re-evaluated. As of Git 2.47, its behaviour changed which makes the following assertion fail. We decided to just ignore it until it's clear that this isn't a bug - obviously the traversal order changed. ---- This temporarily suppresses baseline comparisons for two patterns where the baseline checks fail if the `complex_graph` fixture script was run in Git 2.47.0. For now, they are unconditionally suppressed, regardless of the Git version or whether generated archives are used. This change is specific to the `find_youngest_matching_commit::regex_matches` test. This works around, but does not fix, issue #1622. The affected test is still run, but the two directly affected baseline checks are skipped. Co-authored-by: Eliah Kagan <[email protected]>
In the preceding two commits, the make_rev_spec_parse_repos fixture was modified to avoid giving extra executable permissions to a loose object file where they are not needed, and the affected fixture archive was regenerated. Though the permissions change is itself good and causes no problems, the overall change caused two problems, which are corrected here: 1. I had taken the opportunity to follow better practices when running commands in a shell script whose arguments are formed by parameter expansion: adding quoting where splitting and globbing is not intended but could in principle also be indicated; and preceding the argument formed this way with a `--` to designate it clearly as a non-option argument, since `chmod` follows the XBD Utility Syntax Guidelines, which include `--` recognition. While adding quoting was a good change (in this case, just for clarity that no expansions are intended), the way I added `--` created a new problem where none had existed. This is because I wrongly thought of it as separating non-filename arguments from filename arguments, which is incorrect: in `chmod`, a mode argument is neither an option or an operand to an option. Accordingly, only some implementations of `chmod` allow it to be placed after the mode. This commit corrects that by placing it before the mode argument instead, which is portable while still achieving the goal of establishing the argument after it as as never being meant to be interpreted as an option (regardless of whether the system's `chmod` recognizes options after non-option arguments). 2. Due to GitoxideLabs#1622, regenerating `make_rev_spec_parse_repos.tar` with Git 2.47.0 causes revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches to to fail on all systems with all versions of Git, whenever `GIX_TEST_IGNORE_ARCHIVES` is *not* set. This differs from the usual situation where it fails only when that *is* set and only when the available Git is >= 2.47.0. This causes the test to fail in the `test-fast` CI job, since the mitigation in GitoxideLabs#1635 for when the tests are detected to be running on CI deliberately covers only the `GIX_TEST_IGNORE_ARCHIVES` case. In the previous commit, I had regenerated that archive on an Ubuntu 24.04 LTS system with Git 2.47.0 installed from the git-core PPA, causing this problem. This commit regenerates the archive again on a macOS 15.0.1 system with Git 2.39.5 (Apple Git-154), using the command: TZ=UTC cargo nextest run --all --no-fail-fast All tests passed and the archive was successfully remade. I used `TZ=UTC` since I usually regenerate archives on a system whose time zone is configured to be UTC rather than local time, and more specifically because there is an unrelated bug (to be separately reported) causing an unrelated test to fail in some time zones in the two weeks that follow daylight saving time adjustments.
The causeAarni Koskela has found and reported a bug in Git that I am pretty sure is the cause of this issue:
Patrick Steinhardt has identified the cause of that bug and submitted a patch, noting:
The patch by Patrick Steinhardt is currently at v3. VerificationProcedureTo verify that the baseline test failure described in this gitoxide issue is due to that Git bug, I:
Detailed resultsFull output can be seen in this gist. The interesting portion, where the only difference is whether
Proposed planMy guess is that the Git bugfix may make it into 2.47.2. Once a fixed version of Git makes it into the GitHub Actions runners, the workaround in #1635 can be reverted. Most versions of Git will have been unaffected and this only affects the baseline tests (the behavior of gitoxide itself is not implicated), so nothing should need to remain in place to try to support affected Git versions. When the time comes, I will do a cursory check to see if it looks like any major operating systems have downstream versions of Git 2.47.0 or 2.47.1 (or others, if the fix comes in later than I expect) that have the bug and that may not be patched to fix it (for non-rolling releases, such a fix would not necessarily be integrated, since the bug is not severe or a vulnerability). If so, then maybe having a comment in the test will be worthwhile. Otherwise, even that should be unnecessary. |
Thanks again for staying on the pulse of the Git project to learn about the pending fix, and of course for validating ahead of time that it will indeed fix the issue. When looking at #1635 I was also reminded that you also predicted that this issue would happen in the first place - truly proactive! Now it would be good if one could monitor the 'bill of software' of all images to know when they update, but I also think it's enough to return to this on a timer.
I think a note to your comment here or similar hints that a patch is inbound would be useful. That way, even if reverting the changes is forgotten, there is a chance one will find the comment upon review. Please do feel free to submit a PR in any way you see fit. |
This updates the comment about GitoxideLabs#1622, including by adding links to Git mailing list posts by Aarni Koskela, who discovered the bug that turns out to be the cause of this, and Patrick Steinhardt, who analyzed the bug and wrote a fix (currently in testing), as well as GitoxideLabs#1622 (comment), which summarizes that and reports on its connection to GitoxideLabs#1622. This also narrows partial suppression of the failing test code (which consists of conditionally using `parse_spec_no_baseline` instead of `parse_spec` in some assertions) so that it is only done if Git is at one of the versions that is known to be affected. If any future Git versions are affected, such as by the currently cooking patch not being merged as soon as I expect, then this could potentially fail on CI again. But that is something we would probably want to find out about.
This builds on GitoxideLabs#1635 by: - Updating the comment about GitoxideLabs#1622, including by adding links to Git mailing list posts by Aarni Koskela, who discovered the bug that turns out to be the cause of this, and Patrick Steinhardt, who analyzed the bug and wrote a fix (currently in testing); and GitoxideLabs#1622 (comment), which summarizes that and reports on its connection to GitoxideLabs#1622. - Narrowing the partial suppression of the failing test code (which consists of conditionally using `parse_spec_no_baseline` instead of `parse_spec` in some assertions) so that it is only done if Git is at one of the versions that is known to be affected. If any future Git versions are affected, such as by the currently cooking patch not being merged as soon as I expect, then this could potentially fail on CI again. But that is something we would probably want to find out about.
This builds on GitoxideLabs#1635 by: - Updating the comment about GitoxideLabs#1622, including by adding links to Git mailing list posts by Aarni Koskela, who discovered the bug that turns out to be the cause of this, and Patrick Steinhardt, who analyzed the bug and wrote a fix (currently in testing); and GitoxideLabs#1622 (comment), which summarizes that and reports on its connection to GitoxideLabs#1622. - Narrowing the partial suppression of the failing test code (which consists of conditionally using `parse_spec_no_baseline` instead of `parse_spec` in some assertions) so that it is only done if Git is at one of the versions that is known to be affected. If any future Git versions are affected, such as by the currently cooking patch not being merged as soon as I expect, then this could potentially fail on CI again. But that is something we would probably want to find out about.
I'm not clear on why, but the failure doesn't seem to occur on Windows. Currently the only systems with Git versions in the affected range that we test on CI with It shouldn't be too hard to set something up to notify when a stable
I've opened #1719 for this. |
Thanks again - I was thinking in way too complicated ways there 😅! |
Update: See #1622 (comment) on the cause of this bug.
Current behavior 😯
When
git
is git 2.47.0 and tests are run withGIX_TEST_IGNORE_ARCHIVES=1
to force fixture scripts--and in particular themake_rev_spec_parse_repos.sh
script--to run, thefind_youngest_matching_commit::regex_matches
rev-spec test fails:More specifically:
As detailed below, the assertion that fails is not actually the assertion present directly in the test case function, but is instead an assertion present in a helper that checks against a baseline generated by git when running a fixture script.
I have observed the failures in Ubuntu 24.04 LTS with git 2.47.0 installed from the git-core PPA, in Arch Linux with git 2.47.0-1 (which is a downstream build of git 2.47.0 that is essentially the same), and in Arch Linux with manual builds of the upstream git source code at the 2.47.0 tag. The latter is the most valuable, because I also built the 2.46.2 tag to verify that the failure does not occur. Full details of most runs are in this gist. As shown there, I verified that there are no other failures, that there are no failures when
GIX_TEST_IGNORE_ARCHIVES=1
is not used, and that the failures are not triggered or otherwise affected by the changes in #1620.The cause
The failure is triggered by a change in behavior from git 2.46.2 to git 2.47.0, where the computed baseline values placed in
gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/baseline.git
differ. The behavior of gitoxide agrees with the behavior of git 2.46.2 but disagrees with the behavior of git 2.47.0. Running the test suite in two worktrees, both at the tip of gitoxide's main branch, anddiff
ing the generatedbaseline.git
file in each of the generated fixture directories, reveals:This is from:
gitoxide/gix/tests/fixtures/make_rev_spec_parse_repos.sh
Lines 351 to 354 in 6487269
Where
baseline
is defined as:gitoxide/gix/tests/fixtures/make_rev_spec_parse_repos.sh
Lines 5 to 11 in 6487269
The change is due to different behavior of
git rev-parse
from git 2.46.2 to git 2.47.0, causing the expectations represented and commented there about which matching item it selected no longer to hold. The change does not seem to relate to any differences in how the different versions of git actually create the fixture repositories. This is shown by running it on both fixture repositories, after running tests in thegitoxide-with-git-2.46.2
andgitoxide-with-git-2.47.0
worktrees in the manner implied by those worktrees' names.Thus, whichever version of git is used to produce the fixture repositories, the effect of
git rev-parse
varies solely by which version of git is used to rungit rev-parse
itself.To show that the test failure is really happening in the baseline comparison and not elsewhere, consider the test failure output with a backtrace:
The
regex_matches
test has multiple assertions, but the failure happens when evaluating an expression to be used in the first assertion:gitoxide/gix/tests/gix/revision/spec/from_bytes/regex.rs
Lines 93 to 101 in 6487269
That assertion is not really what is failing. Rather, failure occurs in an assertion in the implementation of
parse_spec
inutil.rs
.parse_spec
callsparse_spec_opts
, which callscompare_with_baseline
. Becauseparse_spec_opts
passesBaselineExpectation::Same
as theexpectation
parameter ofcompare_with_baseline
, this assertion runs:gitoxide/gix/tests/gix/revision/spec/from_bytes/util.rs
Lines 179 to 185 in 6487269
That assertion is what is actually failing.
Expected behavior 🤔
All tests should pass even when run with
GIX_TEST_IGNORE_ARCHIVES=1
, but I am not sure whether the old or new rev-parse behavior should be preferred. If the new behavior is preferred then I think the behavior of gitoxide will also have to change, and in this case, something should probably be done so thatGIX_TEST_IGNORE_ARCHIVES=1
can be used even with older versions of git.I don't know if the new git behavior is intentional, and if not then this might be considered a bug in git rather than in gitoxide or in gitoxide's test suite. See below regarding the specific git behavior involved here.
To the best of my knowledge, this is the only case where baseline assertions checked in the gitoxide test suite fail outside of Windows. On Windows, we do have such failures, comprising at least some of the failures reported in #1358; see also the comment discussion in #1345, especially at and below #1345 (comment). But as far as I know there is no relationship, conceptual or otherwise, between the failure here and any of the failures in #1358, other than both being related to baseline assertions. Furthermore, the failures there are probably lower priority, because we do not currently ever commit archives generated by running the test suite on Windows with
GIX_TEST_IGNORE_ARCHIVES=1
, and because CI does not run tests that way.In contrast, this issue will eventually cause the
test
job on CI to fail, once the GitHub-hostedubuntu-latest
runner has git 2.47.0. Going by the details there, some macOS images already have git 2.47.0, but no others, yet.Please note that this is not related to #1623, which fixes an unrelated failure in the CI
test
job that is already occurring.Git behavior
The underlying cause here is a change in git behavior. In the hope that it will help identify the specific change, this section examines the specific differences corresponding to the
baseline.git
change shown above:To investigate, I ran:
As expected, the logs
2.46.2
and2.46.2-cross
have the same contents, as do the logs2.47.0
and2.47.0-cross
. The reason I expect this is that, as noted above, I think the only behavioral difference is in thegit rev-parse
commands and that the generated fixture repositories are (aside from theirbaseline.git
files shownrev-parse
results) equivalent in all relevant respects.Here is the content of the
2.46.2
log:In contrast, here is the content of the
2.47.0
log:Steps to reproduce 🕹
Most of the details are covered above, and the exact commands run for all tests can be seen in the gist.
How I built Git
I built, installed, and tested git 2.46.2 and git 2.47.0 on Arch Linux by running:
I could have done
test
andinstall
in the other order (andtest
presumably didn't use theprefix
value in any way). No unexpected failures occurred (i.e., as expected, some tests were reported as broken, but no tests had a status reported as failing).In hindsight, I should have done
gh repo clone git/git -- --recurse-submodules
to get thesha1collisiondetection
submodule. But I believe that makes no difference here: the tests do not seem to relate to the distinction between SHA1 and Hardened SHA1. Furthermore the failures only occur with 2.47.0, not with 2.46.2. Finally, I observe the failure with 2.47.0 when installed from the Arch Linux repositories on this system, as well as when installed from the git-core PPA on a separate Ubuntu 24.04.1 LTS system; I believe both of those builds do have hardened SHA-1.How I tested gitoxide - full runs
Aside from some ad-hoc experiments running the full test suite on other systems to ensure that what I was observing was not due to the system being Arch Linux or to details of how I things up there, I ran the full test suite and varied three things, testing twelve combinations:
main
(6487269) vs. before #1620 was merged (37c1e4c, calledold-main
in the gist)GIX_TEST_IGNORE_ARCHIVES
vs. definingGIX_TEST_IGNORE_ARCHIVES=1
In the gist, logs described as individual runs are actually runs both without, and then with,
GIX_TEST_IGNORE_ARCHIVES=1
.I always ran
gix clean -xde
between runs (including between not definingGIX_TEST_IGNORE_ARCHIVES
and defining it as1
).I ran the test suite with
cargo nextest run --all --no-fail-fast
, varying thegit
command used by prefixingPATH="$HOME/git-2.46.2/bin:$PATH"
,PATH="$HOME/git-2.47.0/bin:$PATH"
, or (for the downstream git 2.47.0-1) no such prefix. For example:PATH="$HOME/git-2.47.0/bin:$PATH" GIX_TEST_IGNORE_ARCHIVES=1 cargo nextest run --all --no-fail-fast
How I tested gitoxide - the specific failing test
To verify that the effect was not due to the interaction of multiple tests, I ran the specific failing test as well, including after running
gix clean -xde
. This is further detailed in the gist and it is also how I produced the output showing the backtrace. Specifically, the backtrace shown above was produced with:cd gix/tests/gix/revision/spec/from_bytes GIX_TEST_IGNORE_ARCHIVES=1 RUST_BACKTRACE=1 cargo nextest run regex_matches
(That used the downstream 2.47.0-1--which behaves the same as 2.47.0, at least with respect to what this issue investigates--and thus omits a
PATH=...
prefix.)The text was updated successfully, but these errors were encountered: