-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to preserve file attributes #106
base: main
Are you sure you want to change the base?
Conversation
added aditional security check adjusted architecture specifics adjusted system specific implementation create system specific path
4eb99d0
to
0405d3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love the accidental feature support because of test. 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Jan, I've not completed the whole review yet, but have some intermediate questions/comments so submitting this now and will continue the review later/tomorrow.
Rely on built-in `unix.Timeval` conversion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Jan, this is looking great, a few comments/suggestions below, the only blocker I think is the mtime/atime test cases comment below. Would be happy to pair with you on a call to talk in more detail if that helps, thanks.
Co-authored-by: Sam Salisbury <[email protected]>
…d unix specific tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, I missed something here so I will request changes to dismiss the approval for now!
…hip (zip/rar/7zip) Added various test cases that verifies the behaviour.
…permissions) by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Jan, a few more notes below, I still haven't reviewed the tests in detail, but I will try to get that done tomorrow!
target_disk_unix.go
Outdated
if os.Geteuid() != 0 { | ||
return &fs.PathError{Op: "chown", Path: name, Err: os.ErrPermission} | ||
if err := os.Lchown(name, uid, gid); err != nil { | ||
return &fs.PathError{Op: "chown", Path: name, Err: err} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this return the error? Maybe...
return &fs.PathError{Op: "chown", Path: name, Err: err} | |
return fmt.Errorf("chown failed: %w", err) |
Also it would be good to have a test case for this error to check we don't silently fail when we shouldn't. You could force the error by using a nonexistent path maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhm ... I thought it might be nice to wrap the error to have more structure/context, e.g., path/op, but happy to just return the error "straighter".
I will add an explicit test case to verify the error edge-case.
Edit:
The code path is already tested in TestUnpackWithPreserveOwnershipAsNonRoot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, when I read this diff the first time, it looked like it wasn't returning any error at all, now when I see it it seems as though the error was being returned. Sorry for the confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code path is already tested in TestUnpackWithPreserveOwnershipAsNonRoot.
Looking at those test cases, I can't see one where we actively expect an error, so I don't think the coverage is explicit enough to ensure we don't break this behaviour in the future. I will submit a patch to add explicit coverage, you can decide how to use it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, that sounds great. Looking forward to your patch! let me know when we can sync on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Jan, still catching up after the break, I can't remember if I created this patch or not, I think I stopped because I realised there were other issues with tests that should be addressed as well. I am going to re-familiarise myself with this PR today, and will let you know once I'm back up to speed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Sam!
Co-authored-by: Sam Salisbury <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Jan, thanks for responding to the comments/questions so far.
One thing that occurs to me is that if the extractor currently drops all file attributes, then there is a possibility that this is a breaking change for some users, although I think this is unlikely, so I don't think we need to be very concerned. I wonder if it would be a good time to start a changelog to document updates for users to make sure they aren't surprised by the new behaviour?
I think it would be good to spend a little more time on this before merging, to make sure the tests are covering everything we need, and possibly to refactor them a little as well to separate out tests for errors vs happy paths, and remove some of the branching logic, as I think that would make them more readable. Does that sound ok to you? This is probably safe to release as-is, but my concern is that I don't feel confident in the tests yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a marker review for the latest changes, thanks Jan, still working through it as a whole.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK Jan, I left some more comments around testing. I hope these are helpful, I will be available on Slack if you want to follow up quickly with any of these. I think once these are addressed we'll be good to go!
unpack_test.go
Outdated
uid int | ||
gid int | ||
}{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the contents[]
entries contain UID/GID info, what is the meaning of them at the level of the whole test case? Will it always be the case that all files in an archive will have the same UID/GID pair? My instinct is to either specify this separately for each file, or just specify it for all files at the level of the test case, either of these will make the tests easier to understand I think. If there is a reason to have these at test case level as well then I think this should be documented to avoid future confusion, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, yes, you are right. The definition on that level can be dropped. I will work on something that will hopefully reduce the confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 6c8d47a
unpack_unix_test.go
Outdated
// compare archive uid/gid with current user | ||
archiveEntriesEqualCurrentOwner := tc.doesNotSupportOwner | ||
|
||
// If the archive supports owner information, check if the current user is the owner of the test data | ||
if !tc.doesNotSupportOwner && (os.Getuid() != tc.uid || os.Getgid() != tc.gid) { | ||
archiveEntriesEqualCurrentOwner = false | ||
} | ||
|
||
// Unpack should fail if the user is not root and the uid/gid | ||
// in the archive is different from the current user (only | ||
// if the archive supports owner information) | ||
err := extract.Unpack(ctx, dst, src, cfg) | ||
|
||
// chown will only fail if the user is not root and the uid/gid is different | ||
if archiveEntriesEqualCurrentOwner && err != nil { | ||
t.Fatalf("error unpacking archive: %v", err) | ||
} | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will all entries in an archive always have the same uid/gid pair? This seems unlikely to be the case in general.
For more clarity, it would be great if we could set this test up so that the UID/GID
pair was a non-match for the current user (as one test case), and then as a match for the current user (as a separate test case). This would remove the logic and ensure that each test is testing the same thing each time it is run. Right now, this test is testing two different things depending on local conditions.
It might be easier at first to set this up optimisitically by picking an "unlikely" uid/gid pair for the case that they are different, and fail the test early on if they do accidentally match the current process. And it may not be necessary to explicitly test the case where the uid/gid does match, what we really want is to trigger the error when unpack fails due to the failed chown, so maybe just that one test case would be enough.
This is related to #106 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100%, I understand what you mean. I am going to work on an adjustment of the tests that adresses your thoughts and removing complexity of the test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I simplified the test function and introduced a helper function (containsDifferentOwnerThenCurrentUser
) so that we can check if the test is expected to fail or not.
Changes are made in edfd38f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhm ... I dropped the test case with the current user id and added another one with an invalid uid
/gid
. Point is that Unix supports 32bit uid
/gid
since kernel version 2.4 (ref) so, -1
would not be a really invalid value for chown
.
However, the helper function from ^ is dropped again and the other changes are made in 064184a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, rolling back on the -1
test-case, -1
is an ivalide value or more a value thats ignored by chown
according to https://github.com/hashicorp/go-extract/actions/runs/12672763343/job/35317521527?pr=106#step:5:10. The sad point is that the call of target.Chown
does not fail, but the uid
/gid
has not been adjusted :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}) | ||
} | ||
} | ||
func TestUnpackWithPreserveOwnershipAsRoot(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to the comment above, we probably want to make sure that the uid/gid pair of at least one arhive entry in this test case does not match the current process, or it would pass trivially in that case. You can probably add a single new test case to the slice of cases with some "unusual" uid/gid pair...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I adressed this comment above in a certain way. uid
/gid
can be 32bit (#106 (comment)) and so -1
is unexpected, but still a valid 32bit integer. Not sure how we can create a more "unusual" ownership.
However, I think I addressed this request in 064184a and 6c8d47a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha, rolling back on ^ uid=-1
is not applied by chown
and fails. However, figured out in #106 (comment) a way to test invalid uid/gid.
Co-authored-by: Sam Salisbury <[email protected]>
…ery test-case need a chown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Jan, thanks, this looks better with the test case level uid/gid removed, the error checking logic doesn't quite look right atm in the tests. I wonder if we can work more closely together tomorrow morning to get this done, I don't think it is far away from being ready at all!
if err == nil { | ||
t.Fatalf("error unpacking archive: %v", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be if err != nil
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the comment above states:
go-extract/unpack_unix_test.go
Lines 81 to 82 in 5c74796
// chown will only fail if the user is not root and the | |
// uid/gid in the archive is different from the current user |
The call of disk.Chown()
as a non-root user will fail, bc/ of missing permissions. Only root
is capable of changing the ownership. Due to the fact that the syscall for chown
is performed as non-root, an error is expected. If that error is not thrown, that would be unexpected.
The test will be skipped for the user root
go-extract/unpack_unix_test.go
Lines 55 to 59 in 5c74796
func TestUnpackWithPreserveOwnershipAsNonRoot(t *testing.T) { | |
if os.Getuid() == 0 { | |
t.Skip("test requires non-root privileges") | |
} |
unpack_unix_test.go
Outdated
if !tc.doesNotSupportOwner && err == nil { | ||
t.Fatalf("error unpacking archive: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be != nil
? As the log message references the error...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is sadly made to an out-of-date state of this PR and already adjusted. The ask for the err == nil
is addressed in #106 (comment).
The check for tc.doesNotSupportOwner
is moved to the top of the test-case:
go-extract/unpack_unix_test.go
Lines 61 to 67 in 5c74796
for _, tc := range testCases { | |
t.Run(tc.name, func(t *testing.T) { | |
// skip test if the archive does not store ownership information | |
if tc.doesNotSupportOwner { | |
t.Skipf("archive %s does not store ownership information", tc.name) | |
} |
Background: If an archive format does not contain ownership details, we can skip the archive format, bc/ the test for failing chown
will never fail, bc/ extracted files will always be the uid
/gid
of the process and the archive entries will for, e.g. zip
(which does not contain file ownership information), will return always the current uid
/gid
(ref)
Invent option to preserver fle attributes, so that go-extract can be adopted in other extraction libraries that offers this functionality.
This PR invents: