Add option to preserve file attributes #106

NodyHub · 2024-12-09T05:34:02Z

Invent option to preserver fle attributes, so that go-extract can be adopted in other extraction libraries that offers this functionality.

This PR invents:

Preserve file attributes, access-time, mod-time, file permissions
Preserve file ownership for tar archives
Added symlink support for 7zip (added by accident while writing test cases)
Invented additional security check before file creation (added for the sake of completeness, even if in the currently feature set not necessary)
Increased the max files default value. 1k is a bit annoying on the cli.

added aditional security check adjusted architecture specifics adjusted system specific implementation create system specific path

alanpchua

Love the accidental feature support because of test. 😆

samsalisbury

Hey Jan, I've not completed the whole review yet, but have some intermediate questions/comments so submitting this now and will continue the review later/tomorrow.

target_disk_darwin.go

Rely on built-in `unix.Timeval` conversion

samsalisbury

Hey Jan, this is looking great, a few comments/suggestions below, the only blocker I think is the mtime/atime test cases comment below. Would be happy to pair with you on a call to talk in more detail if that helps, thanks.

target_memory_test.go

unpack_test.go

extractor.go

target_disk_unix.go

target_disk_others.go

Co-authored-by: Sam Salisbury <[email protected]>

…d unix specific tests

samsalisbury

Looks great

samsalisbury

Hey, I missed something here so I will request changes to dismiss the approval for now!

…hip (zip/rar/7zip) Added various test cases that verifies the behaviour.

…permissions) by default

…tready results

samsalisbury

Hi Jan, a few more notes below, I still haven't reviewed the tests in detail, but I will try to get that done tomorrow!

README.md

config.go

target.go

samsalisbury · 2024-12-18T00:23:22Z

target_disk_unix.go

-	if os.Geteuid() != 0 {
-		return &fs.PathError{Op: "chown", Path: name, Err: os.ErrPermission}
+	if err := os.Lchown(name, uid, gid); err != nil {
+		return &fs.PathError{Op: "chown", Path: name, Err: err}


Should this return the error? Maybe...

Suggested change

return &fs.PathError{Op: "chown", Path: name, Err: err}

return fmt.Errorf("chown failed: %w", err)

Also it would be good to have a test case for this error to check we don't silently fail when we shouldn't. You could force the error by using a nonexistent path maybe?

Mhm ... I thought it might be nice to wrap the error to have more structure/context, e.g., path/op, but happy to just return the error "straighter".

I will add an explicit test case to verify the error edge-case.

Edit:
The code path is already tested in TestUnpackWithPreserveOwnershipAsNonRoot.

Ah, when I read this diff the first time, it looked like it wasn't returning any error at all, now when I see it it seems as though the error was being returned. Sorry for the confusion.

The code path is already tested in TestUnpackWithPreserveOwnershipAsNonRoot.

Looking at those test cases, I can't see one where we actively expect an error, so I don't think the coverage is explicit enough to ensure we don't break this behaviour in the future. I will submit a patch to add explicit coverage, you can decide how to use it :)

Hey, that sounds great. Looking forward to your patch! let me know when we can sync on it.

Hey Jan, still catching up after the break, I can't remember if I created this patch or not, I think I stopped because I realised there were other issues with tests that should be addressed as well. I am going to re-familiarise myself with this PR today, and will let you know once I'm back up to speed!

Thank you Sam!

Co-authored-by: Sam Salisbury <[email protected]>

…s more intuitive

samsalisbury

Hi Jan, thanks for responding to the comments/questions so far.

One thing that occurs to me is that if the extractor currently drops all file attributes, then there is a possibility that this is a breaking change for some users, although I think this is unlikely, so I don't think we need to be very concerned. I wonder if it would be a good time to start a changelog to document updates for users to make sure they aren't surprised by the new behaviour?

I think it would be good to spend a little more time on this before merging, to make sure the tests are covering everything we need, and possibly to refactor them a little as well to separate out tests for errors vs happy paths, and remove some of the branching logic, as I think that would make them more readable. Does that sound ok to you? This is probably safe to release as-is, but my concern is that I don't feel confident in the tests yet.

unpack_other_test.go

samsalisbury

Just a marker review for the latest changes, thanks Jan, still working through it as a whole.

samsalisbury

OK Jan, I left some more comments around testing. I hope these are helpful, I will be available on Slack if you want to follow up quickly with any of these. I think once these are addressed we'll be good to go!

samsalisbury · 2025-01-07T13:43:58Z

unpack_test.go

+	uid                   int
+	gid                   int
+}{


Given that the contents[] entries contain UID/GID info, what is the meaning of them at the level of the whole test case? Will it always be the case that all files in an archive will have the same UID/GID pair? My instinct is to either specify this separately for each file, or just specify it for all files at the level of the test case, either of these will make the tests easier to understand I think. If there is a reason to have these at test case level as well then I think this should be documented to avoid future confusion, thanks.

Uh, yes, you are right. The definition on that level can be dropped. I will work on something that will hopefully reduce the confusion.

Addressed in 6c8d47a

samsalisbury · 2025-01-07T13:51:16Z

unpack_unix_test.go

+			// compare archive uid/gid with current user
+			archiveEntriesEqualCurrentOwner := tc.doesNotSupportOwner
+
+			// If the archive supports owner information, check if the current user is the owner of the test data
+			if !tc.doesNotSupportOwner && (os.Getuid() != tc.uid || os.Getgid() != tc.gid) {
+				archiveEntriesEqualCurrentOwner = false
+			}
+
+			// Unpack should fail if the user is not root and the uid/gid
+			// in the archive is different from the current user (only
+			// if the archive supports owner information)
+			err := extract.Unpack(ctx, dst, src, cfg)
+
+			// chown will only fail if the user is not root and the uid/gid is different
+			if archiveEntriesEqualCurrentOwner && err != nil {
+				t.Fatalf("error unpacking archive: %v", err)
+			}
+		})


Will all entries in an archive always have the same uid/gid pair? This seems unlikely to be the case in general.

For more clarity, it would be great if we could set this test up so that the UID/GID pair was a non-match for the current user (as one test case), and then as a match for the current user (as a separate test case). This would remove the logic and ensure that each test is testing the same thing each time it is run. Right now, this test is testing two different things depending on local conditions.

It might be easier at first to set this up optimisitically by picking an "unlikely" uid/gid pair for the case that they are different, and fail the test early on if they do accidentally match the current process. And it may not be necessary to explicitly test the case where the uid/gid does match, what we really want is to trigger the error when unpack fails due to the failed chown, so maybe just that one test case would be enough.

This is related to #106 (comment)

100%, I understand what you mean. I am going to work on an adjustment of the tests that adresses your thoughts and removing complexity of the test cases.

I simplified the test function and introduced a helper function (containsDifferentOwnerThenCurrentUser) so that we can check if the test is expected to fail or not.

Changes are made in edfd38f

Mhm ... I dropped the test case with the current user id and added another one with an invalid uid/gid. Point is that Unix supports 32bit uid/gid since kernel version 2.4 (ref) so, -1 would not be a really invalid value for chown.

However, the helper function from ^ is dropped again and the other changes are made in 064184a

Uh, rolling back on the -1 test-case, -1 is an ivalide value or more a value thats ignored by chown according to https://github.com/hashicorp/go-extract/actions/runs/12672763343/job/35317521527?pr=106#step:5:10. The sad point is that the call of target.Chown does not fail, but the uid/gid has not been adjusted :(

Okay, I adressed that point in 45cb851 and 5c74796. Should be good to go.

samsalisbury · 2025-01-07T13:53:25Z

unpack_unix_test.go

+		})
+	}
+}
+func TestUnpackWithPreserveOwnershipAsRoot(t *testing.T) {


Related to the comment above, we probably want to make sure that the uid/gid pair of at least one arhive entry in this test case does not match the current process, or it would pass trivially in that case. You can probably add a single new test case to the slice of cases with some "unusual" uid/gid pair...

I adressed this comment above in a certain way. uid/gid can be 32bit (#106 (comment)) and so -1 is unexpected, but still a valid 32bit integer. Not sure how we can create a more "unusual" ownership.

However, I think I addressed this request in 064184a and 6c8d47a

Ha, rolling back on ^ uid=-1 is not applied by chown and fails. However, figured out in #106 (comment) a way to test invalid uid/gid.

unpack_unix_test.go

Co-authored-by: Sam Salisbury <[email protected]>

…ery test-case need a chown

samsalisbury

Hi Jan, thanks, this looks better with the test case level uid/gid removed, the error checking logic doesn't quite look right atm in the tests. I wonder if we can work more closely together tomorrow morning to get this done, I don't think it is far away from being ready at all!

samsalisbury · 2025-01-08T16:01:55Z

unpack_unix_test.go

+			if err == nil {
 				t.Fatalf("error unpacking archive: %v", err)
 			}


Should this be if err != nil?

As the comment above states:

go-extract/unpack_unix_test.go

Lines 81 to 82 in 5c74796

// chown will only fail if the user is not root and the

// uid/gid in the archive is different from the current user

The call of disk.Chown() as a non-root user will fail, bc/ of missing permissions. Only root is capable of changing the ownership. Due to the fact that the syscall for chown is performed as non-root, an error is expected. If that error is not thrown, that would be unexpected.

The test will be skipped for the user root

go-extract/unpack_unix_test.go

Lines 55 to 59 in 5c74796

func TestUnpackWithPreserveOwnershipAsNonRoot(t *testing.T) {

if os.Getuid() == 0 {

t.Skip("test requires non-root privileges")

}

samsalisbury · 2025-01-08T17:03:19Z

unpack_unix_test.go

+			if !tc.doesNotSupportOwner && err == nil {
 				t.Fatalf("error unpacking archive: %v", err)


Should this be != nil ? As the log message references the error...

This comment is sadly made to an out-of-date state of this PR and already adjusted. The ask for the err == nil is addressed in #106 (comment).

The check for tc.doesNotSupportOwner is moved to the top of the test-case:

go-extract/unpack_unix_test.go

Lines 61 to 67 in 5c74796

for _, tc := range testCases {

t.Run(tc.name, func(t *testing.T) {

// skip test if the archive does not store ownership information

if tc.doesNotSupportOwner {

t.Skipf("archive %s does not store ownership information", tc.name)

}

Background: If an archive format does not contain ownership details, we can skip the archive format, bc/ the test for failing chown will never fail, bc/ extracted files will always be the uid/gid of the process and the archive entries will for, e.g. zip (which does not contain file ownership information), will return always the current uid/gid (ref)

NodyHub added 2 commits December 5, 2024 12:04

invented option to preserve filemode

8dbb570

added support for updating ownership, file permissions and change time

49b513b

NodyHub linked an issue Dec 9, 2024 that may be closed by this pull request

Feature: Preserve file attributes #78

Open

NodyHub added 2 commits December 9, 2024 06:58

code clean up

1605a68

Further adjustments

0405d3f

added aditional security check adjusted architecture specifics adjusted system specific implementation create system specific path

NodyHub force-pushed the nodyhub/invent-option-preserver-attributes branch from 4eb99d0 to 0405d3f Compare December 9, 2024 11:35

NodyHub marked this pull request as ready for review December 9, 2024 13:52

NodyHub requested a review from a team as a code owner December 9, 2024 13:52

adjusted cli help output

576854a

alanpchua previously approved these changes Dec 9, 2024

View reviewed changes

NodyHub changed the title ~~Invent option preserver attributes~~ Add option to preserve file attributes Dec 9, 2024

NodyHub requested a review from samsalisbury December 9, 2024 14:50

samsalisbury reviewed Dec 9, 2024

View reviewed changes

target_disk_darwin.go Outdated Show resolved Hide resolved

target_disk_darwin.go Outdated Show resolved Hide resolved

adjusted build tag specification

2160ffe

NodyHub dismissed alanpchua’s stale review via 2160ffe December 10, 2024 05:15

NodyHub and others added 5 commits December 10, 2024 10:38

corrected mod time adjustment

4c99ad2

removed code comment and fixed timestamp of archive file

fd3a385

rely on built-in unix.Timeval conversion

1ded861

Merge pull request #109 from hashicorp/simplify-unix-time-handling

593b990

Rely on built-in `unix.Timeval` conversion

unexpose internal functions

03a7cb6

samsalisbury reviewed Dec 11, 2024

View reviewed changes

NodyHub and others added 7 commits December 11, 2024 13:52

Update target_memory_test.go

fb7132a

Co-authored-by: Sam Salisbury <[email protected]>

check for differences >= 1 micro second

f9b2710

simplified test print statement

9e69c9d

invent testing as root to adjust ownership

7fa9d28

transformed canMaintainSymlinkTimestamps into a function and relocate…

8498e88

…d unix specific tests

skip sudo test on windows

900c4d0

added some error handling

96659b8

samsalisbury approved these changes Dec 12, 2024

View reviewed changes

samsalisbury requested changes Dec 12, 2024

View reviewed changes

NodyHub added 9 commits December 17, 2024 10:01

Return current Uid/Gid if archive does not support carriage of owners…

e1ce17d

…hip (zip/rar/7zip) Added various test cases that verifies the behaviour.

inverted logic to preserve file attributes (mod/access time and file …

03a9e84

…permissions) by default

removed windows specifc code

f5b7936

adjusted test cases file split

ccf382e

adjusted test output log

111a61f

drop write permissions on directory if no access is granted

b4c67e9

drop permission check for directories on windows, bc/ they create uns…

7671e15

…tready results

updated test case comment

879edcd

clarified test in comment

450aba3

NodyHub requested a review from samsalisbury December 17, 2024 11:02

updated readme

3c7c310

samsalisbury reviewed Dec 18, 2024

View reviewed changes

NodyHub and others added 5 commits December 18, 2024 10:02

renamed config.noPreserveFileAttributes to config.dropFileAttributes

5ff6279

Update target_disk_unix.go

668a07e

Co-authored-by: Sam Salisbury <[email protected]>

fixed import

49171eb

adjusted short version of flag PreserveOwner from -o to -p , bc/ it i…

9d10a84

…s more intuitive

adjusted comments

5580e05

NodyHub requested a review from samsalisbury December 18, 2024 09:55

samsalisbury reviewed Dec 18, 2024

View reviewed changes

unpack_other_test.go Outdated Show resolved Hide resolved

removed unused field from test

bf3804a

samsalisbury reviewed Jan 7, 2025

View reviewed changes

NodyHub and others added 6 commits January 8, 2025 12:55

Update unpack_unix_test.go

f59ea35

Co-authored-by: Sam Salisbury <[email protected]>

dropped uid/gid from upper test case level and aligned the test case

6c8d47a

simplified chown check

edfd38f

adjusted the current-user-tar test-case to invalid-uid-tar so that ev…

064184a

…ery test-case need a chown

adjusted test for invalid uid/gid

45cb851

adjusted uid casting

5c74796

NodyHub requested a review from samsalisbury January 8, 2025 15:27

samsalisbury reviewed Jan 8, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add option to preserve file attributes #106

Add option to preserve file attributes #106

NodyHub commented Dec 9, 2024 •

edited

Loading

alanpchua left a comment

samsalisbury left a comment

samsalisbury left a comment

samsalisbury left a comment

samsalisbury left a comment

samsalisbury left a comment

samsalisbury Dec 18, 2024

NodyHub Dec 18, 2024 •

edited

Loading

samsalisbury Dec 18, 2024

samsalisbury Dec 18, 2024

NodyHub Jan 6, 2025

samsalisbury Jan 7, 2025

NodyHub Jan 7, 2025

samsalisbury left a comment

samsalisbury left a comment

samsalisbury left a comment

samsalisbury Jan 7, 2025

NodyHub Jan 8, 2025

NodyHub Jan 8, 2025

samsalisbury Jan 7, 2025

NodyHub Jan 8, 2025

NodyHub Jan 8, 2025

NodyHub Jan 8, 2025

NodyHub Jan 8, 2025 •

edited

Loading

NodyHub Jan 8, 2025

samsalisbury Jan 7, 2025 •

edited

Loading

NodyHub Jan 8, 2025

NodyHub Jan 8, 2025

samsalisbury left a comment

samsalisbury Jan 8, 2025

NodyHub Jan 9, 2025

samsalisbury Jan 8, 2025

NodyHub Jan 9, 2025

	return &fs.PathError{Op: "chown", Path: name, Err: err}
	return fmt.Errorf("chown failed: %w", err)

	// chown will only fail if the user is not root and the
	// uid/gid in the archive is different from the current user

	func TestUnpackWithPreserveOwnershipAsNonRoot(t *testing.T) {

	if os.Getuid() == 0 {
	t.Skip("test requires non-root privileges")
	}

		if !tc.doesNotSupportOwner && err == nil {
		t.Fatalf("error unpacking archive: %v", err)

	for _, tc := range testCases {
	t.Run(tc.name, func(t *testing.T) {

	// skip test if the archive does not store ownership information
	if tc.doesNotSupportOwner {
	t.Skipf("archive %s does not store ownership information", tc.name)
	}

Add option to preserve file attributes #106

Are you sure you want to change the base?

Add option to preserve file attributes #106

Conversation

NodyHub commented Dec 9, 2024 • edited Loading

alanpchua left a comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

NodyHub Dec 18, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

NodyHub Jan 8, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

samsalisbury Jan 7, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

samsalisbury left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

NodyHub commented Dec 9, 2024 •

edited

Loading

NodyHub Dec 18, 2024 •

edited

Loading

NodyHub Jan 8, 2025 •

edited

Loading

samsalisbury Jan 7, 2025 •

edited

Loading