Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce fmt.Sprintf allocations in query encoding #2919

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

kgeckhart
Copy link

@kgeckhart kgeckhart commented Dec 4, 2024

This PR eliminates some calls to fmt.Sprintf in favor of string concatenation when the values being concatenated are strings. This should reduce the memory required to construct queries with complex objects / larger arrays. Arrays which are not flat should see a further improvement as the prefix is now computed once ahead of time instead of doing it with an fmt.Sprintf call on every element.

Included are some benchmarks which I used to decide that the maps + array use cases with fmt.Sprintf seem okay. Dropping fmt.Sprintf is faster but won't lead to any reduction in allocations and in the case of arrays a size between 100 and 499 is actually more allocations.

String formatting benchmark results

Benchmark_sprintf_strings-11            26738427                44.62 ns/op            8 B/op          1 allocs/op
Benchmark_concat_strings-11             1000000000               0.5688 ns/op          0 B/op          0 allocs/op

Int formatting benchmark results

Benchmark_int_formatting/array_-_sprintf_with_1_size-11                 25135780                48.53 ns/op            5 B/op          1 allocs/op
Benchmark_int_formatting/array_-_sprintf_with_10_size-11                23695940                50.23 ns/op            8 B/op          1 allocs/op
Benchmark_int_formatting/array_-_sprintf_with_100_size-11               23888797                51.02 ns/op            8 B/op          1 allocs/op
Benchmark_int_formatting/array_-_sprintf_with_250_size-11               23499748                50.84 ns/op            8 B/op          1 allocs/op
Benchmark_int_formatting/array_-_sprintf_with_500_size-11               21207704                56.39 ns/op           16 B/op          2 allocs/op
Benchmark_int_formatting/array_-_sprintf_with_1000_size-11              21082281                56.70 ns/op           16 B/op          2 allocs/op
Benchmark_int_formatting/array_-_concat_strconv_with_1_size-11          61038285                18.95 ns/op            5 B/op          1 allocs/op
Benchmark_int_formatting/array_-_concat_strconv_with_10_size-11         65347315                17.87 ns/op            8 B/op          1 allocs/op
Benchmark_int_formatting/array_-_concat_strconv_with_100_size-11        38897104                30.29 ns/op           16 B/op          2 allocs/op
Benchmark_int_formatting/array_-_concat_strconv_with_250_size-11        39651289                29.95 ns/op           16 B/op          2 allocs/op
Benchmark_int_formatting/array_-_concat_strconv_with_500_size-11        40254046                29.64 ns/op           16 B/op          2 allocs/op
Benchmark_int_formatting/array_-_concat_strconv_with_1000_size-11       40166244                29.21 ns/op           16 B/op          2 allocs/op

Map key formatting benchmark results

Benchmark_int_formatting/map_-_sprintf_with_1_size-11               9843427               122.3 ns/op            32 B/op          2 allocs/op
Benchmark_int_formatting/map_-_sprintf_with_10_size-11              9815258               121.7 ns/op            32 B/op          2 allocs/op
Benchmark_int_formatting/map_-_sprintf_with_100_size-11             9635235               124.1 ns/op            32 B/op          2 allocs/op
Benchmark_int_formatting/map_-_sprintf_with_250_size-11             9713184               124.4 ns/op            32 B/op          2 allocs/op
Benchmark_int_formatting/map_-_sprintf_with_500_size-11             8893167               136.0 ns/op            32 B/op          4 allocs/op
Benchmark_int_formatting/map_-_sprintf_with_1000_size-11            8691573               136.6 ns/op            32 B/op          4 allocs/op
Benchmark_int_formatting/map_-_concat_strconv_with_1_size-11        22724402                51.69 ns/op           32 B/op          2 allocs/op
Benchmark_int_formatting/map_-_concat_strconv_with_10_size-11       23549650                51.40 ns/op           32 B/op          2 allocs/op
Benchmark_int_formatting/map_-_concat_strconv_with_100_size-11      18886756                62.81 ns/op           32 B/op          3 allocs/op
Benchmark_int_formatting/map_-_concat_strconv_with_250_size-11      18890992                62.93 ns/op           32 B/op          3 allocs/op
Benchmark_int_formatting/map_-_concat_strconv_with_500_size-11      18890448                62.75 ns/op           32 B/op          3 allocs/op
Benchmark_int_formatting/map_-_concat_strconv_with_1000_size-11     19919188                59.89 ns/op           32 B/op          3 allocs/op

@kgeckhart kgeckhart requested a review from a team as a code owner December 4, 2024 17:20
@Madrigal
Copy link
Contributor

Thanks for your contribution. Just letting you know this is still on our radar, but we've been a bit busy.

For context, was there a particular scenario where this was causing a performance issue for you?

@kgeckhart
Copy link
Author

Thanks for the response. I work on an OSS CloudWatch exporter https://github.com/prometheus-community/yet-another-cloudwatch-exporter/. When running high volumes we see some rather spiky memory and I was trying to reduce it a bit. This shows the top allocators from pprof after running for some time,
image
fmt.Sprintf was a lot higher than I expected it be on the list.

Almost all of the fmt.Sprintf allocations appear to stem from these two functions
image
when serializing the large GetMetricDataInput payload.

// Lists can't have flat members
return newValue(a.values, fmt.Sprintf("%s.%d", prefix, a.size), false)
return newValue(a.values, fmt.Sprintf("%s.%d", a.prefix, a.size), false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why leave Sprintf here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra int formatting for this case causes an increase in allocations when the size of the array being serialized was between 100-500. There's a nice drop in CPU but a lot of the CPU utilization we see is driven by GC so I chose to leave this one alone. I'll edit the full benchmark results in to the PR description.

@lucix-aws
Copy link
Contributor

Can we see some benchmarks that actually use the query encoder for some structure (ideally something copied from say EC2)? This seems fine on paper but all the benchmarks are demonstrating is the difference between these operations in isolation.

@Madrigal
Copy link
Contributor

Madrigal commented Jan 8, 2025

We recently added instructions on how to generate a changelog entry for PRs. In the past, we had to do this on behalf of contributors, slowing down the review process. Can you go through this document and generate a changelog entry for this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants