network: More efficient caching for Envoy socket addresses (#37832)
An LRU cache was introduced to cache `Envoy::Network::Address` instances because they are expensive to create. These addresses are cached for reading source and destination addresses from `recvmsg` and `recvmmsg` calls on QUIC UDP sockets. The current size of the cache is 4 entries for each IoHandle (i.e. each socket).

A locally run CPU profile of Envoy Mobile showed about 1.75% of CPU cycles going towards querying and inserting into the `quic::QuicLRUCache`. Given the small number of elements in the cache, this commit uses a `std::vector` data structure instead of `QuicLRUCache`.

`QuicLRUCache`, `std::vector`, and `std::deque` were compared using newly added benchmark tests, with the following results:

QuicLRUCache:

```
--------------------------------------------------------------------------------------------------------------
Benchmark                                                                           Time       CPU  Iterations
--------------------------------------------------------------------------------------------------------------
BM_GetOrCreateEnvoyAddressInstanceNoCache/iterations:1000                       31595 ns  31494 ns        1000
BM_GetOrCreateEnvoyAddressInstanceConnectedSocket/iterations:1000                5538 ns   5538 ns        1000
BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocket/iterations:1000             38918 ns  38814 ns        1000
BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocketLargerCache/iterations:1000  52969 ns  52846 ns        1000
```

std::deque:

```
--------------------------------------------------------------------------------------------------------------
Benchmark                                                                           Time       CPU  Iterations
--------------------------------------------------------------------------------------------------------------
BM_GetOrCreateEnvoyAddressInstanceNoCache/iterations:1000                       31805 ns  31716 ns        1000
BM_GetOrCreateEnvoyAddressInstanceConnectedSocket/iterations:1000                1553 ns   1550 ns        1000
BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocket/iterations:1000             27243 ns  27189 ns        1000
BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocketLargerCache/iterations:1000  39335 ns  39235 ns        1000
```

std::vector:

```
--------------------------------------------------------------------------------------------------------------
Benchmark                                                                           Time       CPU  Iterations
--------------------------------------------------------------------------------------------------------------
BM_GetOrCreateEnvoyAddressInstanceNoCache/iterations:1000                       31960 ns  31892 ns        1000
BM_GetOrCreateEnvoyAddressInstanceConnectedSocket/iterations:1000                1514 ns   1514 ns        1000
BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocket/iterations:1000             26361 ns  26261 ns        1000
BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocketLargerCache/iterations:1000  43987 ns  43738 ns        1000
```

`std::vector` uses about 3.5x fewer CPU cycles than `quic::QuicLRUCache` in the connected-socket benchmark (1514 ns vs. 5538 ns) and performs very slightly better than `std::deque` at small cache sizes. For a bigger cache (e.g. >= 50 entries), `std::deque` may perform better and is worth profiling, though in the unconnected-socket benchmarks above, no cache at all performs about as well as or better than a cache.

Risk Level: low
Testing: unit and benchmark tests
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

---------

Signed-off-by: Ali Beyad <[email protected]>
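To make the data-structure choice concrete, here is a minimal sketch of a small cache backed by `std::vector`. It is not the code from this commit: `FakeAddress`, `SmallAddressCache`, the string key, and the least-recently-used eviction order are all assumptions made for illustration. The point is only that, with a handful of entries, a linear scan over contiguous storage can stand in for a hash-based LRU structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an address object that is expensive to construct.
struct FakeAddress {
  explicit FakeAddress(std::string value) : text(std::move(value)) {}
  std::string text;
};
using FakeAddressConstSharedPtr = std::shared_ptr<const FakeAddress>;

// Hypothetical small cache: a vector ordered from least recently used (front)
// to most recently used (back). Lookups are a linear scan.
class SmallAddressCache {
public:
  explicit SmallAddressCache(std::size_t capacity) : capacity_(capacity) {
    entries_.reserve(capacity_);
  }

  FakeAddressConstSharedPtr getOrCreate(const std::string& key) {
    if (capacity_ == 0) {
      // Caching disabled: always build a fresh instance.
      ++misses_;
      return std::make_shared<const FakeAddress>(key);
    }
    auto it = std::find_if(entries_.begin(), entries_.end(),
                           [&key](const Entry& entry) { return entry.first == key; });
    if (it != entries_.end()) {
      ++hits_;
      // Move the hit entry to the back so it becomes the most recently used.
      std::rotate(it, it + 1, entries_.end());
      return entries_.back().second;
    }
    ++misses_;
    if (entries_.size() == capacity_) {
      // Evict the least recently used entry, which sits at the front.
      entries_.erase(entries_.begin());
    }
    entries_.emplace_back(key, std::make_shared<const FakeAddress>(key));
    return entries_.back().second;
  }

  std::size_t hits() const { return hits_; }
  std::size_t misses() const { return misses_; }
  std::size_t size() const { return entries_.size(); }

private:
  using Entry = std::pair<std::string, FakeAddressConstSharedPtr>;
  const std::size_t capacity_;
  std::vector<Entry> entries_;
  std::size_t hits_ = 0;
  std::size_t misses_ = 0;
};
```

With a capacity of 4, every operation touches at most four contiguous entries. Erasing from the front of a vector shifts the remaining elements, which is cheap at this size but grows with capacity; that is one plausible reason a `std::deque` might be worth profiling for larger caches.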
Showing 9 changed files with 269 additions and 17 deletions.
93 changes: 93 additions & 0 deletions
test/common/network/io_socket_handle_impl_benchmark_test.cc
@@ -0,0 +1,93 @@

```cpp
#include <memory>

#include "source/common/network/io_socket_handle_impl.h"

#include "test/test_common/network_utility.h"

#include "absl/strings/str_cat.h"
#include "benchmark/benchmark.h"

namespace Envoy {
namespace Network {
namespace Test {

std::vector<sockaddr_storage> getSockAddrSampleAddresses(const int count) {
  std::vector<sockaddr_storage> addresses;
  for (int i = 0; i < count; i += 4) {
    int ip_suffix = 101 + i;
    // A sample v6 source address.
    addresses.push_back(getV6SockAddr(absl::StrCat("2001:DB8::", ip_suffix), 51234));
    // A sample v6 destination address.
    addresses.push_back(getV6SockAddr(absl::StrCat("2001:DB8::", ip_suffix), 443));
    // A sample v4 source address.
    addresses.push_back(getV4SockAddr(absl::StrCat("203.0.113.", ip_suffix), 52345));
    // A sample v4 destination address.
    addresses.push_back(getV4SockAddr(absl::StrCat("203.0.113.", ip_suffix), 443));
  }
  return addresses;
}

} // namespace Test

class IoSocketHandleImplTestWrapper {
public:
  explicit IoSocketHandleImplTestWrapper(const int cache_size)
      : io_handle_(-1, false, absl::nullopt, cache_size) {}

  Address::InstanceConstSharedPtr getOrCreateEnvoyAddressInstances(const sockaddr_storage& ss) {
    return io_handle_.getOrCreateEnvoyAddressInstance(ss, Test::getSockAddrLen(ss));
  }

private:
  IoSocketHandleImpl io_handle_;
};

static void BM_GetOrCreateEnvoyAddressInstanceNoCache(benchmark::State& state) {
  std::vector<sockaddr_storage> addresses = Test::getSockAddrSampleAddresses(/*count=*/4);
  IoSocketHandleImplTestWrapper wrapper(/*cache_size=*/0);
  for (auto _ : state) {
    for (int i = 0; i < 50; ++i) {
      benchmark::DoNotOptimize(wrapper.getOrCreateEnvoyAddressInstances(addresses[0]));
      benchmark::DoNotOptimize(wrapper.getOrCreateEnvoyAddressInstances(addresses[1]));
    }
  }
}
BENCHMARK(BM_GetOrCreateEnvoyAddressInstanceNoCache)->Iterations(1000);

static void BM_GetOrCreateEnvoyAddressInstanceConnectedSocket(benchmark::State& state) {
  std::vector<sockaddr_storage> addresses = Test::getSockAddrSampleAddresses(/*count=*/4);
  IoSocketHandleImplTestWrapper wrapper(/*cache_size=*/4);
  for (auto _ : state) {
    for (int i = 0; i < 50; ++i) {
      benchmark::DoNotOptimize(wrapper.getOrCreateEnvoyAddressInstances(addresses[0]));
      benchmark::DoNotOptimize(wrapper.getOrCreateEnvoyAddressInstances(addresses[1]));
    }
  }
}
BENCHMARK(BM_GetOrCreateEnvoyAddressInstanceConnectedSocket)->Iterations(1000);

static void BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocket(benchmark::State& state) {
  std::vector<sockaddr_storage> addresses = Test::getSockAddrSampleAddresses(/*count=*/100);
  IoSocketHandleImplTestWrapper wrapper(/*cache_size=*/4);
  for (auto _ : state) {
    for (const sockaddr_storage& ss : addresses) {
      benchmark::DoNotOptimize(wrapper.getOrCreateEnvoyAddressInstances(ss));
    }
  }
}
BENCHMARK(BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocket)->Iterations(1000);

static void
BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocketLargerCache(benchmark::State& state) {
  std::vector<sockaddr_storage> addresses = Test::getSockAddrSampleAddresses(/*count=*/100);
  IoSocketHandleImplTestWrapper wrapper(/*cache_size=*/50);
  for (auto _ : state) {
    for (const sockaddr_storage& ss : addresses) {
      benchmark::DoNotOptimize(wrapper.getOrCreateEnvoyAddressInstances(ss));
    }
  }
}
BENCHMARK(BM_GetOrCreateEnvoyAddressInstanceUnconnectedSocketLargerCache)->Iterations(1000);

} // namespace Network
} // namespace Envoy
```
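The access patterns in these benchmarks help explain why the unconnected-socket cases gain little from caching. The sketch below replays them against a simulated least-recently-used cache, using integers in place of socket addresses. The LRU policy and the helper names (`countLruHits`, `alternatingTrace`, `cyclicTrace`) are assumptions for illustration; the actual eviction policy is not shown in this diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts hits when replaying `trace` against an LRU cache of `capacity` keys.
// The vector is ordered from least recently used (front) to most recent (back).
std::size_t countLruHits(const std::vector<int>& trace, std::size_t capacity) {
  std::vector<int> cache;
  std::size_t hits = 0;
  for (const int key : trace) {
    auto it = std::find(cache.begin(), cache.end(), key);
    if (it != cache.end()) {
      ++hits;
      std::rotate(it, it + 1, cache.end()); // Mark as most recently used.
      continue;
    }
    if (capacity == 0) {
      continue; // Caching disabled: nothing is stored.
    }
    if (cache.size() == capacity) {
      cache.erase(cache.begin()); // Evict the least recently used key.
    }
    cache.push_back(key);
  }
  return hits;
}

// Connected-socket pattern: the same two keys looked up alternately.
std::vector<int> alternatingTrace(int rounds) {
  std::vector<int> trace;
  for (int i = 0; i < rounds; ++i) {
    trace.push_back(0);
    trace.push_back(1);
  }
  return trace;
}

// Unconnected-socket pattern: `distinct` keys visited in order, `passes` times.
std::vector<int> cyclicTrace(int distinct, int passes) {
  std::vector<int> trace;
  for (int pass = 0; pass < passes; ++pass) {
    for (int key = 0; key < distinct; ++key) {
      trace.push_back(key);
    }
  }
  return trace;
}
```

Under these assumptions, `countLruHits(alternatingTrace(50), 4)` yields 98 hits out of 100 lookups, while `countLruHits(cyclicTrace(100, 3), 4)` and `countLruHits(cyclicTrace(100, 3), 50)` both yield zero. Each key is evicted before the cycle returns to it, so the cache adds lookup and eviction work without saving any address construction.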