docs: incorporate default load balancer docs into the main site (#3153)
Motivation:

The default load balancer documentation is not easy to discover
since it isn't a part of the main doc site.

Modifications:

- Generate the mermaid images and inline them, as our site build doesn't (yet) support mermaid diagrams.
- Move the docs into the main doc site.
bryce-anderson authored Jan 3, 2025
1 parent 167012a commit c750d92
Showing 11 changed files with 75 additions and 53 deletions.
4 changes: 3 additions & 1 deletion docs/generation/gradle/validateSite.gradle
@@ -22,7 +22,8 @@ if (!repositories) {

// links that are allowed to be http rather than https
def httpLinks = [
"http://www.slf4j.org"
"http://www.slf4j.org",
"http://www.eckner.com/papers/Algorithms%20for%20Unevenly%20Spaced%20Time%20Series.pdf" // only available via http
]

// links to exclude from validation (for example because they require authentication or use untrusted cert)
@@ -190,6 +191,7 @@ for (location in ["local", "remote"]) {
conn.connect()
def responseCode = conn.responseCode
if (responseCode != 200 &&
responseCode != 302 && // common redirect status
responseCode != 403/* github auth */ &&
responseCode != 429 /* github rate limiting */) {
errors.add("$submittedHtmlFile: Unexpected HTTP status code `$responseCode` " +
6 changes: 6 additions & 0 deletions docs/modules/ROOT/pages/_partials/nav-versioned.adoc
@@ -33,7 +33,13 @@ include::{page-version}@servicetalk-http-router-jersey:ROOT:partial$nav-versione
*** xref:{page-version}@servicetalk-http-security-jersey::index.adoc[Security]
*** xref:{page-version}@servicetalk-data-jackson-jersey::index.adoc[JSON (Jackson)]
* xref:{page-version}@servicetalk-grpc-api::index.adoc[gRPC]
* xref:{page-version}@servicetalk-loadbalancer::index.adoc[Load Balancing]
+
--
include::{page-version}@servicetalk-loadbalancer:ROOT:partial$nav-versioned.adoc[]
--
+
* xref:{page-version}@servicetalk-client-api::service-discovery.adoc[Service Discovery]
* Traffic Resiliency
+
2 changes: 1 addition & 1 deletion servicetalk-loadbalancer/docs/antora.yml
@@ -15,7 +15,7 @@
#

name: servicetalk-loadbalancer
title: Load balancing
title: Load Balancing
version: SNAPSHOT
nav:
- modules/ROOT/nav.adoc
@@ -1 +1 @@

** xref:{page-version}@servicetalk-loadbalancer::defaultloadbalancer.adoc[Default Load Balancer]
@@ -1,21 +1,27 @@
= DefaultLoadBalancer
// Configure {source-root} values based on how this document is rendered: on GitHub or not
ifdef::env-github[]
:source-root:
endif::[]
ifndef::env-github[]
ifndef::source-root[:source-root: https://github.com/apple/servicetalk/blob/{page-origin-refname}]
endif::[]

== What is DefaultLoadBalancer?
= ServiceTalk's default Load Balancer

https://github.com/apple/servicetalk/blob/main/servicetalk-loadbalancer/src/main/java/io/servicetalk/loadbalancer/DefaultLoadBalancer.java[DefaultLoadBalancer]
is a refactor of the ServiceTalk
https://github.com/apple/servicetalk/blob/main/servicetalk-loadbalancer/src/main/java/io/servicetalk/loadbalancer/RoundRobinLoadBalancer.java[RoundRobinLoadBalancer]
that is intended to provide a more flexible foundation for building load balancers. It also serves as the basis for new
features that were not possible with RoundRobinLoadBalancer including host scoring, additional selection algorithms, and
outlier detectors.
== What is the default Load Balancer?

The default load balancer is a refactor and generalization of the ServiceTalk Round-Robin load balancer that is intended
to provide a more flexible foundation for building load balancers. It serves as the basis for new features that were not
possible with RoundRobinLoadBalancer including host scoring, additional selection algorithms, and outlier detectors.

=== Relationship Between the LoadBalancer and Connections

The load balancer is structured as a series of https://github.com/apple/servicetalk/blob/main/servicetalk-loadbalancer/src/main/java/io/servicetalk/loadbalancer/Host.java[Host]'s
and those hosts can have many connections. The number of connections to each Host is driven by the number of connections
required to satisfy the request load. Usage of the HTTP/2 protocol will dramatically shrink the necessary number of
connections to each backend, often to 1, and is strongly encouraged.
The load balancer is structured as a series of hosts and those hosts can have many connections. The number of
connections to each host is driven by the number of connections required to satisfy the request load. Usage of the
HTTP/2 protocol will dramatically shrink the necessary number of connections to each backend, often to 1, and is
strongly encouraged.

////
[source,mermaid]
----
flowchart LR
@@ -32,31 +38,32 @@ flowchart LR
H2 --> C21[ConnectionN:addr-2]
----
////
image::flowdiagram.png[Abstraction Relationships, align="center"]

=== More Host Selection Algorithms

A primary goal of DefaultLoadBalancer was to open the door to alternative host selection algorithms. The
RoundRobinLoadBalancer is limited to round-robin while DefaultLoadBalancer comes out of the box with multiple choices
and the flexibility to add more if necessary.
A primary goal of the default load balancer was to open the door to alternative host selection algorithms. The
round-robin load balancer is limited to round-robin while the default load balancer comes out of the box with multiple
choices and the flexibility to add more if necessary.

==== Power of two Choices (P2C)

Power of two choices (https://ieeexplore.ieee.org/document/963420[P2C]) is an algorithm that allows load balancers to
Power of two choices (https://doi.org/10.1109/71.963420[P2C]) is an algorithm that allows load balancers to
bias traffic toward hosts with a better 'score' while retaining the fairness of random selection. This algorithm is
a random selection process but with a twist: it selects two hosts at random and from those two picks the 'best', where
best is defined by a scoring system. ServiceTalk uses an Exponentially Weighted Moving Average
(http://www.eckner.com/papers/Algorithms%20for%20Unevenly%20Spaced%20Time%20Series.pdf[EWMA]) scoring system by default.
EWMA takes into account the recent latency, failures, and outstanding request load of the destination host when
computing a host's score.
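
The time-decayed averaging behind an EWMA score can be sketched as follows. This is a minimal illustration for unevenly spaced samples, not ServiceTalk's scoring code: the `EwmaSketch` class, the `tauNanos` decay constant, and the choice to track latency alone (ignoring failures and outstanding requests) are assumptions made for the example.

```java
final class EwmaSketch {
    // Decay time constant (hypothetical parameter): larger values forget old samples more slowly.
    private final double tauNanos;
    private double ewma;
    private long lastNanos;
    private boolean initialized;

    EwmaSketch(double tauNanos) {
        if (tauNanos <= 0) {
            throw new IllegalArgumentException("tauNanos must be positive");
        }
        this.tauNanos = tauNanos;
    }

    /** Folds a latency sample observed at nowNanos into the average and returns the new value. */
    synchronized double observe(long nowNanos, double sample) {
        if (!initialized) {
            // The first sample seeds the average.
            ewma = sample;
            lastNanos = nowNanos;
            initialized = true;
            return ewma;
        }
        // The weight of the old average decays exponentially with the elapsed time.
        double elapsed = Math.max(0L, nowNanos - lastNanos);
        double weight = Math.exp(-elapsed / tauNanos);
        ewma = ewma * weight + sample * (1.0 - weight);
        lastNanos = nowNanos;
        return ewma;
    }
}
```

In this form a sample that arrives long after the previous one almost entirely replaces the old average, while samples that arrive in quick succession each move it only slightly.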

The P2C selection algorithm has the positive properties
The P2C selection algorithm has a number of favorable properties

* having significantly more fair request distribution between hosts than simple random
* biasing away from low performing hosts
* avoiding the order coalescing problems associated with round-robin

The major downside of P2C is that it is less trivial to understand.

////
[source,mermaid]
----
stateDiagram-v2
@@ -70,13 +77,17 @@ stateDiagram-v2
evalhealth --> pickscore: both healthy
pickscore --> [*]: select best scored host
----
////
image::p2cdiagram.png[P2C Endpoint Selection, 300, align="center"]
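
The P2C selection steps can be sketched in plain Java. This is an illustration under stated assumptions rather than ServiceTalk's implementation: the `P2CSketch` class, its `Host` type, the `maxAttempts` retry bound, and the convention that a higher score is better are all hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.Random;

final class P2CSketch {
    /** Hypothetical host view: a health flag and a score where higher is better. */
    static final class Host {
        final String address;
        final boolean healthy;
        final double score;

        Host(String address, boolean healthy, double score) {
            this.address = address;
            this.healthy = healthy;
            this.score = score;
        }
    }

    /** Picks a host using power of two choices, trying at most maxAttempts random pairs. */
    static Optional<Host> select(List<Host> hosts, Random random, int maxAttempts) {
        if (hosts.isEmpty()) {
            return Optional.empty();
        }
        if (hosts.size() == 1) {
            Host only = hosts.get(0);
            return only.healthy ? Optional.of(only) : Optional.empty();
        }
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Draw two distinct indices uniformly at random.
            int i = random.nextInt(hosts.size());
            int j = random.nextInt(hosts.size() - 1);
            if (j >= i) {
                j++;
            }
            Host a = hosts.get(i);
            Host b = hosts.get(j);
            if (a.healthy && b.healthy) {
                // Both healthy: prefer the better scored host.
                return Optional.of(a.score >= b.score ? a : b);
            }
            if (a.healthy) {
                return Optional.of(a);
            }
            if (b.healthy) {
                return Optional.of(b);
            }
            // Neither candidate is healthy: try another random pair.
        }
        return Optional.empty();
    }
}
```

Because only two hosts are compared per request, the cost of a selection stays constant regardless of how many hosts are known.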

==== Round-Robin

Round-robin is a very common algorithm that is easy to understand. From a local perspective it is an extremely
fair algorithm assuming each request is essentially the same 'cost'. Its downsides include unwanted correlation effects
due to its sequential ordering and inability to score hosts other than outright failure.
fair algorithm assuming each request is essentially the same cost. Its downsides include unwanted correlation effects
due to its sequential ordering and inability to modulate traffic to hosts other than in the case of outright host
failure.

////
[source,mermaid]
----
stateDiagram-v2
@@ -87,36 +98,39 @@ stateDiagram-v2
eval_health --> select_index: host unhealthy
eval_health --> [*]: host healthy
----
////
image::roundrobindiagram.png[Round, 300, align="center"]
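
Round-robin selection with health checks can be sketched as below. The `RoundRobinSketch` class and its `isHealthy` predicate are hypothetical names for this example, and the rule of visiting each host at most once per selection is an assumption, not a description of ServiceTalk's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

final class RoundRobinSketch<H> {
    private final List<H> hosts;
    private final Predicate<H> isHealthy;
    // Shared cursor that advances on every candidate that is examined.
    private final AtomicInteger index = new AtomicInteger();

    RoundRobinSketch(List<H> hosts, Predicate<H> isHealthy) {
        this.hosts = new ArrayList<>(hosts);
        this.isHealthy = isHealthy;
    }

    /** Returns the next healthy host in sequence, or empty if none is healthy. */
    Optional<H> select() {
        int size = hosts.size();
        // Examine each host at most once per selection.
        for (int attempt = 0; attempt < size; attempt++) {
            // floorMod keeps the position valid even after the counter overflows.
            int i = Math.floorMod(index.getAndIncrement(), size);
            H candidate = hosts.get(i);
            if (isHealthy.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Note that an unhealthy host is simply skipped, so its share of traffic falls to the next host in the sequence. This is one concrete form of the correlation effect mentioned above.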

== xDS Outlier Detection

=== What is xDS

In addition to being more modular DefaultLoadBalancer is being introduced with resiliency features found in the xDS
In addition to being more modular, the default load balancer is being introduced with resiliency features found in the xDS
specification that was pioneered by the Envoy project and is being https://github.com/cncf/xds[advanced by the CNCF].
xDS defines a control plane that allows for the distributed configuration of key components like certificates,
load balancers, etc. In this documentation we will focus on the elements that are relevant to load balancing, referred
to as CDS.
to as Cluster Discovery Service, or CDS.

=== Failure Detection Algorithms

DefaultLoadBalancer was designed to mitigate failures at both layer 4 (connection layer) and layer 7 (request layer).
It supports the following xDS compatible outlier detectors:
The default load balancer was designed to mitigate failures at both layer 4 (connection layer) and layer 7 (request
layer). It supports the following xDS compatible outlier detectors:

* Consecutive failures: ejects hosts that exceed the configured consecutive failures.
* Success rate failure detection: statistical outlier ejection based on success rate.
* Failure percentage outlier detection: ejects hosts with a failure rate that exceeds a fixed threshold.

In addition to the xDS defined outlier detectors DefaultLoadBalancer continues to support consecutive connection failure
detection with active probing: when a host is found to be unreachable it is marked as unhealthy until a new connection
can be established to the host outside the request path.
In addition to the xDS defined outlier detectors, the default load balancer continues to support consecutive connection
failure detection with active probing: when a host is found to be unreachable it is marked as unhealthy until a new
connection can be established to the host. This probing is done outside the request path.
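
The consecutive failure detector is the simplest of these to illustrate. The sketch below is an assumption-laden example, not ServiceTalk's or the xDS reference logic: the `ConsecutiveFailureSketch` class, the choice to eject once the count reaches the threshold, and the `revive` trigger are all hypothetical.

```java
final class ConsecutiveFailureSketch {
    private final int threshold;
    private int consecutiveFailures;
    private boolean ejected;

    ConsecutiveFailureSketch(int threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    /** Any success breaks the failure streak. */
    synchronized void onSuccess() {
        consecutiveFailures = 0;
    }

    /** Counts a failure and ejects the host once the streak reaches the threshold. */
    synchronized void onFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= threshold) {
            ejected = true;
        }
    }

    synchronized boolean isEjected() {
        return ejected;
    }

    /** Marks the host healthy again, for example after a successful probe (assumed trigger). */
    synchronized void revive() {
        ejected = false;
        consecutiveFailures = 0;
    }
}
```

A host selector would consult `isEjected()` as part of its health check, which is how an ejected host ends up being skipped by the selection algorithms described earlier.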

=== Connection Selection with Outlier Detectors

Connection acquisition failures, request failures, and response latency are used to optimize traffic flows to avoid
unhealthy hosts and _bias_ traffic to hosts with better response times in order to adjust to the observed capacity of
backends. The relationship is shown in the following diagram:

////
[source,mermaid]
----
flowchart TD
@@ -159,13 +173,16 @@ flowchart TD
HealthChecker <--> HealthIndicator0 & HealthIndicator1
ServiceDiscovery --> |update host set| HostSet --> |rebuild HostSelector \nwith new host set| HostSelector
----
////
image::healthindicatordiagram.png[Health Indicator Structure, align="center"]

=== Connection Acquisition Workflow

By default, ServiceTalk attempts to minimize the connection load to each host. If the situation arises where there is
not a session capable of serving a request then connection acquisition can happen on the request path. The session
acquisition flow is roughly like this:
not an existing session capable of serving a request then connection acquisition can happen on the request path. The
session acquisition flow is roughly like this:

////
[source,mermaid]
----
sequenceDiagram
@@ -181,16 +198,18 @@ sequenceDiagram
connection-factory->>host: connection created and added to pool
host->>requester: connection returned
----
////
image::connectionacquisitiondiagram.png[Connection Acquisition, align="center"]
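
The reuse-or-create behavior of a single host can be sketched as follows. This is a deliberately synchronous simplification, whereas real connection acquisition is asynchronous, and the `HostPoolSketch` class, its `canServe` predicate, and the `connectionFactory` supplier are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class HostPoolSketch<C> {
    private final List<C> pool = new ArrayList<>();
    private final Supplier<C> connectionFactory;
    private final Predicate<C> canServe;

    HostPoolSketch(Supplier<C> connectionFactory, Predicate<C> canServe) {
        this.connectionFactory = connectionFactory;
        this.canServe = canServe;
    }

    /** Returns an existing connection that can serve the request, or creates and pools a new one. */
    synchronized C acquire() {
        for (C connection : pool) {
            if (canServe.test(connection)) {
                return connection;
            }
        }
        // No existing connection has capacity: create one on the request path.
        C created = connectionFactory.get();
        pool.add(created);
        return created;
    }

    synchronized int size() {
        return pool.size();
    }
}
```

With a multiplexed protocol such as HTTP/2 the `canServe` check succeeds far more often, which is why the pool for each host tends to stay small.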

== Future Capabilities

=== Weighted Load Balancing

Not all hosts are created equal! Due to different underlying hardware platforms, other tenants on the same host, or even
just a bad cache day, we often find that not all instances of a service have the same capacity. The P2C selection
algorithm can approximate this, but it is only inferred. With
https://github.com/bryce-anderson/servicetalk/blob/bl_anderson/default_loadbalancer_docs/servicetalk-client-api/src/main/java/io/servicetalk/client/api/ServiceDiscoverer.java[ServiceDiscoverer]
or control-plane support we can explicitly propagate weight information to ServiceTalk's DefaultLoadBalancer. Adding
algorithm can bias toward better performing hosts, but if the capacity of a backend is known it can be accounted for
explicitly. With link:{source-root}/servicetalk-client-api/src/main/java/io/servicetalk/client/api/ServiceDiscoverer.java[ServiceDiscoverer]
or control-plane support we can explicitly propagate weight information to ServiceTalk's default load balancer. Adding
weight support to the host selection process will let users leverage this data.

=== Priority Groups
@@ -206,4 +225,4 @@ maintaining a set of backup destinations to use in the case of local disruptions
When the sizes of two connected clusters grow the number of connections can become burdensome if the load balancer
maintains a full mesh network. Sub-setting can reduce the connection count by only creating connections to a subset of
backends. There are a number of ways to determine the subset, which can range from simple random sub-setting, which is
trivial to implement but suffers from load variance, to more intricate models.
trivial to implement but suffers from load variance, to more intricate models.
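
To show why simple random sub-setting is considered trivial to implement, here is a minimal sketch. It is not a ServiceTalk feature or API: the `RandomSubsetSketch` class and its `subset` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class RandomSubsetSketch {
    /** Returns up to subsetSize hosts chosen uniformly at random, without repetition. */
    static <H> List<H> subset(List<H> hosts, int subsetSize, Random random) {
        // Shuffle a copy so the caller's list is left untouched.
        List<H> shuffled = new ArrayList<>(hosts);
        Collections.shuffle(shuffled, random);
        int n = Math.max(0, Math.min(subsetSize, shuffled.size()));
        return new ArrayList<>(shuffled.subList(0, n));
    }
}
```

Since every client draws its subset independently, some backends will be chosen by more clients than others, which is the load variance noted above.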
27 changes: 11 additions & 16 deletions servicetalk-loadbalancer/docs/modules/ROOT/pages/index.adoc
@@ -57,26 +57,21 @@ and protocol metrics from the
link:{source-root}/servicetalk-client-api/src/main/java/io/servicetalk/client/api/ConnectionFactory.java[ConnectionFactory]
(e.g. latency, ...) in order to pick a more optimal _Connection_ for each request.

== Implementations
== Implementation

As mentioned earlier the _Client-Side_
link:{source-root}/servicetalk-client-api/src/main/java/io/servicetalk/client/api/LoadBalancer.java[LoadBalancer]
abstraction allows for various protocol-independent _LoadBalancing_ algorithms to be implemented. This section will
discuss the various implementations offered in ServiceTalk by highlighting their characteristics.
abstraction allows for various protocol-independent _LoadBalancing_ algorithms to be implemented. However, the built-in
default Load Balancer is a highly featured implementation that we strongly recommend, and creating custom load balancing
solutions should be done only as a last resort.

=== Round Robin
=== Default Load Balancer

The ServiceTalk default Load Balancer is the recommended load balancer. It is a modularization of the
link:{source-root}/servicetalk-loadbalancer/src/main/java/io/servicetalk/loadbalancer/RoundRobinLoadBalancer.java[RoundRobinLoadBalancer]
is a common and simple _LoadBalancer_ implementation that is currently the default when creating _Clients_. Its
main goal is to spread the load evenly between all known resolved addresses as provided by the
xref:{page-version}@servicetalk-client-api::service-discovery.adoc[Service Discovery] mechanism.
intended to support features necessary for proxyless service-to-service communication. It supports multiple selection
algorithms including Power of Two Choices (P2C) and Round-Robin as well as layer-7 outlier detection mechanisms
including consecutive failure and outlier detection.

The implementation in ServiceTalk consists of a set of available addresses (typically the servers to connect to) and for
each address it has a set of open `Connections`. Whenever a new request is made the _LoadBalancer_ will pick the next
address from its list of addresses and picks one of its open _Connections_ until it finds an available _Connection_.
When all _Connections_ are in use, it'll try to open a new _Connection_ to that same address. It works in tandem with
_ServiceDiscoverer_, when new addresses are added or addresses are removed it'll update its active addresses set for
future _Connection_ selection. This approach ensures that every address will receive an equal amount of requests on
average across all _Clients_.

NOTE: This approach favors lower selection time over lowering latency and error rates.
See the xref:{page-version}@servicetalk-loadbalancer::defaultloadbalancer.adoc[default Load Balancer] documentation for
more detail.
