Nowadays, it’s very common to have cluster nodes deployed in multiple locations. If that’s your case, you might be interested in how traffic between workloads is distributed across those locations. There are multiple ways to monitor this, but let’s try putting this information directly into Istio metrics by extending their dimensions.
Metadata exchange
Istio usually reports source_* and destination_* labels, which tell you which client sent a request (source) to which server (destination).
To populate this data, the Istio proxy needs to exchange information between the two peers. That’s done via the metadata exchange filter.
When a client starts a request, the proxy uses context information stored in flat buffers to read metadata, which is inserted into the request as the x-envoy-peer-metadata HTTP header.
The server then simply extracts the metadata header, which means it already has the information from both peers. To complete the exchange, the server uses exactly the same approach and sends its own metadata back to the client in a response header.
The important note for us here is that all of this metadata can be used to extend metrics.
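If you are curious what actually travels between the peers, here is a hedged sketch of a proxy-wasm handler written in Rust (like the extension we will build below, not Istio’s own filter; info! comes from the log crate) that just logs the exchanged header. Its value is a base64-encoded serialization of the peer’s node metadata:
fn on_http_request_headers(&mut self, _: usize, _: bool) -> Action {
    // x-envoy-peer-metadata carries the peer's node metadata, base64-encoded.
    if let Some(md) = self.get_http_request_header("x-envoy-peer-metadata") {
        info!("received peer metadata, {} bytes (base64)", md.len());
    }
    Action::Continue
}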
Transferring zone information between peers
Luckily, locality information already exists in the node attributes. However, every coin has two sides, and for us it means that locality is not part of the exchanged metadata. So let’s exchange it on our own!
We will use the same approach as in the metadata exchange described above.
Client:
- read the node.locality.zone attribute, which we will use for source_zone
- insert the zone into a custom HTTP header, for example x-envoy-peer-zone
- dispatch the request to the server

Server:
- check if x-envoy-peer-zone exists (if it does, we know we are the destination)
- extract the header
- set the downstream_peer_zone attribute, which we will use in our metrics
To implement this use case, we will write a very simple WASM extension. I’ll use Rust, but the language choice doesn’t matter that much here.
First of all, let’s define a constant for our header name:
const PEER_LOCALITY_EXCHANGE_HEADER: &str = "x-envoy-peer-zone";
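Before the handlers themselves, here is a minimal sketch of the boilerplate they live in, assuming the proxy-wasm Rust SDK (the struct name is mine, and older SDK versions register the context in a _start function instead of the proxy_wasm::main! macro):
use log::error;                 // error!/info! macros used by the handlers
use proxy_wasm::traits::{Context, HttpContext};
use proxy_wasm::types::{Action, LogLevel};
use std::str;                   // str::from_utf8 used by the handlers

// Hypothetical name for the HTTP context that owns our two handlers.
struct LocalityExchange;

impl Context for LocalityExchange {}

impl HttpContext for LocalityExchange {
    // on_http_request_headers / on_http_response_headers from below go here.
}

proxy_wasm::main! {{
    proxy_wasm::set_log_level(LogLevel::Info);
    proxy_wasm::set_http_context(|_, _| -> Box<dyn HttpContext> {
        Box::new(LocalityExchange)
    });
}}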
Next, we need to catch the request and create the header. For that we will use on_http_request_headers. But there is a twist.
The client and the server proxies run the same code, so we need to distinguish which peer we are. Luckily, that’s very easy.
If our exchange header is not set, we are the source, so we can read the zone from the ABI and create the header.
If the exchange header is set, we are on the destination (server) side, so we extract the metadata and set the downstream_peer_zone attribute.
fn on_http_request_headers(&mut self, _: usize, _: bool) -> Action {
    match self.get_http_request_header(PEER_LOCALITY_EXCHANGE_HEADER) {
        // Header present: we are the destination. Consume the header and
        // expose the peer's zone as an attribute.
        Some(v) => {
            self.set_http_request_header(PEER_LOCALITY_EXCHANGE_HEADER, None);
            self.set_property(vec!["downstream_peer_zone"], Some(v.as_bytes()));
        }
        // No header: we are the source. Read our own zone and send it along.
        None => match self.get_property(vec!["node", "locality", "zone"]) {
            Some(v) => {
                self.set_http_request_header(
                    PEER_LOCALITY_EXCHANGE_HEADER,
                    Some(str::from_utf8(&v).unwrap()),
                );
            }
            None => error!("unable to set locality attribute for downstream peer"),
        },
    }
    Action::Continue
}
This already works, but metrics are generated from both sides (source and destination reporters), so we need to do the same thing for the response headers as well: the server sends its zone back, and the client stores it as upstream_peer_zone.
fn on_http_response_headers(&mut self, _: usize, _: bool) -> Action {
    match self.get_http_response_header(PEER_LOCALITY_EXCHANGE_HEADER) {
        // Header present: we are back on the source side. Consume the header
        // and expose the server's zone as an attribute.
        Some(v) => {
            self.set_http_response_header(PEER_LOCALITY_EXCHANGE_HEADER, None);
            self.set_property(vec!["upstream_peer_zone"], Some(v.as_bytes()));
        }
        // No header: we are the destination sending the response. Attach our zone.
        None => match self.get_property(vec!["node", "locality", "zone"]) {
            Some(v) => {
                self.set_http_response_header(
                    PEER_LOCALITY_EXCHANGE_HEADER,
                    Some(str::from_utf8(&v).unwrap()),
                );
            }
            None => error!("unable to set locality attribute for upstream peer"),
        },
    }
    Action::Continue
}
Easy!
Link to full implementation and build files is at my github: github.com/kirecek/wasm-locality-attribute.
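For deployment, one option on Istio versions that have the WasmPlugin API is a manifest roughly like this (the OCI image URL below is just a placeholder; the real build and publish steps are in the repository):
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: wasm-locality-attribute
  namespace: istio-system   # root namespace, so it applies mesh-wide
spec:
  url: oci://example.registry/wasm-locality-attribute:latest   # placeholder image
  phase: STATS   # inject the plugin before the Istio stats filter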
After we deploy the WASM extension, all istio-proxies will expose two new attributes, downstream_peer_zone and upstream_peer_zone, which we can use to extend metrics.
Adding zone information to metrics
Once the zone attributes are available, the only thing left is to put them into our metrics. This can be done either by patching the stats EnvoyFilters or by using the Telemetry API, which I am going to use and also recommend.
Since we generate a zone attribute on both peers, we need to distinguish where the labels are being generated.
On the client (the source of the network traffic), we use the node’s own zone attribute for the source and the upstream peer’s zone for the destination:
- match:
    metric: REQUEST_COUNT
    mode: CLIENT
  tagOverrides:
    source_zone:
      value: node.locality.zone
    destination_zone:
      value: upstream_peer_zone
On the server (the destination of the network traffic), we use the node’s own zone as the destination and the downstream peer’s zone as the source:
- match:
    metric: REQUEST_COUNT
    mode: SERVER
  tagOverrides:
    source_zone:
      value: downstream_peer_zone
    destination_zone:
      value: node.locality.zone
Full configuration:
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: example
spec:
  metrics:
  - providers:
    - name: prometheus
    overrides:
    - match:
        metric: REQUEST_COUNT
        mode: SERVER
      tagOverrides:
        source_zone:
          value: downstream_peer_zone
        destination_zone:
          value: node.locality.zone
    - match:
        metric: REQUEST_COUNT
        mode: CLIENT
      tagOverrides:
        source_zone:
          value: node.locality.zone
        destination_zone:
          value: upstream_peer_zone
Use ALL_METRICS instead of REQUEST_COUNT to affect all metrics.
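For example, the client-side override then becomes:
- match:
    metric: ALL_METRICS
    mode: CLIENT
  tagOverrides:
    source_zone:
      value: node.locality.zone
    destination_zone:
      value: upstream_peer_zone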
And that’s it. After everything is deployed, you should shortly observe the new labels in Istio metrics:
kubectl exec -it deploy/${INJECTED_WORKLOAD} -- curl localhost:15000/stats/prometheus | grep istio_requests
In case the new dimensions would break the metric names, you need to set extraStatTags in the Istio global config or as an annotation, for example sidecar.istio.io/extraStatTags: source_zone,destination_zone.
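Once the labels are there, you can finally slice traffic by zone. For example, a PromQL query along these lines (assuming the sidecar metrics are scraped into Prometheus) shows the request rate between every pair of zones:
sum by (source_zone, destination_zone) (
  rate(istio_requests_total{reporter="destination"}[5m])
)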