Merge pull request #6601 from azylinski/api_call_latency-bump-ns-30s

Inc objective for read-only requests in "namespace" scope: from 5s to 30s
author: Kubernetes Prow Robot <k8s-ci-robot@users.noreply.github.com> 2022-04-13 11:28:46 -0700
committer: GitHub <noreply@github.com> 2022-04-13 11:28:46 -0700
commit: 14ff21dc98a42f1b79c97838ca1cf69f173ac217 (patch)
tree: 31da95de87dbf9283129b498526c501c87b3d849
parent: 0c661fc33099d2572e2f70b1d4d8fa85b938af07 (diff)
parent: e3c1c5390d5e6eaa966f73038ab503e52ef09937 (diff)
2 files changed, 17 insertions, 9 deletions
diff --git a/sig-scalability/slos/api_call_latency.md b/sig-scalability/slos/api_call_latency.md
index 45764f9c..ff340929 100644
--- a/sig-scalability/slos/api_call_latency.md
+++ b/sig-scalability/slos/api_call_latency.md
@@ -5,23 +5,31 @@
 | Status | SLI | SLO |
 | --- | --- | --- |
 | __Official__ | Latency<sup>[1](#footnote1)</sup> of mutating<sup>[2](#footnote2)</sup> API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day <= 1s |
-| __Official__ | Latency<sup>[1](#footnote1)</sup> of non-streaming read-only<sup>[3](#footnote3)</sup> API calls for every (resource, scope<sup>[4](#footnote4)</sup>) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day (a) <= 1s if `scope=resource` (b) <= 5s if `scope=namespace` (c) <= 30s if `scope=cluster` |
+| __Official__ | Latency<sup>[1](#footnote1)</sup> of non-streaming read-only<sup>[3](#footnote3)</sup> API calls for every (resource, scope<sup>[4](#footnote4)</sup>) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day: (a) <= 1s if `scope=resource` (b) <= 30s<sup>[5](#footnote5)</sup> otherwise (if `scope=namespace` or `scope=cluster`) |
 
-<a name="footnote1">\[1\]</a>By latency of API call in this doc we mean time
+<a name="footnote1">\[1\]</a> By latency of API call in this doc we mean time
 from the moment when apiserver gets the request to last byte of response sent
 to the user.
 
-<a name="footnote2">\[2\]</a>By mutating API calls we mean POST, PUT, DELETE
+<a name="footnote2">\[2\]</a> By mutating API calls we mean POST, PUT, DELETE
 and PATCH.
 
-<a name="footnote3">\[3\]</a>By non-streaming read-only API calls we mean GET
+<a name="footnote3">\[3\]</a> By non-streaming read-only API calls we mean GET
 requests without `watch=true` option set. (Note that in Kubernetes internally
 it translates to both GET and LIST calls).
 
-<a name="footnote4">\[4\]</a>A scope of a request can be either (a) `resource`
-if the request is about a single object, (b) `namespace` if it is about objects
-from a single namespace or (c) `cluster` if it spawns objects from multiple
-namespaces.
+<a name="footnote4">\[4\]</a> A scope of a request can be either
+- `resource` - if the request is about a single object
+- `namespace` - if it is about objects from a single namespace
+- `cluster` - if it spans objects from multiple namespaces
+
+<a name="footnote5">\[5\]</a> Historically, the threshold for LISTs with
+`scope=namespace` was set to 5 seconds. However, that threshold was chosen when
+Kubernetes didn't support the scale it supports today and when an individual
+namespace didn't contain tens of thousands (if not more) of objects of a given
+type. We adjusted the limit to accommodate the change in usage patterns, given
+that users are fine with listing tens of thousands of objects taking more than
+5 seconds.
 
 ### User stories
 - As a user of vanilla Kubernetes, I want some guarantee how quickly I get the
diff --git a/sig-scalability/slos/slos.md b/sig-scalability/slos/slos.md
index 0cdb576b..ef59a56c 100644
--- a/sig-scalability/slos/slos.md
+++ b/sig-scalability/slos/slos.md
@@ -115,7 +115,7 @@ __TODO: Cluster churn should be moved to scalability thresholds.__
 | Status | SLI | SLO | User stories, test scenarios, ... |
 | --- | --- | --- | --- |
 | __Official__ | Latency of mutating API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= 1s | [Details](./api_call_latency.md) |
-| __Official__ | Latency of non-streaming read-only API calls for every (resource, scope pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> (a) <= 1s if `scope=resource` (b) <= 5s if `scope=namespace` (c) <= 30s if `scope=cluster` | [Details](./api_call_latency.md) |
+| __Official__ | Latency of non-streaming read-only API calls for every (resource, scope) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> (a) <= 1s if `scope=resource` (b) <= 30s otherwise (if `scope=namespace` or `scope=cluster`) | [Details](./api_call_latency.md) |
 | __Official__ | Startup latency of schedulable stateless pods, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= 5s | [Details](./pod_startup_latency.md) |
 | __WIP__ | Startup latency of schedulable stateful pods, excluding time to pull images, run init containers, provision volumes (in delayed binding mode) and unmount/detach volumes (from previous pod if needed), measured from pod creation timestamp to when all its containers are reported as started and observed via watch, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= X where X depends on storage provider | [Details](./pod_startup_latency.md) |
 | __WIP__ | Latency of programming in-cluster load balancing mechanism (e.g. iptables), measured from when service spec or list of its `Ready` pods change to when it is reflected in load balancing mechanism, measured as 99th percentile over last 5 minutes aggregated across all programmers | In default Kubernetes installation, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= X | [Details](./network_programming_latency.md) |
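To illustrate the effect of the change, here is a minimal sketch of checking a batch of read-only request latencies against the updated per-scope thresholds (1s for `scope=resource`, 30s for `scope=namespace` and `scope=cluster`). The names `THRESHOLDS_S`, `percentile_99`, and `meets_slo` are hypothetical and not part of any Kubernetes tooling. The sketch also simplifies the SLI by computing a single nearest-rank 99th percentile over the given samples, rather than a 5-minute window aggregated per cluster-day.

```python
# Thresholds in seconds, taken from the updated SLO table in this patch.
# The dictionary itself is an illustrative assumption, not an official API.
THRESHOLDS_S = {"resource": 1.0, "namespace": 30.0, "cluster": 30.0}


def percentile_99(samples):
    """Nearest-rank 99th percentile of a non-empty list of latencies (seconds)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # 1-based nearest-rank index: ceil(0.99 * n), using integer arithmetic.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def meets_slo(scope, samples):
    """Return True if the p99 latency is within the threshold for the scope."""
    return percentile_99(samples) <= THRESHOLDS_S[scope]


# 95 fast calls and 5 slow ones: the p99 of this batch is 12 seconds.
latencies = [0.2] * 95 + [12.0] * 5
namespace_ok = meets_slo("namespace", latencies)
resource_ok = meets_slo("resource", latencies)
```

With these sample values, a namespace-scoped LIST whose 99th percentile is 12 seconds falls within the new 30s objective, whereas it would have exceeded the previous 5s one. The same latencies would still violate the 1s objective for `scope=resource`.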