| author | Kubernetes Submit Queue <k8s-merge-robot@users.noreply.github.com> | 2016-10-17 07:43:23 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2016-10-17 07:43:23 -0700 |
| commit | b9d590945146a28ef837c9f155c5bebfacdeb7c8 (patch) | |
| tree | 96060f7fc6ea03102c0b407ba3d6a73494e5f948 | |
| parent | 44e69106f2a8c7520a92239f81809ce9dbcd8d6c (diff) | |
| parent | a218555a5427415774a70698ab0130a94bababfd (diff) | |
Merge pull request #34758 from fgrzadkowski/monitoring_arch
Automatic merge from submit-queue
Add monitoring architecture
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**: This adds a description of monitoring architecture.
**Special notes for your reviewer**: This proposal has already been extensively discussed in [this doc](https://docs.google.com/document/d/1z7R44MUz_5gRLwsVH0S9rOy8W5naM9XE5NrbeGIqO2k); this is just a copy-and-paste so that it's in our repo.
@kubernetes/autoscaling @kubernetes/sig-instrumentation @DirectXMan12 @davidopp @piosz @derekwaynecarr @thockin
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
NONE
```
| -rw-r--r-- | monitoring_architecture.md | 232 |
| -rw-r--r-- | monitoring_architecture.png | bin | 0 -> 76662 bytes |
2 files changed, 232 insertions, 0 deletions
diff --git a/monitoring_architecture.md b/monitoring_architecture.md
new file mode 100644
index 00000000..b1fc51b9
--- /dev/null
+++ b/monitoring_architecture.md
@@ -0,0 +1,232 @@

<!-- BEGIN MUNGE: UNVERSIONED_WARNING -->

<!-- BEGIN STRIP_FOR_RELEASE -->

<img src="http://kubernetes.io/kubernetes/img/warning.png" alt="WARNING"
     width="25" height="25">

<h2>PLEASE NOTE: This document applies to the HEAD of the source tree</h2>

If you are using a released version of Kubernetes, you should
refer to the docs that go with that version.

Documentation for other releases can be found at
[releases.k8s.io](http://releases.k8s.io).

<!-- END STRIP_FOR_RELEASE -->

<!-- END MUNGE: UNVERSIONED_WARNING -->

# Kubernetes monitoring architecture

## Executive Summary

Monitoring is split into two pipelines:

* A **core metrics pipeline** consisting of Kubelet, a resource estimator, a slimmed-down
Heapster called metrics-server, and the API server serving the master metrics API. These
metrics are used by core system components, such as scheduling logic (e.g. the scheduler and
horizontal pod autoscaling based on system metrics) and simple out-of-the-box UI components
(e.g. `kubectl top`). This pipeline is not intended for integration with third-party
monitoring systems.
* A **monitoring pipeline** used for collecting various metrics from the system and exposing
them to end users, as well as to the Horizontal Pod Autoscaler (for custom metrics) and to
Infrastore via adapters.
Users can choose from many monitoring system vendors, or run none at all. In
open source, Kubernetes will not ship with a monitoring pipeline, but third-party options
will be easy to install. We expect that such pipelines will typically consist of a per-node
agent and a cluster-level aggregator.

The architecture is illustrated in the diagram in the Appendix of this doc.

## Introduction and Objectives

This document proposes a high-level monitoring architecture for Kubernetes. It covers
a subset of the issues mentioned in the “Kubernetes Monitoring Architecture” doc,
specifically focusing on an architecture (components and their interactions) that
aims to meet the numerous requirements. We do not specify any particular timeframe
for implementing this architecture, nor any particular roadmap for getting there.

### Terminology

There are two types of metrics: system metrics and service metrics. System metrics are
generic metrics that are generally available from every entity that is monitored (e.g.
usage of CPU and memory by container and node). Service metrics are explicitly defined
in application code and exported (e.g. the number of 500s served by the API server). Both
system metrics and service metrics can originate from users’ containers or from system
infrastructure components (master components like the API server, addon pods running on
the master, and addon pods running on user nodes).
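For instance, the “number of 500s served by the API server” above is a service metric: a counter defined in application code and exported over HTTP. Below is a minimal hand-rolled sketch of such a counter rendered in the Prometheus text exposition format; the `Counter` class and its methods are hypothetical, and a real service would normally use a Prometheus client library instead of formatting the text by hand.

```python
# A hypothetical application-defined counter, e.g. HTTP responses by status
# code, rendered in the Prometheus text exposition format by hand.

class Counter:
    """A service metric: defined in application code, exported over HTTP."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = {}  # label tuple -> count

    def inc(self, **labels):
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0) + 1

    def render(self):
        """Produce the text a /metrics HTTP handler would serve."""
        lines = [
            "# HELP %s %s" % (self.name, self.help_text),
            "# TYPE %s counter" % self.name,
        ]
        for key, count in sorted(self._values.items()):
            labels = ",".join('%s="%s"' % pair for pair in key)
            lines.append("%s{%s} %d" % (self.name, labels, count))
        return "\n".join(lines)

requests_total = Counter("http_requests_total", "HTTP responses by status code.")
requests_total.inc(code="200")
requests_total.inc(code="500")
requests_total.inc(code="500")
print(requests_total.render())
```

A system metric, by contrast, would be collected for every container regardless of what the application code does.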
We divide system metrics into:

* *core metrics*, which are metrics that Kubernetes understands and uses for the operation
of its internal components and core utilities -- for example, metrics used for scheduling
(including the inputs to the algorithms for resource estimation, initial resources/vertical
autoscaling, cluster autoscaling, and horizontal pod autoscaling excluding custom metrics),
the kube dashboard, and `kubectl top`. As of now this would consist of cumulative CPU usage,
instantaneous memory usage, disk usage of pods, and disk usage of containers.
* *non-core metrics*, which are not interpreted by Kubernetes; we generally assume they
include the core metrics (though not necessarily in a format Kubernetes understands) plus
additional metrics.

Service metrics can be divided into those produced by Kubernetes infrastructure components
(and thus useful for operation of the Kubernetes cluster) and those produced by user applications.
Service metrics used as input to horizontal pod autoscaling are sometimes called custom metrics.
Of course horizontal pod autoscaling also uses core metrics.

We consider logging to be separate from monitoring, so logging is outside the scope of
this doc.

### Requirements

The monitoring architecture should:

* include a solution that is part of core Kubernetes and:
  * makes core system metrics about nodes, pods, and containers available via a standard
  master API (today the master metrics API), such that core Kubernetes features do not
  depend on non-core components
  * requires Kubelet to export only a limited set of metrics, namely those required for
  core Kubernetes components to correctly operate (this is related to #18770)
  * can scale up to at least 5000 nodes
  * is small enough that we can require that all of its components be running in all deployment
  configurations
* include an out-of-the-box solution that can serve historical data, e.g.
to support Initial
Resources and vertical pod autoscaling as well as cluster analytics queries, that depends
only on core Kubernetes
* allow for third-party monitoring solutions that are not part of core Kubernetes and can
be integrated with components like the Horizontal Pod Autoscaler that require service metrics

## Architecture

We divide our description of the long-term architecture plan into the core metrics pipeline
and the monitoring pipeline. For each, it is necessary to think about how to deal with each
type of metric (core metrics, non-core metrics, and service metrics) from both the master
and the minions.

### Core metrics pipeline

The core metrics pipeline collects a set of core system metrics. There are two sources for
these metrics:

* Kubelet, providing per-node/pod/container usage information (the current cAdvisor that
is part of Kubelet will be slimmed down to provide only core system metrics)
* a resource estimator that runs as a DaemonSet and turns raw usage values scraped from
Kubelet into resource estimates (values used by the scheduler for more advanced
usage-based scheduling)

These sources are scraped by a component we call *metrics-server*, which is like a slimmed-down
version of today's Heapster. metrics-server stores only the latest values locally and has no sinks.
metrics-server exposes the master metrics API. (The configuration described here is similar
to the current Heapster in “standalone” mode.) The
[discovery summarizer](../../docs/proposals/federated-api-servers.md)
makes the master metrics API available to external clients, such that from the client’s
perspective it looks the same as talking to the API server.

Core (system) metrics are handled as described above in all deployment environments. The only
easily replaceable part is the resource estimator, which could be replaced by power users.
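To make the estimator's job concrete, a replacement estimator might reduce a window of raw per-container usage samples scraped from Kubelet to a single conservative estimate for the scheduler. The sketch below is purely illustrative: the function name, the nearest-rank percentile choice, and the data shapes are all hypothetical, not part of this proposal.

```python
import math

# Hypothetical resource estimator: turn raw usage samples into one
# conservative per-container estimate (here, a nearest-rank percentile).

def estimate_usage(samples, percentile=0.95):
    """Return the nearest-rank percentile of raw usage samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = min(len(ordered) - 1, math.ceil(percentile * len(ordered)) - 1)
    return ordered[rank]

# Raw CPU usage samples for one container, in millicores.
cpu_samples = [120, 180, 150, 400, 160, 170, 155, 165, 158, 162]
print(estimate_usage(cpu_samples))  # the value handed to the scheduler
```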
In theory, metrics-server itself can also be substituted, but it'd be similar to substituting
the apiserver itself or the controller-manager -- possible, but not recommended and not supported.

Eventually the core metrics pipeline might also collect metrics from Kubelet and the Docker daemon
themselves (e.g. CPU usage of Kubelet), even though they do not run in containers.

The core metrics pipeline is intentionally small and not designed for third-party integrations.
“Full-fledged” monitoring is left to third-party systems, which provide the monitoring pipeline
(see the next section) and can run on Kubernetes without having to make changes to upstream
components. In this way we can remove the burden we have today that comes with maintaining
Heapster as the integration point for every possible metrics source, sink, and feature.

#### Infrastore

We will build an open-source Infrastore component (most likely reusing existing technologies)
for serving historical queries over core system metrics and events, which it will fetch from
the master APIs. Infrastore will expose one or more APIs (possibly just SQL-like queries --
this is TBD) to handle the following use cases:

* initial resources
* vertical autoscaling
* the oldtimer API
* decision-support queries for debugging, capacity planning, etc.
* usage graphs in the [Kubernetes Dashboard](https://github.com/kubernetes/dashboard)

In addition, it may collect monitoring metrics and service metrics (at least from Kubernetes
infrastructure containers), as described in the upcoming sections.

### Monitoring pipeline

One of the goals of building a dedicated metrics pipeline for core metrics, as described in the
previous section, is to allow for a separate monitoring pipeline that can be very flexible,
because core Kubernetes components do not need to rely on it. By default we will not provide
one, but we will provide an easy way to install one (using a single command, most likely using
Helm).
We describe the monitoring pipeline in this section.

Data collected by the monitoring pipeline may contain any sub- or superset of the following
groups of metrics:

* core system metrics
* non-core system metrics
* service metrics from user application containers
* service metrics from Kubernetes infrastructure containers; these metrics are exposed using
Prometheus instrumentation

It is up to the monitoring solution to decide which of these are collected.

In order to enable horizontal pod autoscaling based on custom metrics, the provider of the
monitoring pipeline would also have to create a stateless API adapter that pulls the custom
metrics from the monitoring pipeline and exposes them to the Horizontal Pod Autoscaler. Such an
API will be a well-defined, versioned API similar to the regular APIs. Details of how it will be
exposed or discovered will be covered in a detailed design doc for this component.

The same approach applies if it is desired to make monitoring pipeline metrics available in
Infrastore. These adapters could be standalone components, libraries, or part of the monitoring
solution itself.

There are many possible combinations of node- and cluster-level agents that could comprise a
monitoring pipeline, including:

* cAdvisor + Heapster + InfluxDB (or any other sink)
* cAdvisor + collectd + Heapster
* cAdvisor + Prometheus
* snapd + Heapster
* snapd + SNAP cluster-level agent
* Sysdig

As an example, we'll describe a potential integration with cAdvisor + Prometheus.

Prometheus has the following metric sources on a node:

* core and non-core system metrics from cAdvisor
* service metrics exposed by containers via an HTTP handler in the Prometheus format
* [optional] metrics about the node itself from Node Exporter (a Prometheus component)

All of them are polled by the Prometheus cluster-level agent.
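To make the stateless API adapter mentioned above concrete, here is a hedged sketch of its translation step. The input mimics an instant-vector response from the Prometheus HTTP query API; the per-pod output records are purely hypothetical, standing in for whatever the HPA-facing adapter API ends up looking like in its design doc.

```python
import json

# Hypothetical adapter step: flatten a Prometheus query-API style response
# into per-pod custom metric values for the Horizontal Pod Autoscaler.

prometheus_response = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"pod": "frontend-1"}, "value": [1476700000, "210.5"]},
      {"metric": {"pod": "frontend-2"}, "value": [1476700000, "189.0"]}
    ]
  }
}
""")

def to_custom_metrics(response, metric_name):
    """Flatten an instant-vector result into per-pod metric values."""
    if response["status"] != "success":
        raise RuntimeError("Prometheus query failed")
    return [
        {
            "pod": sample["metric"]["pod"],
            "metric": metric_name,
            # Prometheus encodes a sample as [timestamp, "value-as-string"].
            "value": float(sample["value"][1]),
        }
        for sample in response["data"]["result"]
    ]

for item in to_custom_metrics(prometheus_response, "http_requests_per_second"):
    print(item["pod"], item["value"])
```

The adapter stays stateless: it holds no metric history of its own and simply reshapes each query result on demand.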
We can use the Prometheus
cluster-level agent as a source for horizontal pod autoscaling custom metrics by using a
standalone API adapter that proxies/translates between the Prometheus Query Language endpoint
on the Prometheus cluster-level agent and an HPA-specific API. Likewise, an adapter can be
used to make the metrics from the monitoring pipeline available in Infrastore. Neither
adapter is necessary if the user does not need the corresponding feature.

The command that installs cAdvisor + Prometheus should also automatically set up collection
of the metrics from infrastructure containers. This is possible because the names of the
infrastructure containers and the metrics of interest are part of the Kubernetes control plane
configuration itself, and because the infrastructure containers export their metrics in the
Prometheus format.

## Appendix: Architecture diagram

### Open-source monitoring pipeline

![monitoring architecture](monitoring_architecture.png)

<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
<!-- END MUNGE: GENERATED_ANALYTICS -->

diff --git a/monitoring_architecture.png b/monitoring_architecture.png
new file mode 100644
index 00000000..570996b7
--- /dev/null
+++ b/monitoring_architecture.png
Binary files differ
