author Kubernetes Prow Robot <k8s-ci-robot@users.noreply.github.com> 2019-01-29 14:29:33 -0800
committer GitHub <noreply@github.com> 2019-01-29 14:29:33 -0800
commit 1d747b3720c06e795b0ef6be3cfff71fc13e8b34 (patch)
tree 8e5721f8c1559766d073020c65c605d7c567735c
parent 787bc4f78c9e2d23c614103ebc4d477b9a240258 (diff)
parent f439e4146080c9c8c1add6b2ff133efcc1eaa343 (diff)
Merge pull request #3162 from eduartua/issue-3064-grouping-by-sig-instrumentation
Grouping /devel files by SIGs - SIG Instrumentation
-rw-r--r-- contributors/devel/README.md 4
-rw-r--r-- contributors/devel/event-style-guide.md 52
-rw-r--r-- contributors/devel/instrumentation.md 216
-rw-r--r-- contributors/devel/logging.md 35
-rw-r--r-- contributors/devel/sig-instrumentation/event-style-guide.md 51
-rw-r--r-- contributors/devel/sig-instrumentation/instrumentation.md 215
-rw-r--r-- contributors/devel/sig-instrumentation/logging.md 34
-rw-r--r-- contributors/guide/coding-conventions.md 2
-rw-r--r-- sig-instrumentation/charter.md 2
9 files changed, 310 insertions, 301 deletions
diff --git a/contributors/devel/README.md b/contributors/devel/README.md
index a00cd4aa..a0685b5e 100644
--- a/contributors/devel/README.md
+++ b/contributors/devel/README.md
@@ -32,12 +32,12 @@ Guide](http://kubernetes.io/docs/admin/).
* **Hunting flaky tests** ([flaky-tests.md](flaky-tests.md)): We have a goal of 99.9% flake free tests.
Here's how to run your tests many times.
-* **Logging Conventions** ([logging.md](logging.md)): Glog levels.
+* **Logging Conventions** ([logging.md](sig-instrumentation/logging.md)): Glog levels.
* **Profiling Kubernetes** ([profiling.md](sig-scalability/profiling.md)): How to plug in go pprof profiler to Kubernetes.
* **Instrumenting Kubernetes with a new metric**
- ([instrumentation.md](instrumentation.md)): How to add a new metric to the
+ ([instrumentation.md](sig-instrumentation/instrumentation.md)): How to add a new metric to the
Kubernetes code base.
* **Coding Conventions** ([coding-conventions.md](../guide/coding-conventions.md)):
diff --git a/contributors/devel/event-style-guide.md b/contributors/devel/event-style-guide.md
index bc4ba22b..52356d36 100644
--- a/contributors/devel/event-style-guide.md
+++ b/contributors/devel/event-style-guide.md
@@ -1,51 +1,3 @@
-# Event style guide
-
-Status: During Review
-
-Author: Marek Grabowski (gmarek@)
-
-## Why the guide?
-
-The Event API change proposal is the first step towards having useful Events in the system. Another step is to formalize the Event style guide, i.e. a set of properties that developers need to ensure when adding new Events to the system. This is necessary to ensure that we have a system in which all components emit consistently structured Events.
-
-## When to emit an Event?
-
-Events are expected to provide important insights for the application developer/operator on the state of their application. Events relevant to cluster administrators are acceptable as well, though administrators usually also have the option of looking at component logs. Events are much more expensive than logs, so they're not expected to provide in-depth system debugging information. Instead, concentrate on things that are important from the application developer's perspective. Events need to be either actionable or useful for understanding past or future system behavior. Events are not intended to drive automation; watching resource status should be sufficient for controllers.
-
-Following are the guidelines for adding Events to the system. Those are not hard-and-fast rules, but should be considered by all contributors adding new Events and members doing reviews.
-1. Emit Events only when the state of the system changes or an attempt to change it is made. Events like "it's still running" are not interesting. Also, changes that do not add information beyond what is observable by watching the altered resources should not be duplicated as events. Note that adding a reason for some action that can't be inferred from the state change is considered additional information.
-1. Limit Events to no more than one per change/attempt. There's no need for Events on "About to do X" AND "Did X"/"Failed to do X". The result is more interesting and implies an attempt.
- 1. It may give the impression that this gets tricky with scale events, e.g. a Deployment scales a ReplicaSet which creates/deletes Pods. For us those are 3 (or more) separate Events (3 different objects are affected), so it's fine to emit multiple Events.
-1. When an error occurs that prevents a user application from starting or from enacting other normal system behavior, such as object creation, an Event should be emitted (e.g. invalid image).
- 1. Note that Events are garbage collected so every user-actionable error needs to be surfaced via resource status as well.
- 1. It's usually OK to emit failure Events for each failure. The dedup mechanism will deal with that. The exception is failures that are frequent but typically ephemeral and automatically repairable/recoverable, such as broken socket connections, in which case they should only be reported if persistent and unrepairable, in order to mitigate event spam.
-1. When a user application stops running for any reason, an Event should be emitted (e.g. Pod evicted because Node is under memory pressure).
-1. If it's a system-wide change of state that may impact currently running applications or may have a severe impact on future workload schedulability, an Event should be emitted (e.g. Node became unreachable, failed to create route for Node).
-1. If it doesn't fit any of the above scenarios, you should consider not emitting an Event.
-
-## How to structure an Event?
-The new Event API tries to use more descriptive field names to influence how Events are structured. An Event has the following fields:
-* Regarding
-* Related
-* ReportingController
-* ReportingInstance
-* Action
-* Reason
-* Type
-* Note
-
-The Event should be structured in a way that the following sentence "makes sense":
-"Regarding <Event.Regarding>: <Event.Action> <Event.Related> - <Event.Reason>", e.g.
-* Regarding Node X: BecameNotReady - NodeUnreachable
-* Regarding Pod X: ScheduledOnNode Node Y - <nil>
-* Regarding PVC X: BoundToNode Node Y - <nil>
-* Regarding Pod X: KilledContainer Container Y - NodeMemoryPressure
-
-1. ReportingController is the type of the Controller reporting an Event, e.g. k8s.io/node-controller, k8s.io/kubelet. There will be a standard list of controller names for Kubernetes components. Third-party components must namespace themselves in the same manner as label keys. Validation ensures it's a proper qualified name. This shouldn’t be needed in order for users to understand the event, but is provided in case the controller’s logs need to be accessed for further debugging.
-1. ReportingInstance is an identifier of the instance of the ReportingController and needs to uniquely identify it, i.e. a host name can be used only for controllers that are guaranteed to be unique on the host. This requirement isn't met e.g. for the scheduler, so it may need a secondary index. For singleton controllers use the Node name (or the hostname if the controller is not running on a Node). Can have at most 128 alphanumeric characters.
-1. Regarding and Related are ObjectReferences. Regarding should represent the object that's implemented by the ReportingController; Related can contain additional information about another object that takes part in or is affected by the Action (see examples).
-1. Action is a low-cardinality (meaning that there's a restricted, predefined set of values allowed) CamelCase string field (i.e. its value has to be determined at compile time) that explains what happened with Regarding/what action the ReportingController took in Regarding's name. The tuple of {ReportingController, Action, Reason} must be unique, such that a user could look up documentation. Can have at most 128 characters.
-1. Reason is a low-cardinality CamelCase string field (i.e. its value has to be determined at compile time) that explains why ReportingController took Action. Can have at most 128 characters.
-1. Type can be either "Normal" or "Warning". "Warning" types are reserved for Events that represent a situation that's not expected in a healthy cluster and/or healthy workload: something unexpected and/or undesirable, at least if it occurs frequently enough and/or for a long enough duration.
-1. Note can contain an arbitrary, high-cardinality, user readable summary of the Event. This field can lose data if deduplication is triggered. Can have at most 1024 characters.
+This file has moved to https://git.k8s.io/community/contributors/devel/sig-instrumentation/event-style-guide.md.
+This file is a placeholder to preserve links. Please remove by April 28, 2019 or the release of kubernetes 1.13, whichever comes first.
\ No newline at end of file
diff --git a/contributors/devel/instrumentation.md b/contributors/devel/instrumentation.md
index b0a11193..110359b2 100644
--- a/contributors/devel/instrumentation.md
+++ b/contributors/devel/instrumentation.md
@@ -1,215 +1,3 @@
-## Instrumenting Kubernetes
-
-The following outlines general guidelines and references for metric instrumentation
-in Kubernetes components. Components are instrumented using the
-[Prometheus Go client library](https://github.com/prometheus/client_golang). For non-Go
-components, [libraries in other languages](https://prometheus.io/docs/instrumenting/clientlibs/)
-are available.
-
-The metrics are exposed via HTTP in the
-[Prometheus metric format](https://prometheus.io/docs/instrumenting/exposition_formats/),
-which is open and well-understood by a wide range of third party applications and vendors
-outside of the Prometheus eco-system.
-
-The [general instrumentation advice](https://prometheus.io/docs/practices/instrumentation/)
-from the Prometheus documentation applies. This document reiterates common pitfalls and some
-Kubernetes specific considerations.
-
-Prometheus metrics are cheap as they have minimal internal memory state. Set and increment
-operations are thread safe and take 10-25 nanoseconds (Go & Java).
-Thus, instrumentation can and should cover all operationally relevant aspects of an application,
-internal and external.
-
-## Quick Start
-
-The following describes the basic steps required to add a new metric (in Go).
-
-1. Import "github.com/prometheus/client_golang/prometheus".
-
-2. Create a top-level var to define the metric. For this, you have to:
-
- 1. Pick the type of metric. Use a Gauge for things you want to set to a
-particular value, a Counter for things you want to increment, or a Histogram or
-Summary for histograms/distributions of values (typically for latency).
-Histograms are better if you're going to aggregate the values across jobs, while
-summaries are better if you just want the job to give you a useful summary of
-the values.
- 2. Give the metric a name and description.
- 3. Pick whether you want to distinguish different categories of things using
-labels on the metric. If so, add "Vec" to the name of the type of metric you
-want and add a slice of the label names to the definition.
-
- [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L53)
- ```go
- requestCounter = prometheus.NewCounterVec(
- prometheus.CounterOpts{
- Name: "apiserver_request_count",
- Help: "Counter of apiserver requests broken out for each verb, API resource, client, and HTTP response code.",
- },
- []string{"verb", "resource", "client", "code"},
- )
- ```
-
-3. Register the metric so that prometheus will know to export it.
-
- [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L78)
- ```go
- func init() {
- prometheus.MustRegister(requestCounter)
- prometheus.MustRegister(requestLatencies)
- prometheus.MustRegister(requestLatenciesSummary)
- }
- ```
-
-4. Use the metric by calling the appropriate method for your metric type (Set,
-Inc/Add, or Observe, respectively for Gauge, Counter, or Histogram/Summary),
-first calling WithLabelValues if your metric has any labels.
-
- [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L87)
- ```go
- requestCounter.WithLabelValues(*verb, *resource, client, strconv.Itoa(*httpCode)).Inc()
- ```
-
-
-## Instrumentation types
-
-Components have metrics capturing events and states that are inherent to their
-application logic. Examples are request and error counters, request latency
-histograms, or internal garbage collection cycles. Those metrics are instrumented
-directly in the application code.
-
-Secondly, there are business logic metrics. Those are not about observed application
-behavior but abstract system state, such as desired replicas for a deployment.
-They are not directly instrumented but collected from otherwise exposed data.
-
-In Kubernetes they are generally captured in the [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics)
-component, which reads them from the API server.
-For this type of metric exposition, the
-[exporter guidelines](https://prometheus.io/docs/instrumenting/writing_exporters/)
-apply additionally.
-
-## Naming
-
-Metrics added directly by application or package code should have a unique name.
-This avoids collisions between metrics added via dependencies. Unique names also clearly
-distinguish metrics collected with different semantics. This is achieved through
-prefixes:
-
-```
-<component_name>_<metric>
-```
-
-For example, suppose the kubelet instruments its HTTP requests but also uses
-an HTTP router package that provides its own instrumentation. Both expose metrics on total
-HTTP requests. They should be distinguishable, as in:
-
-```
-kubelet_http_requests_total{path="/some/path",status="200"}
-routerpkg_http_requests_total{path="/some/path",status="200",method="GET"}
-```
-
-As we can see, they expose different labels, and thus a naming collision
-could not have been resolved even if both metrics counted the exact same
-requests.
-
-Resource objects that occur in names should inherit the spelling that is used
-in kubectl, i.e. daemon sets are `daemonset` rather than `daemon_set`.
-
-## Dimensionality & Cardinality
-
-Metrics can often replace more expensive logging as they are time-aggregated
-over a sampling interval. The [multidimensional data model](https://prometheus.io/docs/concepts/data_model/)
-enables deep insights and all metrics should use those label dimensions
-where appropriate.
-
-A common error that often causes performance issues in the ingesting metrics
-system is choosing dimensions that inhibit or eliminate time aggregation
-by being too specific. Typically those are user IDs or error messages.
-More generally: one should know a comprehensive list of all possible values
-for a label at instrumentation time.
-
-Notable exceptions are exporters like kube-state-metrics, which expose per-pod
-or per-deployment metrics, which are theoretically unbounded over time as one could
-constantly create new ones with new names. However, they have
-a reasonable upper bound for a given size of infrastructure they refer to and
-its typical frequency of changes.
-
-In general, “external” labels like pod or node name do not belong in the
-instrumentation itself. They are to be attached to metrics by the collecting
-system that has the external knowledge ([blog post](https://www.robustperception.io/target-labels-are-for-life-not-just-for-christmas/)).
-
-## Normalization
-
-Metrics should be normalized with respect to their dimensions. They should
-expose the minimal set of labels, each of which provides additional information.
-Labels that are composed from values of different labels are not desirable.
-For example:
-
-```
-example_metric{pod="abc",container="proxy",container_long="abc/proxy"}
-```
-
-It often seems tempting to add additional meta information about an object
-to all metrics about that object, e.g.:
-
-```
-kube_pod_container_restarts{namespace=...,pod=...,container=...}
-```
-
-A common use case is wanting to look at such metrics with respect to the node the
-pod is scheduled on. So it seems convenient to add a “node” label.
-
-```
-kube_pod_container_restarts{namespace=...,pod=...,container=...,node=...}
-```
-
-This, however, only caters to one specific query use case. There are many more
-pieces of metadata that could be added, effectively blowing up the instrumentation.
-They are also not guaranteed to be stable over time. What if pods at some
-point can be live migrated?
-Those pieces of information should be normalized into an info-level metric
-([blog post](https://www.robustperception.io/exposing-the-software-version-to-prometheus/)),
-which is always set to 1. For example:
-
-```
-kube_pod_info{pod=...,namespace=...,pod_ip=...,host_ip=...,node=..., ...}
-```
-
-The metric system can later denormalize those along the identifying
-“pod” and “namespace” labels. This leads to...
-
-## Resource Referencing
-
-It is often desirable to correlate different metrics about a common object,
-such as a pod. Label dimensions can be used to match up different metrics.
-This is easiest if label names and values follow a common pattern.
-For metrics exposed by the same application, that often happens naturally.
-
-For a system composed of several independent, pluggable components,
-it makes sense to set cross-component standards to allow easy querying in
-metric systems without extensive post-processing of data.
-In Kubernetes, those are the resource objects such as deployments,
-pods, or services and the namespace they belong to.
-
-The following should be consistently used:
-
-```
-example_metric_ccc{pod="example-app-5378923", namespace="default"}
-```
-
-An object is referenced by its unique name in a label named after the resource
-itself (i.e. `pod`/`deployment`/... and not `pod_name`/`deployment_name`)
-and the namespace it belongs to in the `namespace` label.
-
-Note: namespace/name combinations are only unique at a certain point in time.
-For time series this is given by the timestamp associated with any data point.
-UUIDs are truly unique but not convenient to use in user-facing time series
-queries.
-They can still be incorporated using an info level metric as described above for
-`kube_pod_info`. A query to a metric system selecting by UUID via the info level
-metric could look as follows:
-
-```
-kube_pod_restarts and on(namespace, pod) kube_pod_info{uuid="ABC"}
-```
+This file has moved to https://git.k8s.io/community/contributors/devel/sig-instrumentation/instrumentation.md.
+This file is a placeholder to preserve links. Please remove by April 28, 2019 or the release of kubernetes 1.13, whichever comes first.
\ No newline at end of file
diff --git a/contributors/devel/logging.md b/contributors/devel/logging.md
index c4da6829..d857bc64 100644
--- a/contributors/devel/logging.md
+++ b/contributors/devel/logging.md
@@ -1,34 +1,3 @@
-## Logging Conventions
+This file has moved to https://git.k8s.io/community/contributors/devel/sig-instrumentation/logging.md.
-The following are the conventions for which klog levels to use.
-[klog](http://godoc.org/github.com/kubernetes/klog) is globally preferred to
-[log](http://golang.org/pkg/log/) for better runtime control.
-
-* klog.Errorf() - Always an error
-
-* klog.Warningf() - Something unexpected, but probably not an error
-
-* klog.Infof() has multiple levels:
- * klog.V(0) - Generally useful for this to ALWAYS be visible to an operator
- * Programmer errors
- * Logging extra info about a panic
- * CLI argument handling
- * klog.V(1) - A reasonable default log level if you don't want verbosity.
- * Information about config (listening on X, watching Y)
- * Errors that repeat frequently and relate to conditions that can be corrected (pod detected as unhealthy)
- * klog.V(2) - Useful steady state information about the service and important log messages that may correlate to significant changes in the system. This is the recommended default log level for most systems.
- * Logging HTTP requests and their exit code
- * System state changing (killing pod)
- * Controller state change events (starting pods)
- * Scheduler log messages
- * klog.V(3) - Extended information about changes
- * More info about system state changes
- * klog.V(4) - Debug level verbosity
- * Logging in particularly thorny parts of code where you may want to come back later and check it
- * klog.V(5) - Trace level verbosity
- * Context to understand the steps leading up to errors and warnings
- * More information for troubleshooting reported issues
-
-As per the comments, the practical default level is V(2). Developers and QE
-environments may wish to run at V(3) or V(4). If you wish to change the log
-level, you can pass in `-v=X` where X is the desired maximum level to log.
+This file is a placeholder to preserve links. Please remove by April 28, 2019 or the release of kubernetes 1.13, whichever comes first.
\ No newline at end of file
diff --git a/contributors/devel/sig-instrumentation/event-style-guide.md b/contributors/devel/sig-instrumentation/event-style-guide.md
new file mode 100644
index 00000000..bc4ba22b
--- /dev/null
+++ b/contributors/devel/sig-instrumentation/event-style-guide.md
@@ -0,0 +1,51 @@
+# Event style guide
+
+Status: During Review
+
+Author: Marek Grabowski (gmarek@)
+
+## Why the guide?
+
+The Event API change proposal is the first step towards having useful Events in the system. Another step is to formalize the Event style guide, i.e. a set of properties that developers need to ensure when adding new Events to the system. This is necessary to ensure that we have a system in which all components emit consistently structured Events.
+
+## When to emit an Event?
+
+Events are expected to provide important insights for the application developer/operator on the state of their application. Events relevant to cluster administrators are acceptable as well, though administrators usually also have the option of looking at component logs. Events are much more expensive than logs, so they're not expected to provide in-depth system debugging information. Instead, concentrate on things that are important from the application developer's perspective. Events need to be either actionable or useful for understanding past or future system behavior. Events are not intended to drive automation; watching resource status should be sufficient for controllers.
+
+Following are the guidelines for adding Events to the system. Those are not hard-and-fast rules, but should be considered by all contributors adding new Events and members doing reviews.
+1. Emit Events only when the state of the system changes or an attempt to change it is made. Events like "it's still running" are not interesting. Also, changes that do not add information beyond what is observable by watching the altered resources should not be duplicated as events. Note that adding a reason for some action that can't be inferred from the state change is considered additional information.
+1. Limit Events to no more than one per change/attempt. There's no need for Events on "About to do X" AND "Did X"/"Failed to do X". The result is more interesting and implies an attempt.
+ 1. It may give the impression that this gets tricky with scale events, e.g. a Deployment scales a ReplicaSet which creates/deletes Pods. For us those are 3 (or more) separate Events (3 different objects are affected), so it's fine to emit multiple Events.
+1. When an error occurs that prevents a user application from starting or from enacting other normal system behavior, such as object creation, an Event should be emitted (e.g. invalid image).
+ 1. Note that Events are garbage collected so every user-actionable error needs to be surfaced via resource status as well.
+ 1. It's usually OK to emit failure Events for each failure. The dedup mechanism will deal with that. The exception is failures that are frequent but typically ephemeral and automatically repairable/recoverable, such as broken socket connections, in which case they should only be reported if persistent and unrepairable, in order to mitigate event spam.
+1. When a user application stops running for any reason, an Event should be emitted (e.g. Pod evicted because Node is under memory pressure).
+1. If it's a system-wide change of state that may impact currently running applications or may have a severe impact on future workload schedulability, an Event should be emitted (e.g. Node became unreachable, failed to create route for Node).
+1. If it doesn't fit any of the above scenarios, you should consider not emitting an Event.
+
+## How to structure an Event?
+The new Event API tries to use more descriptive field names to influence how Events are structured. An Event has the following fields:
+* Regarding
+* Related
+* ReportingController
+* ReportingInstance
+* Action
+* Reason
+* Type
+* Note
+
+The Event should be structured in a way that the following sentence "makes sense":
+"Regarding <Event.Regarding>: <Event.Action> <Event.Related> - <Event.Reason>", e.g.
+* Regarding Node X: BecameNotReady - NodeUnreachable
+* Regarding Pod X: ScheduledOnNode Node Y - <nil>
+* Regarding PVC X: BoundToNode Node Y - <nil>
+* Regarding Pod X: KilledContainer Container Y - NodeMemoryPressure
+
+1. ReportingController is the type of the Controller reporting an Event, e.g. k8s.io/node-controller, k8s.io/kubelet. There will be a standard list of controller names for Kubernetes components. Third-party components must namespace themselves in the same manner as label keys. Validation ensures it's a proper qualified name. This shouldn’t be needed in order for users to understand the event, but is provided in case the controller’s logs need to be accessed for further debugging.
+1. ReportingInstance is an identifier of the instance of the ReportingController and needs to uniquely identify it, i.e. a host name can be used only for controllers that are guaranteed to be unique on the host. This requirement isn't met e.g. for the scheduler, so it may need a secondary index. For singleton controllers use the Node name (or the hostname if the controller is not running on a Node). Can have at most 128 alphanumeric characters.
+1. Regarding and Related are ObjectReferences. Regarding should represent the object that's implemented by the ReportingController; Related can contain additional information about another object that takes part in or is affected by the Action (see examples).
+1. Action is a low-cardinality (meaning that there's a restricted, predefined set of values allowed) CamelCase string field (i.e. its value has to be determined at compile time) that explains what happened with Regarding/what action the ReportingController took in Regarding's name. The tuple of {ReportingController, Action, Reason} must be unique, such that a user could look up documentation. Can have at most 128 characters.
+1. Reason is a low-cardinality CamelCase string field (i.e. its value has to be determined at compile time) that explains why ReportingController took Action. Can have at most 128 characters.
+1. Type can be either "Normal" or "Warning". "Warning" types are reserved for Events that represent a situation that's not expected in a healthy cluster and/or healthy workload: something unexpected and/or undesirable, at least if it occurs frequently enough and/or for a long enough duration.
+1. Note can contain an arbitrary, high-cardinality, user readable summary of the Event. This field can lose data if deduplication is triggered. Can have at most 1024 characters.
+
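To make the field semantics above concrete, here is a rough Go sketch (not part of the moved file) of an Event that passes the sentence test "Regarding Pod X: KilledContainer Container Y - NodeMemoryPressure". It assumes the events.k8s.io/v1beta1 Go types; the exact field names, required fields, and all values shown are illustrative only.

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	eventsv1beta1 "k8s.io/api/events/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// exampleEvent sketches an Event whose fields read as the sentence
// "Regarding Pod X: KilledContainer Container Y - NodeMemoryPressure".
func exampleEvent() eventsv1beta1.Event {
	return eventsv1beta1.Event{
		ObjectMeta:          metav1.ObjectMeta{Namespace: "default", Name: "pod-x.example"},
		ReportingController: "k8s.io/kubelet",     // standardized controller name
		ReportingInstance:   "kubelet-node-1",     // uniquely identifies this instance, <=128 chars
		Action:              "KilledContainer",    // low-cardinality, CamelCase, known at compile time
		Reason:              "NodeMemoryPressure", // low-cardinality, CamelCase
		Type:                "Warning",            // not expected in a healthy cluster
		Regarding: corev1.ObjectReference{
			Kind: "Pod", Namespace: "default", Name: "pod-x",
		},
		Related: &corev1.ObjectReference{
			Kind: "Pod", Namespace: "default", Name: "pod-x",
			FieldPath: "spec.containers{container-y}", // the affected container
		},
		Note: "Container container-y was killed because the Node is under memory pressure.",
	}
}
```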
diff --git a/contributors/devel/sig-instrumentation/instrumentation.md b/contributors/devel/sig-instrumentation/instrumentation.md
new file mode 100644
index 00000000..b0a11193
--- /dev/null
+++ b/contributors/devel/sig-instrumentation/instrumentation.md
@@ -0,0 +1,215 @@
+## Instrumenting Kubernetes
+
+The following outlines general guidelines and references for metric instrumentation
+in Kubernetes components. Components are instrumented using the
+[Prometheus Go client library](https://github.com/prometheus/client_golang). For non-Go
+components, [libraries in other languages](https://prometheus.io/docs/instrumenting/clientlibs/)
+are available.
+
+The metrics are exposed via HTTP in the
+[Prometheus metric format](https://prometheus.io/docs/instrumenting/exposition_formats/),
+which is open and well-understood by a wide range of third party applications and vendors
+outside of the Prometheus eco-system.
+
+The [general instrumentation advice](https://prometheus.io/docs/practices/instrumentation/)
+from the Prometheus documentation applies. This document reiterates common pitfalls and some
+Kubernetes specific considerations.
+
+Prometheus metrics are cheap as they have minimal internal memory state. Set and increment
+operations are thread safe and take 10-25 nanoseconds (Go & Java).
+Thus, instrumentation can and should cover all operationally relevant aspects of an application,
+internal and external.
+
+## Quick Start
+
+The following describes the basic steps required to add a new metric (in Go).
+
+1. Import "github.com/prometheus/client_golang/prometheus".
+
+2. Create a top-level var to define the metric. For this, you have to:
+
+ 1. Pick the type of metric. Use a Gauge for things you want to set to a
+particular value, a Counter for things you want to increment, or a Histogram or
+Summary for histograms/distributions of values (typically for latency).
+Histograms are better if you're going to aggregate the values across jobs, while
+summaries are better if you just want the job to give you a useful summary of
+the values.
+ 2. Give the metric a name and description.
+ 3. Pick whether you want to distinguish different categories of things using
+labels on the metric. If so, add "Vec" to the name of the type of metric you
+want and add a slice of the label names to the definition.
+
+ [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L53)
+ ```go
+ requestCounter = prometheus.NewCounterVec(
+ prometheus.CounterOpts{
+ Name: "apiserver_request_count",
+ Help: "Counter of apiserver requests broken out for each verb, API resource, client, and HTTP response code.",
+ },
+ []string{"verb", "resource", "client", "code"},
+ )
+ ```
+
+3. Register the metric so that prometheus will know to export it.
+
+ [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L78)
+ ```go
+ func init() {
+ prometheus.MustRegister(requestCounter)
+ prometheus.MustRegister(requestLatencies)
+ prometheus.MustRegister(requestLatenciesSummary)
+ }
+ ```
+
+4. Use the metric by calling the appropriate method for your metric type (Set,
+Inc/Add, or Observe, respectively for Gauge, Counter, or Histogram/Summary),
+first calling WithLabelValues if your metric has any labels.
+
+ [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L87)
+ ```go
+ requestCounter.WithLabelValues(*verb, *resource, client, strconv.Itoa(*httpCode)).Inc()
+ ```
+
+
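Putting the four steps together, a minimal self-contained sketch might look like the following. This is illustrative only, not code from the Kubernetes tree; the metric name, labels, and the `promhttp` exposition endpoint are assumptions for the example.

```go
package main

import (
	"net/http"
	"strconv"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Step 2: define the metric as a top-level var; "Vec" because it carries labels.
var requestCounter = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "example_component_http_requests_total",
		Help: "Counter of HTTP requests broken out by verb and HTTP response code.",
	},
	[]string{"verb", "code"},
)

// Step 3: register the metric so it is exported.
func init() {
	prometheus.MustRegister(requestCounter)
}

func handle(w http.ResponseWriter, r *http.Request) {
	code := http.StatusOK
	w.WriteHeader(code)
	// Step 4: choose the label values, then increment.
	requestCounter.WithLabelValues(r.Method, strconv.Itoa(code)).Inc()
}

func main() {
	http.HandleFunc("/", handle)
	// Expose all registered metrics in the Prometheus text format.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```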
+## Instrumentation types
+
+Components have metrics capturing events and states that are inherent to their
+application logic. Examples are request and error counters, request latency
+histograms, or internal garbage collection cycles. Those metrics are instrumented
+directly in the application code.
+
+Secondly, there are business logic metrics. Those are not about observed application
+behavior but abstract system state, such as desired replicas for a deployment.
+They are not directly instrumented but collected from otherwise exposed data.
+
+In Kubernetes they are generally captured in the [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics)
+component, which reads them from the API server.
+For this type of metric exposition, the
+[exporter guidelines](https://prometheus.io/docs/instrumenting/writing_exporters/)
+apply additionally.
+
+## Naming
+
+Metrics added directly by application or package code should have a unique name.
+This avoids collisions between metrics added via dependencies. Unique names also clearly
+distinguish metrics collected with different semantics. This is achieved through
+prefixes:
+
+```
+<component_name>_<metric>
+```
+
+For example, suppose the kubelet instruments its HTTP requests but also uses
+an HTTP router package that provides its own instrumentation. Both expose metrics on total
+HTTP requests. They should be distinguishable, as in:
+
+```
+kubelet_http_requests_total{path="/some/path",status="200"}
+routerpkg_http_requests_total{path="/some/path",status="200",method="GET"}
+```
+
+As we can see, they expose different labels, and thus a naming collision
+could not have been resolved even if both metrics counted the exact same
+requests.
+
+Resource objects that occur in names should inherit the spelling that is used
+in kubectl, i.e. daemon sets are `daemonset` rather than `daemon_set`.
+
+## Dimensionality & Cardinality
+
+Metrics can often replace more expensive logging as they are time-aggregated
+over a sampling interval. The [multidimensional data model](https://prometheus.io/docs/concepts/data_model/)
+enables deep insights and all metrics should use those label dimensions
+where appropriate.
+
+A common error that often causes performance issues in the ingesting metrics
+system is choosing dimensions that inhibit or eliminate time aggregation
+by being too specific. Typically those are user IDs or error messages.
+More generally: one should know a comprehensive list of all possible values
+for a label at instrumentation time.
+
+Notable exceptions are exporters like kube-state-metrics, which expose per-pod
+or per-deployment metrics, which are theoretically unbounded over time as one could
+constantly create new ones with new names. However, they have
+a reasonable upper bound for a given size of infrastructure they refer to and
+its typical frequency of changes.
+
+In general, “external” labels like pod or node name do not belong in the
+instrumentation itself. They are to be attached to metrics by the collecting
+system that has the external knowledge ([blog post](https://www.robustperception.io/target-labels-are-for-life-not-just-for-christmas/)).
+
+## Normalization
+
+Metrics should be normalized with respect to their dimensions. They should
+expose the minimal set of labels, each of which provides additional information.
+Labels that are composed from values of different labels are not desirable.
+For example:
+
+```
+example_metric{pod="abc",container="proxy",container_long="abc/proxy"}
+```
+
+It often seems tempting to add additional meta information about an object
+to all metrics about that object, e.g.:
+
+```
+kube_pod_container_restarts{namespace=...,pod=...,container=...}
+```
+
+A common use case is wanting to look at such metrics with respect to the node the
+pod is scheduled on. So it seems convenient to add a “node” label.
+
+```
+kube_pod_container_restarts{namespace=...,pod=...,container=...,node=...}
+```
+
+This, however, only caters to one specific query use case. There are many more
+pieces of metadata that could be added, effectively blowing up the instrumentation.
+They are also not guaranteed to be stable over time. What if pods at some
+point can be live migrated?
+Those pieces of information should be normalized into an info-level metric
+([blog post](https://www.robustperception.io/exposing-the-software-version-to-prometheus/)),
+which is always set to 1. For example:
+
+```
+kube_pod_info{pod=...,namespace=...,pod_ip=...,host_ip=...,node=..., ...}
+```
+
+The metric system can later denormalize those along the identifying
+“pod” and “namespace” labels. This leads to...
+
+## Resource Referencing
+
+It is often desirable to correlate different metrics about a common object,
+such as a pod. Label dimensions can be used to match up different metrics.
+This is easiest if label names and values follow a common pattern.
+For metrics exposed by the same application, that often happens naturally.
+
+For a system composed of several independent, pluggable components,
+it makes sense to set cross-component standards to allow easy querying in
+metric systems without extensive post-processing of data.
+In Kubernetes, those are the resource objects such as deployments,
+pods, or services and the namespace they belong to.
+
+The following should be consistently used:
+
+```
+example_metric_ccc{pod="example-app-5378923", namespace="default"}
+```
+
+An object is referenced by its unique name in a label named after the resource
+itself (i.e. `pod`/`deployment`/... and not `pod_name`/`deployment_name`)
+and the namespace it belongs to in the `namespace` label.
+
+Note: namespace/name combinations are only unique at a certain point in time.
+For time series this is given by the timestamp associated with any data point.
+UUIDs are truly unique but not convenient to use in user-facing time series
+queries.
+They can still be incorporated using an info level metric as described above for
+`kube_pod_info`. A query to a metric system selecting by UUID via the info level
+metric could look as follows:
+
+```
+kube_pod_restarts and on(namespace, pod) kube_pod_info{uuid="ABC"}
+```
+
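As a complement to the normalization and resource-referencing guidance above, here is a small hypothetical Go sketch of the info-level metric pattern; the metric name, labels, and helper function are invented for the example (kube-state-metrics has its own, more complete implementation).

```go
package example

import "github.com/prometheus/client_golang/prometheus"

// podInfo is an info-level metric: its value is always 1 and the
// non-identifying metadata travels in the labels.
var podInfo = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "example_pod_info",
		Help: "Information about a pod; the value is always 1.",
	},
	[]string{"namespace", "pod", "node", "host_ip", "pod_ip"},
)

func init() {
	prometheus.MustRegister(podInfo)
}

// recordPodInfo exposes metadata for one pod. Other pod metrics stay
// normalized and are joined on the identifying labels ("namespace", "pod")
// at query time, as in the UUID query shown earlier.
func recordPodInfo(namespace, pod, node, hostIP, podIP string) {
	podInfo.WithLabelValues(namespace, pod, node, hostIP, podIP).Set(1)
}
```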
diff --git a/contributors/devel/sig-instrumentation/logging.md b/contributors/devel/sig-instrumentation/logging.md
new file mode 100644
index 00000000..c4da6829
--- /dev/null
+++ b/contributors/devel/sig-instrumentation/logging.md
@@ -0,0 +1,34 @@
+## Logging Conventions
+
+The following are the conventions for which klog levels to use.
+[klog](http://godoc.org/github.com/kubernetes/klog) is globally preferred to
+[log](http://golang.org/pkg/log/) for better runtime control.
+
+* klog.Errorf() - Always an error
+
+* klog.Warningf() - Something unexpected, but probably not an error
+
+* klog.Infof() has multiple levels:
+ * klog.V(0) - Generally useful for this to ALWAYS be visible to an operator
+ * Programmer errors
+ * Logging extra info about a panic
+ * CLI argument handling
+ * klog.V(1) - A reasonable default log level if you don't want verbosity.
+ * Information about config (listening on X, watching Y)
+ * Errors that repeat frequently and relate to conditions that can be corrected (pod detected as unhealthy)
+ * klog.V(2) - Useful steady state information about the service and important log messages that may correlate to significant changes in the system. This is the recommended default log level for most systems.
+ * Logging HTTP requests and their exit code
+ * System state changing (killing pod)
+ * Controller state change events (starting pods)
+ * Scheduler log messages
+ * klog.V(3) - Extended information about changes
+ * More info about system state changes
+ * klog.V(4) - Debug level verbosity
+ * Logging in particularly thorny parts of code where you may want to come back later and check it
+ * klog.V(5) - Trace level verbosity
+ * Context to understand the steps leading up to errors and warnings
+ * More information for troubleshooting reported issues
+
+As per the comments, the practical default level is V(2). Developers and QE
+environments may wish to run at V(3) or V(4). If you wish to change the log
+level, you can pass in `-v=X` where X is the desired maximum level to log.
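For illustration only (not taken from any Kubernetes component), a component wired up with klog might use the levels above roughly like this; the messages and flag handling are made up for the example.

```go
package main

import (
	"flag"

	"k8s.io/klog"
)

func main() {
	klog.InitFlags(nil) // registers -v and the other klog flags
	flag.Parse()        // e.g. run with -v=3 to enable V(3) and below
	defer klog.Flush()

	klog.V(1).Infof("listening on %s", ":8080")              // config information
	klog.V(2).Infof("killing pod %s/%s", "default", "web-1") // system state change
	klog.V(4).Infof("retrying sync, attempt %d", 2)          // debug-level detail
	klog.Warningf("unexpected condition, but probably not an error")
	klog.Errorf("always an error: %v", "something failed")
}
```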
diff --git a/contributors/guide/coding-conventions.md b/contributors/guide/coding-conventions.md
index 63cc18ce..ebabbcbf 100644
--- a/contributors/guide/coding-conventions.md
+++ b/contributors/guide/coding-conventions.md
@@ -61,7 +61,7 @@ following Go conventions - `stateLock`, `mapLock` etc.
- [Kubectl conventions](/contributors/devel/kubectl-conventions.md)
- - [Logging conventions](/contributors/devel/logging.md)
+ - [Logging conventions](/contributors/devel/sig-instrumentation/logging.md)
## Testing conventions
diff --git a/sig-instrumentation/charter.md b/sig-instrumentation/charter.md
index d767a706..b5cd7643 100644
--- a/sig-instrumentation/charter.md
+++ b/sig-instrumentation/charter.md
@@ -69,5 +69,5 @@ By SIG Technical Leads
[sig-node]: https://github.com/kubernetes/community/tree/master/sig-node
[sigs.yaml]: https://github.com/kubernetes/community/blob/master/sigs.yaml#L964-L1018
[Kubernetes Charter README]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md
-[instrumenting-kubernetes]: https://github.com/kubernetes/community/blob/master/contributors/devel/instrumentation.md
+[instrumenting-kubernetes]: /contributors/devel/sig-instrumentation/instrumentation.md
[core-metrics-pipeline]: https://kubernetes.io/docs/tasks/debug-application-cluster/core-metrics-pipeline/