imported docs/devel as a subtree

author: Michelle Noorali <michelle@deis.com> 2016-11-30 14:43:43 -0500
committer: Michelle Noorali <michelle@deis.com> 2016-11-30 14:43:43 -0500
commit: 05d50ebda151b779f6aff8515f480de495ecb713 (patch)
tree: 396df914af2444cf24723247d05e197df5aaf638 /contributors/devel/node-performance-testing.md
parent: ec3349f0e26146b418da10128e14defae963766b (diff)
parent: 7b116eb6113bff36074cc2d06c10a39973b2610f (diff)
1 files changed, 127 insertions, 0 deletions
diff --git a/contributors/devel/node-performance-testing.md b/contributors/devel/node-performance-testing.md
new file mode 100644
index 00000000..d6bb657f
--- /dev/null
+++ b/contributors/devel/node-performance-testing.md
@@ -0,0 +1,127 @@
+# Measuring Node Performance
+
+This document outlines the issues and pitfalls of measuring Node performance, as
+well as the tools available.
+
+## Cluster Set-up
+
+Many factors can affect node performance numbers, so take care to set up the
+cluster so that it supports the intended measurements. In addition to
+considering the following steps, it is important to document precisely which
+setup was used. For example, performance can vary wildly from commit to commit,
+so it is very important to **document which commit or version** of Kubernetes
+was used, which Docker version was used, etc.
+
+### Addon pods
+
+Be aware of which addon pods are running on which nodes. By default Kubernetes
+runs 8 addon pods, plus another 2 per node (`fluentd-elasticsearch` and
+`kube-proxy`) in the `kube-system` namespace. The addon pods can be disabled for
+more consistent results, but doing so can also have performance implications.
+
+For example, Heapster polls each node regularly to collect stats data. Disabling
+Heapster will hide the performance cost of serving those stats in the Kubelet.
+
+#### Disabling Add-ons
+
+Disabling addons is simple. Just ssh into the Kubernetes master and move the
+addon from `/etc/kubernetes/addons/` to a backup location. More details
+[here](../../cluster/addons/).
+
+### Which / how many pods?
+
+Performance will vary a lot between a node with 0 pods and a node with 100 pods.
+In many cases you'll want to make measurements with several different numbers of
+pods. On a single-node cluster, scaling a replication controller makes this
+easy; just make sure the system reaches a steady state before starting the
+measurement. E.g. `kubectl scale replicationcontroller pause --replicas=100`
+
+In most cases pause pods will yield the most consistent measurements since the
+system will not be affected by pod load. However, in some special cases
+Kubernetes has been tuned to optimize for pods that are not doing anything, as
+in the cAdvisor housekeeping (stats gathering). In these cases, performing a
+very light task (such as a simple network ping) can make a difference.
+
+Finally, you should also consider which features your pods should be using. For
+example, if you want to measure performance with probing, you should use pods
+with liveness or readiness probes configured. Likewise for volumes, number of
+containers, etc.
+
+### Other Tips
+
+**Number of nodes** - On the one hand, it can be easier to manage logs, pods,
+environment etc. with a single node to worry about. On the other hand, having
+multiple nodes will let you gather more data in parallel for more robust
+sampling.
+
+## E2E Performance Test
+
+There is an end-to-end test for collecting overall resource usage of node
+components: [kubelet_perf.go](../../test/e2e/kubelet_perf.go). To
+run the test, make sure you have an e2e cluster running (`go run
+hack/e2e.go -up`) and that it is [set up](#cluster-set-up) correctly.
+
+Run the test with `go run hack/e2e.go -v -test
+--test_args="--ginkgo.focus=resource\susage\stracking"`. You may also wish to
+customise the number of pods or other parameters of the test (remember to rerun
+`make WHAT=test/e2e/e2e.test` after you do).
+
+## Profiling
+
+Kubelet installs the [Go pprof handlers](https://golang.org/pkg/net/http/pprof/),
+which can be queried for CPU profiles:
+
+```console
+$ kubectl proxy &
+Starting to serve on 127.0.0.1:8001
+$ curl -G "http://localhost:8001/api/v1/proxy/nodes/${NODE}:10250/debug/pprof/profile?seconds=${DURATION_SECONDS}" > $OUTPUT
+$ KUBELET_BIN=_output/dockerized/bin/linux/amd64/kubelet
+$ go tool pprof -web $KUBELET_BIN $OUTPUT
+```
+
+`pprof` can also provide heap usage, from the `/debug/pprof/heap` endpoint
+(e.g. `http://localhost:8001/api/v1/proxy/nodes/${NODE}:10250/debug/pprof/heap`).
+
+More information on Go profiling can be found
+[here](http://blog.golang.org/profiling-go-programs).
+
+## Benchmarks
+
+Before jumping through all the hoops to measure a live Kubernetes node in a real
+cluster, it is worth considering whether the data you need can be gathered
+through a benchmark test. Go provides a simple benchmarking mechanism; just add
+a benchmark function of the form:
+
+```go
+// In foo_test.go
+func BenchmarkFoo(b *testing.B) {
+  b.StopTimer()
+  setupFoo() // Perform any global setup
+  b.StartTimer()
+  for i := 0; i < b.N; i++ {
+    foo() // Functionality to measure
+  }
+}
+```
+
+Then:
+
+```console
+$ go test -bench=. -benchtime=${BENCH_SECONDS}s foo_test.go
+```
+
+More details on benchmarking [here](https://golang.org/pkg/testing/).
+
+## TODO
+
+- (taotao) Measuring docker performance
+- Expand cluster set-up section
+- (vishh) Measuring disk usage
+- (yujuhong) Measuring memory usage
+- Add section on monitoring kubelet metrics (e.g. with prometheus)
+
+
+
+<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
+[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/docs/devel/node-performance-testing.md?pixel)]()
+<!-- END MUNGE: GENERATED_ANALYTICS -->