Merge pull request #3143 from pivotal-k8s/contrib-test-debug

Detail how to debug CI failures as a contributor
author: Kubernetes Prow Robot <k8s-ci-robot@users.noreply.github.com> 2019-02-22 12:35:51 -0800
committer: GitHub <noreply@github.com> 2019-02-22 12:35:51 -0800
commit: 7efa0627cea7f654d459efccf07d0930f3289371 (patch)
tree: d2e5a6a773fd32106197d9f17461438ad9e3b96c /contributors
parent: 2471f165ae865c757a935f35a22fe3b65cd1d5d7 (diff)
parent: 286e22dbf9a92d44c480bba7028947b142195c68 (diff)
1 files changed, 74 insertions, 0 deletions
diff --git a/contributors/devel/sig-testing/testing.md b/contributors/devel/sig-testing/testing.md
index f35c516f..f6877da4 100644
--- a/contributors/devel/sig-testing/testing.md
+++ b/contributors/devel/sig-testing/testing.md
@@ -197,3 +197,77 @@ Please refer to [Integration Testing in Kubernetes](integration-tests.md).
 ## End-to-End tests
 
 Please refer to [End-to-End Testing in Kubernetes](e2e-tests.md).
+
+## Running your contribution through Kubernetes CI
+Once you open a PR, [`prow`][prow-url] runs pre-submit tests in CI. You can find out more about `prow` in [kubernetes/test-infra][prow-git] and in [this blog post][prow-doc] on the automation involved in testing PRs to Kubernetes.
+
+If you are not a [Kubernetes org member][membership], another org member will need to run [`/ok-to-test`][ok-to-test] on your PR.
+
+Find out more about [other commands][prow-cmds] you can use to interact with prow through GitHub comments.
+
+### Troubleshooting a failure
+Click on `Details` to look at artifacts produced by the test and the cluster under test, to help you debug the failure. These artifacts include:
+- test results
+- metadata on the test run (including versions of binaries used, test duration)
+- output from tests that have failed
+- build log showing the full test run
+- logs from the cluster under test (k8s components such as kubelet and apiserver, possibly other logs such as etcd and kernel)
+- junit xml files
+- test coverage files
+
+If the failure seems unrelated to the change you're submitting:
+- Is it a flake?
+  - Check if a GitHub issue is already open for that flake
+    - If not, open a new one (like [this example][new-issue-example]) and [label it `kind/flake`][kind/flake]
+    - If yes, any help troubleshooting and resolving it is greatly appreciated. See [Helping with known flakes](#helping-with-known-flakes) for how to contribute.
+  - Run [`/retest`][retest] on your PR to re-trigger the tests
+
+- Is it a failure that shouldn't be happening (in other words, is the test expectation now wrong)?
+  - Get in touch with the SIG that your PR is labeled after
+    - preferably as a comment on your PR, by tagging the [GitHub team][k-teams] (for example a [reviewers team for the SIG][k-teams-review])
+    - write your reasoning as to why you think the test is now outdated and should be changed
+    - if you don't get a response within 24 hours, engage with the SIG on their channel on the [Kubernetes Slack](http://slack.k8s.io/) and/or attend one of the [SIG meetings][sig-meetings] to ask for input.
+
+[prow-url]: https://prow.k8s.io
+[prow-git]: https://git.k8s.io/test-infra/prow
+[prow-doc]: https://kubernetes.io/blog/2018/08/29/the-machines-can-do-the-work-a-story-of-kubernetes-testing-ci-and-automating-the-contributor-experience/#enter-prow
+[membership]: https://github.com/kubernetes/community/blob/master/community-membership.md#member
+[k-teams]: https://github.com/orgs/kubernetes/teams
+[k-teams-review]: https://github.com/orgs/kubernetes/teams?utf8=%E2%9C%93&query=review
+[ok-to-test]: https://prow.k8s.io/command-help#ok_to_test
+[prow-cmds]: https://prow.k8s.io/command-help
+[retest]: https://prow.k8s.io/command-help#retest
+[new-issue-example]: https://github.com/kubernetes/kubernetes/issues/71430
+[kind/flake]: https://prow.k8s.io/command-help#kind
+[sig-meetings]: https://github.com/kubernetes/community/blob/master/sig-list.md
+
+#### Helping with known flakes
+For known flakes (i.e. with open GitHub issues against them), the community deeply values help in troubleshooting and resolving them. Starting points could be:
+- add logs from the failed run you experienced, and any other context to the existing discussion
+- if you spot a pattern or identify a root cause, notify or collaborate with the SIG that owns that area to resolve it
+
+#### Escalating failures to a SIG
+- Figure out the corresponding SIG from the test name/description
+- Mention the SIG's GitHub handle on the issue, and optionally `cc` the SIG's chair(s) (locate them under kubernetes/community/sig-<name\>)
+- Optionally (or if you haven't heard back on the issue after 24h) reach out to the SIG on Slack
+
+### Testgrid
+[`testgrid`](https://testgrid.k8s.io/) is a visualization of the Kubernetes CI status.
+
+It is useful as a way to:
+- see the run history of a test you are debugging (access it starting from a Gubernator report for that test)
+- get an overview of the project's general health
+
+`testgrid` is organized into:
+- tests
+  - collection of assertions in a test file
+  - each test is typically owned by a single SIG
+  - each test is represented as a row on the grid
+- jobs
+  - collection of tests
+  - each job is typically owned by a single SIG
+  - each job is represented as a tab
+- dashboards
+  - collection of jobs
+  - each dashboard is represented as a button
+  - some dashboards collect jobs/tests in the domain of a specific SIG (named after and owned by that SIG), while others monitor project-wide health (owned by SIG-release)