summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorKubernetes Submit Queue <k8s-merge-robot@users.noreply.github.com>2017-11-15 15:05:51 -0800
committerGitHub <noreply@github.com>2017-11-15 15:05:51 -0800
commit75cf4f0b3be73c44d6d729d979d67ae685a65af5 (patch)
tree6652ba961489fe4a2e2f56ae5ea2d201c4ffacc3
parent4fed796330f3866603dbdec56eaef7d780a4f117 (diff)
parent9fb713783671b7e5974450804ed960b645f47899 (diff)
Merge pull request #1325 from perotinus/projectplan
Automatic merge from submit-queue. Migrate the cluster registry design and plan to the community repo /cc @csbell @quinton-hoole This is a mechanical conversion of the [Cluster registry project plan doc](https://docs.google.com/document/d/1bVvq9lDIbE-Glyr6GkSGWYkLb2cCNk9bR8LL7Wm-L6g/edit), with a few tweaks to better suit the format and to note the provenance of this document. Quinton and Christian, I'd like both of you to sign-off on this before I commit. Anyone else is free to chime in if they have concerns, but I believe that we achieved consensus on this doc, and I'd like to formalize that. @font @pmorie @ericchiang
-rw-r--r--contributors/design-proposals/multicluster/cluster-registry/project-design-and-plan.md335
1 files changed, 335 insertions, 0 deletions
diff --git a/contributors/design-proposals/multicluster/cluster-registry/project-design-and-plan.md b/contributors/design-proposals/multicluster/cluster-registry/project-design-and-plan.md
new file mode 100644
index 00000000..09685ffd
--- /dev/null
+++ b/contributors/design-proposals/multicluster/cluster-registry/project-design-and-plan.md
@@ -0,0 +1,335 @@
+# Cluster registry design and plan
+
+@perotinus
+
+Updated: 11/2/17
+
+*REVIEWED* in SIG-multicluster meeting on 10/24
+
+*This doc is a Markdown conversion of the original Cluster registry design and
+plan
+[Google doc](https://docs.google.com/document/d/1bVvq9lDIbE-Glyr6GkSGWYkLb2cCNk9bR8LL7Wm-L6g/edit).
+That doc is deprecated, and this one is canonical; however, the old doc will be
+preserved so as not to lose comment and revision history that it contains.*
+
+## Table of Contents
+
+- [Background](#background)
+- [Goal](#goal)
+- [Technical requirements](#technical-requirements)
+ - [Alpha](#alpha)
+ - [Beta](#beta)
+ - [Later](#later)
+- [Implementation design](#implementation-design)
+ - [Alternatives](#alternatives)
+ - [Using a CRD](#using-a-crd)
+- [Tooling design](#tooling-design)
+ - [User tooling](#user-tooling)
+- [Repository process](#repository-process)
+- [Release strategy](#release-strategy)
+ - [Version skew](#version-skew)
+- [Test strategy](#test-strategy)
+- [Milestones and timelines](#milestones-and-timelines)
+ - [Alpha (targeting late Q4 '17)](#alpha-targeting-late-q4-'17)
+ - [Beta (targeting mid Q1 '18)](#beta-targeting-mid-q1-'18)
+ - [Stable (targeting mid Q2 '18)](#stable-targeting-mid-q2-'18)
+ - [Later](#later)
+
+## Background
+
+SIG-multicluster has identified a cluster registry as being a key enabling
+component for multi-cluster use cases. The goal of the SIG in this project is
+that the API defined for the cluster registry become a standard for
+multi-cluster tools. The API design is being discussed in
+[a separate doc](https://drive.google.com/a/google.com/open?id=1Oi9EO3Jwtp69obakl-9YpLkP764GZzsz95XJlX1a960).
+A working prototype of the cluster registry has been assembled in
+[a new repository](https://github.com/kubernetes/cluster-registry).
+
+## Goal
+
+This document intends to lay out an initial plan for moving the cluster registry
+from the prototype state through alpha, beta and eventually to a stable release.
+
+## Technical requirements
+
+These requirements are derived mainly from the output of the
+[August 9th multi-cluster SIG meeting](https://docs.google.com/document/d/11cB3HK67BZUb7aNOCpQK8JsA8Na8F-F_6bFu77KrKT4/edit#heading=h.cuvqls7pl9qc).
+However, they also derive (at least indirectly) from the results of the
+[SIG F2F meeting ](https://docs.google.com/document/d/1HkVBSm9L9UJC2f3wfs_8zt1PJmv6iepdtJ2fmkCOHys/edit)
+that took place earlier, as well as the use cases presented by various parties
+at that meeting.
+
+### Alpha
+
+- [API] Provides an HTTP server with a Kubernetes-style API for CRUD
+ operations on a registry of clusters
+ - Provides a way to filter across clusters by label
+ - Supports annotations
+- [API] Provides information about each cluster's API server's
+ location/endpoint
+- [API] Provides the ability to assign user-friendly names for clusters
+- [API] Provides pointers to authentication information for clusters' API
+ servers
+ - This implies that it does not support storing credentials directly
+- [Implementation] Supports both independent and aggregated deployment models
+ - Supports delegated authn/authz in aggregated mode
+ - Supports integration into Federation via aggregated deployment model
+
+### Beta
+
+- Supports providing a flat file of clusters for storage and later use
+ - This may be provided by the ecosystem rather than the registry API
+ implementation directly
+- Supports independent authz for reading/writing clusters
+- `kubectl` integration with the cluster registry API server is first-class
+ and on par with kubectl integration with core Kubernetes APIs
+- Supports grouping of clusters
+- Supports specifying and enforcing read and/or write authorization for groups
+ of clusters in the registry
+- Working Federation integration
+- Supports status from active controllers
+
+### Later
+
+- Supports an HA deployment strategy
+- Supports guarantees around immutability/identity of clusters in list
+- Version skew between various components is understood and supported skews
+ are defined and tested
+
+## Implementation design
+
+The cluster registry will be implemented using the
+[Kubernetes API machinery](https://github.com/kubernetes/apimachinery). The
+cluster registry API server will be a fork and rework of the existing Federation
+API server, scaled down and simplified to match the simpler set of requirements
+for the cluster registry. It will use the
+[apiserver](https://github.com/kubernetes/apiserver) library, plus some code
+copied from the core Kubernetes repo that provides scaffolding for certain
+features in the API server. This is currently implemented in a prototype form
+in the [cluster-registry repo](https://github.com/kubernetes/cluster-registry).
+
+The API will be implemented using the Kubernetes API machinery, as a new API
+with two objects, `Cluster` and `ClusterList`. Other APIs may be added in the
+future to support future use cases, but the intention is that the cluster
+registry API server remain minimal and only provide the APIs that users of a
+cluster registry would want for the cluster registry.
+
+The cluster registry will not be responsible for storing secrets. It will
+contain pointers to other secret stores which will need to be implemented
+independently. The cluster registry API will not provide proxy access to
+clusters, and will not interact with clusters on a user's behalf. Storing secret
+information in the cluster registry will be heavily discouraged by its
+documentation, and will be considered a misuse of the registry. This allows us
+to sidestep the complexity of implementing a secure credential storage.
+
+The expectation is that Federation and other programs will use the
+cluster-registry as an aggregated API server rather than via direct code
+integration. Therefore, the cluster registry will explicitly support being
+deployed as an API server that can be aggregated by other API servers.
+
+### Alternatives
+
+#### Using a CRD
+
+The cluster registry could be implemented as a CRD that is registered with a
+Kubernetes API server. This implementation is lighter weight than running a full
+API server. If desired, the administrator could then disable the majority of the
+Kubernetes APIs for non-admin users and so make it appear as if the API server
+only supports cluster objects. It should be possible for a user to migrate
+without much effort from a CRD-based to an API-server-based implementation of
+the cluster registry, but the cluster-registry project is not currently planning
+to spend time supporting this use case. CRDs do not support (and may never
+support) versioning, which is very desirable for the cluster registry API. Users
+who wish to use a CRD implementation will have to design and maintain it
+themselves.
+
+## Tooling design
+
+### User tooling
+
+In the alpha stage, the cluster registry will repurpose the kubefed tool from
+Federation and use it to initialize a cluster registry. In early stages, this
+will only create a deployment with one replica, running the API server and etcd
+in a `Pod`. As the HA requirements for the cluster registry are fleshed out,
+this tool may need to be updated or replaced to support deploying a cluster
+registry in multiple clusters and with multiple etcd replicas.
+
+Since the cluster registry is a Kubernetes API server that serves a custom
+resource type, it will be usable by `kubectl`. We expect that the kubectl
+experience for custom APIs will soon be on par with that of core Kubernetes
+APIs, since there has been a significant investment in making `kubectl` provide
+very detailed output from its describe subcommand; and `kubectl` 1.9 is expected
+to introduce API server-originated columns. Therefore, we will not initially
+implement any special tooling for interacting with the registry, and will tell
+users to use `kubectl` or generated client libraries.
+
+## Repository process
+
+The cluster registry is a top-level Kubernetes repository, and thus it requires
+some process to ensure stability for dependent projects and accountability.
+Since the Federation project wants to use the cluster registry instead of its
+current cluster API, there is a requirement to establish process even though the
+project is young and does not yet have a lot of contributors.
+
+The standard Kubernetes Prow bots have been enabled on the repo. There is
+currently some functionality around managing PR approval and reviewer assignment
+and such that does not yet live in Prow, but given the limited number of
+contributors at this point it seems reasonable to wait for sig-testing to
+implement this functionality in Prow rather than enabling the deprecated tools.
+In most cases where a process is necessary, we will defer to the spirit of the
+Kubernetes process (if not the letter) though we will modify it as necessary for
+the scope and scale of the cluster registry project. There is not yet a merge
+queue, and until there is a clear need for one we do not intend to add one.
+
+The code in the repository will use bazel as its build system. This is in-line
+with what the Kubernetes project is attempting to move towards, and since the
+cluster registry has a similar but more limited set of needs than Kubernetes, we
+expect that bazel will support our needs adequately. The structure that bazel
+uses is meant to be compatible with go tooling, so if necessary, we can migrate
+away from bazel in the future without having to entirely revamp the repository
+structure.
+
+There is not currently a good process in Kubernetes for keeping vendored
+dependencies up-to-date. The strategy we expect to take with the cluster
+registry is to update only when necessary, and to use the same dependency
+versions that Kubernetes uses unless there is some particular incompatibility.
+
+## Release strategy
+
+For early versions, the cluster registry release will consist of a container
+image and a client tool. The container image will contain the cluster registry
+API server, and the tool will be used to bootstrap this image plus an etcd image
+into a running cluster. The container will be published to GCR, and the client
+tool releases will be stored in a GCS bucket, following a pattern used by other
+k/ projects.
+
+For now, the release process will be managed mostly manually by repository
+maintainers. We will create a release branch that will be used for releases, and
+use GitHub releases along with some additional per-release documentation
+(`CHANGELOG`, etc). `CHANGELOG`s and release notes will be collected manually
+until the volume of work becomes too great. We do not intend to create multiple
+release branches until the project is more stable. Releases will undergo a
+more-detailed set of tests that will ensure compatibility with recent released
+versions of `kubectl` (for the registry) and Kubernetes (for the cluster
+registry as an aggregated API server). Having a well-defined release process
+will be a stable release requirement, and by that point we expect to have gained
+some practical experience that will make it easier to codify the requirements
+around doing releases. The cluster registry will use semantic versioning, but
+its versions will not map to Kubernetes versions. Cluster registry releases will
+not follow the Kubernetes release cycle, though Kubernetes releases may trigger
+cluster registry releases if there are compatibility issues that need to be
+fixed.
+
+Projects that want to vendor the cluster registry will be able to do so. We
+expect that these projects will vendor from the release branch if they want a
+stable version, or from a desired SHA if they are comfortable using a version
+that has not necessarily been fully vetted.
+
+As the project matures, we expect the tool to evolve (or be replaced) in order
+to support deployment against an existing etcd instance (potentially provided by
+an etcd operator), and to provide a HA story for hosting a cluster registry.
+This is considered future work and will not be addressed directly in this
+document.
+
+Cross-compilation support in bazel is still a work in progress, so the cluster
+registry will not be able to easily provide binary releases for every platform
+until this is supported by bazel. If it becomes necessary to provide
+cross-platform binaries before bazel cross-compilation is available, the
+repository is setup to support common go tooling, so we should be able to devise
+a process for doing so.
+
+### Version skew
+
+There are several participants in the cluster registry ecosystem whose versions
+will be conceptually able to float relative to each other:
+
+- The cluster registry API server
+- The host cluster's API server
+- `kubectl`
+
+We will need to define the version skew restraints between these components and
+ensure that our testing validates key skews that we care about.
+
+## Test strategy
+
+The cluster registry is a simple component, and so should be able to be tested
+extensively without too much difficulty. The API machinery code is already
+tested by its owning teams, and since the cluster registry is a straightforward
+API server, it should not require much testing of its own. The bulk of the tests
+written will be integration tests, to ensure that it runs correctly in
+aggregated and independent modes in a Kubernetes cluster; to verify that various
+versions of kubectl can interact with it; and to verify that it can be upgraded
+safely. A full testing strategy is a requirement for a GA launch; we expect
+development of a test suite to be an ongoing effort in the early stages of
+development.
+
+The command-line tool will require testing. It should be E2E tested against
+recent versions of Kubernetes, to ensure that a simple cluster registry can be
+created in a Kubernetes cluster. The multiplicity of configuration options it
+provides cannot conveniently be tested in E2E tests, and so will be validated in
+unit tests.
+
+## Milestones and timelines
+
+### Alpha (targeting late Q4 '17)
+
+- Test suite is running on each PR, with reasonable unit test coverage and
+ minimal integration/E2E testing
+- Repository processes (OWNERship, who can merge, review lag standards,
+ project planning, issue triaging, etc.) established
+- Contributor documentation written
+- User documentation drafted
+- Cluster registry API also alpha
+- All Alpha technical requirements met
+
+### Beta (targeting mid Q1 '18)
+
+- Full suite of integration/E2E tests running on each PR
+- API is beta
+- Preparatory tasks for GA
+- All Beta technical requirements met
+- User documentation complete and proofed, content-wise
+- Enough feedback solicited from users, or inferred from download
+ statistics/repository issues
+
+### Stable (targeting mid Q2 '18)
+
+- Fully fleshed-out user documentation
+- User documentation is published in a finalized location
+- First version of API is GA
+- Documented upgrade test procedure, with appropriate test tooling implemented
+- Plan for and documentation about Kubernetes version support (e.g., which
+ versions of Kubernetes a cluster registry can be a delegated API server
+ with)
+- Releases for all platforms
+- Well-defined and documented release process
+
+### Later
+
+- Documented approach to doing a HA deployment
+- Work on Later technical requirements
+
+## Questions
+
+- How does RBAC work in this case?
+- Generated code. How does it work? How do we make it clean? Do we check it
+ in?
+- Do we need/want an example client that uses the cluster registry for some
+ basic operations? A demo, as it were?
+- Labels vs. fields. Versioning? Do we graduate? Do we define a policy for
+ label names and have the server validate it?
+- Where should the user documentation for the cluster registry live?
+ kubernetes.io doesn't quite seem appropriate, but perhaps that's the right
+ place?
+- Is there a reasonable way to support a CRD-based implementation? Should this
+ project support it directly, or work to not prevent it from working, or
+ ignore it?
+
+## History
+
+| Date | Details |
+|--|--|
+| 10/9/17 | Initial draft |
+| 10/16/17 | Minor edits based on comments |
+| 10/19/17 | Added section specifically about version skew; change P0/P1/P2 to alpha/beta/stable; added some milestone requirements |
+| 11/2/17 | Resolved most comments and added minor tweaks to text in order to do so |