author Kubernetes Submit Queue <k8s-merge-robot@users.noreply.github.com> 2016-09-28 08:11:43 -0700
committer GitHub <noreply@github.com> 2016-09-28 08:11:43 -0700
commit ff8bd8019a0d555d9b9b5c3b8a701b814e637575
tree ff1cbee5d45aeb0484e61c7ad9759bf2264f99d4
parent b36e717701d31b6f788142d2969edf00f966be85
parent f760ac6f125585678b12a3f2bc623c9314b45e59
Merge pull request #33571 from pmorie/selinux-docs
Automatic merge from submit-queue
Move SELinux proposal to docs/design
Moves the proposal into the docs/design directory, as should have happened long ago.
-rw-r--r-- selinux.md 348
1 files changed, 0 insertions, 348 deletions
diff --git a/selinux.md b/selinux.md
deleted file mode 100644
index 7865263e..00000000
--- a/selinux.md
+++ /dev/null
@@ -1,348 +0,0 @@
-<!-- BEGIN MUNGE: UNVERSIONED_WARNING -->
-
-<!-- BEGIN STRIP_FOR_RELEASE -->
-
-<img src="http://kubernetes.io/kubernetes/img/warning.png" alt="WARNING"
- width="25" height="25">
-
-<h2>PLEASE NOTE: This document applies to the HEAD of the source tree</h2>
-
-If you are using a released version of Kubernetes, you should
-refer to the docs that go with that version.
-
-<!-- TAG RELEASE_LINK, added by the munger automatically -->
-<strong>
-The latest release of this document can be found
-[here](http://releases.k8s.io/release-1.4/docs/proposals/selinux.md).
-
-Documentation for other releases can be found at
-[releases.k8s.io](http://releases.k8s.io).
-</strong>
---
-
-<!-- END STRIP_FOR_RELEASE -->
-
-<!-- END MUNGE: UNVERSIONED_WARNING -->
-
-## Abstract
-
-A proposal for enabling containers in a pod to share volumes using a pod-level SELinux context.
-
-## Motivation
-
-Many users have a requirement to run pods on systems that have SELinux enabled. Volume plugin
-authors should not have to explicitly account for SELinux except for volume types that require
-special handling of the SELinux context during setup.
-
-Currently, each container in a pod has an SELinux context. This is not an ideal factoring for
-sharing resources using SELinux.
-
-We propose a pod-level SELinux context and a mechanism to support SELinux labeling of volumes in a
-generic way.
-
-Goals of this design:
-
-1. Describe the problems with a container-level SELinux context
-2. Articulate a design for generic SELinux support for volumes using a pod-level SELinux context
- which is backward compatible with the v1.0.0 API
-
-## Constraints and Assumptions
-
-1. We will not support securing containers within a pod from one another
-2. Volume plugins should not have to handle setting SELinux context on volumes
-3. We will not deal with shared storage
-
-## Current State Overview
-
-### Docker
-
-Docker uses a base SELinux context and calculates a unique MCS label per container. The SELinux
-context of a container can be overridden with the `SecurityOpt` API, which allows setting the
-different parts of the SELinux context individually.
-
-Docker can relabel bind-mounts with a usable SELinux context and supports two different
-use cases:
-
-1. The `:Z` bind-mount flag, which tells Docker to relabel a bind-mount with the container's
- SELinux context
-2. The `:z` bind-mount flag, which tells Docker to relabel a bind-mount with the container's
- SELinux context, but remove the MCS labels, making the volume shareable between containers
-
-We should avoid using the `:z` flag, because it relaxes the SELinux context so that any container
-(from an SELinux standpoint) can use the volume.
-
-### rkt
-
-rkt currently reads the base SELinux context to use from `/etc/selinux/*/contexts/lxc_contexts`
-and allocates a unique MCS label per pod.
-
-### Kubernetes
-
-There is a [proposed change](https://github.com/kubernetes/kubernetes/pull/9844) to the
-EmptyDir plugin that adds SELinux relabeling capabilities to that plugin, which is also carried as a
-patch in [OpenShift](https://github.com/openshift/origin). It is preferable to solve the general
-problem of handling SELinux in Kubernetes rather than to merge this PR.
-
-A new `PodSecurityContext` type has been added that carries information about security attributes
-that apply to the entire pod, and therefore to all containers in the pod. See:
-
-1. [Skeletal implementation](https://github.com/kubernetes/kubernetes/pull/13939)
-1. [Proposal for inlining container security fields](https://github.com/kubernetes/kubernetes/pull/12823)
-
-## Use Cases
-
-1. As a cluster operator, I want to support securing pods from one another using SELinux when
- SELinux integration is enabled in the cluster
-2. As a user, I want volume sharing to work correctly amongst containers in pods
-
-#### SELinux context: pod- or container- level?
-
-Currently, SELinux context is specifiable only at the container level. This is an inconvenient
-factoring for sharing volumes and other SELinux-secured resources between containers because there
-is no way in SELinux to share resources between processes with different MCS labels except to
-remove MCS labels from the shared resource. This is a big security risk: _any container_ in the
-system can work with a resource that has no MCS labels and otherwise shares the container's SELinux
-context. Since we are also not interested in isolating containers in a pod from one another, the
-SELinux context should be shared by all containers in a pod. This facilitates both isolation from
-the containers in other pods and sharing of resources amongst all the containers of a pod.
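
The MCS argument above can be illustrated with a minimal sketch. It assumes that all levels use the same sensitivity (`s0`), that categories are written as a plain comma-separated list (no `c0.c5` ranges), and that access requires the process's category set to be a superset of the resource's. The function names `categories` and `dominates` are hypothetical and are not part of Kubernetes or Docker.

```go
package main

import (
	"fmt"
	"strings"
)

// categories extracts the MCS category set from a level such as "s0:c1,c2".
// Simplification: category ranges like "c0.c5" are not expanded.
func categories(level string) map[string]bool {
	set := map[string]bool{}
	parts := strings.SplitN(level, ":", 2)
	if len(parts) < 2 || parts[1] == "" {
		return set // no categories, e.g. "s0"
	}
	for _, c := range strings.Split(parts[1], ",") {
		set[c] = true
	}
	return set
}

// dominates reports whether every category on the resource is also held by
// the process. Sensitivities are assumed equal and are not compared.
func dominates(processLevel, resourceLevel string) bool {
	have := categories(processLevel)
	for c := range categories(resourceLevel) {
		if !have[c] {
			return false
		}
	}
	return true
}

func main() {
	// Two containers with different MCS labels cannot share a labeled volume.
	fmt.Println(dominates("s0:c3,c4", "s0:c1,c2"))
	// Stripping the categories from the volume opens it to every container.
	fmt.Println(dominates("s0:c3,c4", "s0"))
	// A shared pod-level label lets all containers in the pod use the volume.
	fmt.Println(dominates("s0:c1,c2", "s0:c1,c2"))
}
```

Under these assumptions, the only label that two differently-labeled containers can both use is one with no categories, which is also usable by every other container on the node. A single pod-level label avoids that trade-off.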
-
-#### Volumes
-
-Kubernetes volumes can be divided into two broad categories:
-
-1. Unshared storage:
- 1. Volumes created by the kubelet in a host directory: empty directory, git repo, secret,
- downward API. All volumes in this category delegate to `EmptyDir` for their underlying
- storage.
- 2. Volumes based on network block devices: AWS EBS, iSCSI, RBD, etc., *when used exclusively
- by a single pod*.
-2. Shared storage:
- 1. `hostPath` is shared storage because it is necessarily used by a container and the host
- 2. Network file systems such as NFS, Glusterfs, Cephfs, etc.
- 3. Block device based volumes in `ReadOnlyMany` or `ReadWriteMany` modes are shared because
- they may be used simultaneously by multiple pods.
-
-For unshared storage, SELinux handling for most volumes can be generalized into running a `chcon`
-operation on the volume directory after running the volume plugin's `Setup` function. For these
-volumes, the Kubelet can perform the `chcon` operation and keep SELinux concerns out of the volume
-plugin code. Some volume plugins may need to use the SELinux context during a mount operation in
-certain cases. To account for this, our design must have a way for volume plugins to state that
-a particular volume should or should not receive generic label management.
-
-For shared storage, the picture is murkier. Labels for existing shared storage will be managed
-outside Kubernetes and administrators will have to set the SELinux context of pods correctly.
-The problem of solving SELinux label management for new shared storage is outside the scope for
-this proposal.
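
As a rough sketch of the generic `chcon` step for unshared storage, the following shows how a full SELinux context string and the arguments for a recursive `chcon` could be assembled. The local `SELinuxOptions` struct is a stand-in that mirrors the `User`, `Role`, `Type` and `Level` fields of the API type; the helper names, the example label values and the volume path are assumptions for illustration only, and nothing is executed.

```go
package main

import (
	"fmt"
	"strings"
)

// SELinuxOptions is a local stand-in mirroring the fields of the API type.
type SELinuxOptions struct {
	User  string
	Role  string
	Type  string
	Level string
}

// Context joins the four parts into the "user:role:type:level" form.
func (o SELinuxOptions) Context() string {
	return strings.Join([]string{o.User, o.Role, o.Type, o.Level}, ":")
}

// chconArgs returns the arguments for a recursive relabel of dir.
// Hypothetical helper: the kubelet would run this after the plugin's setup.
func chconArgs(o SELinuxOptions, dir string) []string {
	return []string{"-R", o.Context(), dir}
}

func main() {
	// Illustrative values only.
	opts := SELinuxOptions{
		User:  "system_u",
		Role:  "object_r",
		Type:  "svirt_sandbox_file_t",
		Level: "s0:c1,c2",
	}
	args := chconArgs(opts, "/var/lib/kubelet/pods/example/volumes/data")
	fmt.Println("chcon", strings.Join(args, " "))
}
```

Keeping this step in one place is what lets volume plugins stay free of SELinux handling; plugins that must apply the context at mount time would opt out of it.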
-
-## Analysis
-
-The system needs to be able to:
-
-1. Model correctly which volumes require SELinux label management
-1. Relabel volumes with the correct SELinux context when required
-
-### Modeling whether a volume requires label management
-
-#### Unshared storage: volumes derived from `EmptyDir`
-
-Empty dir and volumes derived from it are created by the system, so Kubernetes must always ensure
-that the ownership and SELinux context (when relevant) are set correctly for the volume to be
-usable.
-
-#### Unshared storage: network block devices
-
-Volume plugins based on network block devices such as AWS EBS and RBD can be treated the same way
-as local volumes. Since inodes are written to these block devices in the same way as `EmptyDir`
-volumes, permissions and ownership can be managed on the client side by the Kubelet when used
-exclusively by one pod. When the volumes are used outside of a persistent volume, or with the
-`ReadWriteOnce` mode, they are effectively unshared storage.
-
-When used by multiple pods, there are many additional use-cases to analyze before we can be
-confident that we can support SELinux label management robustly with these file systems. The right
-design is one that makes it easy to experiment and develop support for ownership management with
-volume plugins to enable developers and cluster operators to continue exploring these issues.
-
-#### Shared storage: hostPath
-
-The `hostPath` volume should only be used by effective-root users, and the permissions of paths
-exposed into containers via `hostPath` volumes should always be managed by the cluster operator. If
-the Kubelet managed the SELinux labels for `hostPath` volumes, a user who could create a `hostPath`
-volume could effect changes in the state of arbitrary paths within the host's filesystem. This
-would be a severe security risk, so we will consider `hostPath` a corner case for which the kubelet
-should never perform label management.
-
-#### Shared storage: network
-
-Ownership management of shared storage is a complex topic. SELinux labels for existing shared
-storage will be managed externally from Kubernetes. For this case, our API should make it simple to
-express whether a particular volume should have these concerns managed by Kubernetes.
-
-We will not attempt to address the concerns of new shared storage in this proposal.
-
-When a network block device is used as a persistent volume in `ReadWriteMany` or `ReadOnlyMany`
-modes, it is shared storage, and thus outside the scope of this proposal.
-
-#### API requirements
-
-From the above, we know that label management must be applied:
-
-1. To some volume types always
-2. To some volume types never
-3. To some volume types *sometimes*
-
-Volumes should be relabeled with the correct SELinux context. Docker has this capability today; it
-is desirable for other container runtime implementations to provide similar functionality.
-
-Relabeling should be an optional aspect of a volume plugin to accommodate:
-
-1. volume types for which generalized relabeling support is not sufficient
-2. testing for each volume plugin individually
-
-## Proposed Design
-
-Our design should minimize the SELinux labeling code required in the Kubelet and volume
-plugins.
-
-### Deferral: MCS label allocation
-
-Our short-term goal is to facilitate volume sharing and isolation with SELinux and expose the
-primitives for higher level composition; making these automatic is a longer-term goal. Allocating
-groups and MCS labels are fairly complex problems in their own right, and so our proposal will not
-encompass either of these topics. There are several problems that the solution for allocation
-depends on:
-
-1. Users and groups in Kubernetes
-2. General auth policy in Kubernetes
-3. [security policy](https://github.com/kubernetes/kubernetes/pull/7893)
-
-### API changes
-
-The [inline container security attributes PR (12823)](https://github.com/kubernetes/kubernetes/pull/12823)
-adds a `pod.Spec.SecurityContext.SELinuxOptions` field. The change to the API in this proposal is
-the addition of the following semantics to this field:
-
-* When the `pod.Spec.SecurityContext.SELinuxOptions` field is set, volumes that support SELinux
-label management in the Kubelet have their SELinux context set from this field.
-
-```go
-package api
-
-type PodSecurityContext struct {
- // SELinuxOptions captures the SELinux context for all containers in a Pod. If a container's
- // SecurityContext.SELinuxOptions field is set, that setting takes precedence for that container.
- //
- // This field will be used to set the SELinux context of volumes that support SELinux label
- // management by the kubelet.
- SELinuxOptions *SELinuxOptions `json:"seLinuxOptions,omitempty"`
-}
-```
-
-The V1 API is extended with the same semantics:
-
-```go
-package v1
-
-type PodSecurityContext struct {
- // SELinuxOptions captures the SELinux context for all containers in a Pod. If a container's
- // SecurityContext.SELinuxOptions field is set, that setting takes precedence for that container.
- //
- // This field will be used to set the SELinux context of volumes that support SELinux label
- // management by the kubelet.
- SELinuxOptions *SELinuxOptions `json:"seLinuxOptions,omitempty"`
-}
-```
-
-#### API backward compatibility
-
-Old pods that do not have the `pod.Spec.SecurityContext.SELinuxOptions` field set will not receive
-SELinux label management for their volumes. This is acceptable since old clients won't know about
-this field and won't have any expectation of their volumes being managed this way.
-
-The existing backward compatibility semantics for SELinux do not change at all with this proposal.
-
-### Kubelet changes
-
-The Kubelet should be modified to perform SELinux label management when required for a volume. The
-criteria to activate the kubelet SELinux label management for volumes are:
-
-1. SELinux integration is enabled in the cluster
-2. SELinux is enabled on the node
-3. The `pod.Spec.SecurityContext.SELinuxOptions` field is set
-4. The volume plugin supports SELinux label management
-
-The `volume.Mounter` interface should have a new method added that indicates whether the plugin
-supports SELinux label management:
-
-```go
-package volume
-
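-type Mounter interface {
- // other methods omitted
-
- // SupportsSELinux reports whether the kubelet may perform generic
- // SELinux label management for this volume.
- SupportsSELinux() bool
-}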
-```
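
The four activation criteria and the `SupportsSELinux` method can be combined into a single check. This is a minimal sketch rather than kubelet code: `shouldManageLabels`, `fakeMounter` and the two boolean inputs are hypothetical names, and the `SELinuxOptions` and `Mounter` types are local stand-ins for the real API and volume types.

```go
package main

import "fmt"

// SELinuxOptions is a local stand-in for the API type.
type SELinuxOptions struct {
	User, Role, Type, Level string
}

// Mounter is a local stand-in exposing only the proposed method.
type Mounter interface {
	SupportsSELinux() bool
}

// fakeMounter is a test double whose value is its SupportsSELinux answer.
type fakeMounter bool

func (f fakeMounter) SupportsSELinux() bool { return bool(f) }

// shouldManageLabels applies the four criteria in order: cluster-level
// integration, node-level SELinux, a pod-level context, and plugin support.
func shouldManageLabels(clusterEnabled, nodeEnabled bool, opts *SELinuxOptions, m Mounter) bool {
	return clusterEnabled && nodeEnabled && opts != nil && m != nil && m.SupportsSELinux()
}

func main() {
	opts := &SELinuxOptions{Level: "s0:c1,c2"}
	// An emptyDir-like plugin that opts in.
	fmt.Println(shouldManageLabels(true, true, opts, fakeMounter(true)))
	// A hostPath-like plugin that opts out.
	fmt.Println(shouldManageLabels(true, true, opts, fakeMounter(false)))
	// An old pod without a pod-level SELinux context.
	fmt.Println(shouldManageLabels(true, true, nil, fakeMounter(true)))
}
```

Because a missing pod-level context short-circuits the check, pods created by old clients keep their current behavior.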
-
-Individual volume plugins are responsible for correctly reporting whether they support label
-management in the kubelet. In the first round of work, only `hostPath` and `emptyDir` and its
-derivations will be tested with label management support:
-
-| Plugin Name | SupportsSELinux |
-|-------------------------|-------------------------------|
-| `hostPath` | false |
-| `emptyDir` | true |
-| `gitRepo` | true |
-| `secret` | true |
-| `downwardAPI` | true |
-| `gcePersistentDisk` | false |
-| `awsElasticBlockStore` | false |
-| `nfs` | false |
-| `iscsi` | false |
-| `glusterfs` | false |
-| `persistentVolumeClaim` | depends on underlying volume and PV mode |
-| `rbd` | false |
-| `cinder` | false |
-| `cephfs` | false |
-
-Ultimately, the matrix is expected to look like:
-
-| Plugin Name | SupportsSELinux |
-|-------------------------|-------------------------------|
-| `hostPath` | false |
-| `emptyDir` | true |
-| `gitRepo` | true |
-| `secret` | true |
-| `downwardAPI` | true |
-| `gcePersistentDisk` | true |
-| `awsElasticBlockStore` | true |
-| `nfs` | false |
-| `iscsi` | true |
-| `glusterfs` | false |
-| `persistentVolumeClaim` | depends on underlying volume and PV mode |
-| `rbd` | true |
-| `cinder` | false |
-| `cephfs` | false |
-
-In order to limit the amount of SELinux label management code in Kubernetes, we propose that it be a
-function of the container runtime implementations. Initially, we will modify the docker runtime
-implementation to correctly set the `:Z` flag on the appropriate bind-mounts in order to accomplish
-generic label management for docker containers.
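
A minimal sketch of how a runtime integration might append the `:Z` flag when building a Docker bind-mount string follows. The `host:container[:options]` form with comma-separated options is Docker's; the `bindSpec` helper, its parameters and the example paths are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// bindSpec builds a Docker-style bind string "host:container[:options]".
// relabel adds "Z" so Docker relabels the mount with the container's
// private SELinux context; readOnly adds "ro".
func bindSpec(hostPath, containerPath string, readOnly, relabel bool) string {
	var opts []string
	if readOnly {
		opts = append(opts, "ro")
	}
	if relabel {
		opts = append(opts, "Z")
	}
	spec := hostPath + ":" + containerPath
	if len(opts) > 0 {
		spec += ":" + strings.Join(opts, ",")
	}
	return spec
}

func main() {
	// A volume whose plugin supports label management gets ":Z".
	fmt.Println(bindSpec("/var/lib/kubelet/pods/example/volumes/data", "/data", false, true))
	// A hostPath-style mount is passed through without relabeling.
	fmt.Println(bindSpec("/etc/example", "/etc/example", true, false))
}
```

The `relabel` argument would be driven by the activation criteria, so a mount is only relabeled when the plugin opts in and the pod carries an SELinux context.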
-
-Volume types that require SELinux context information at mount time must be given, and must
-respect, the label management enablement setting for that volume type. The proposed `VolumeConfig`
-mechanism will be used to carry information about label management enablement to the volume
-plugins that have to manage labels individually.
-
-This allows the volume plugins to determine when they do and don't want this type of support from
-the Kubelet, and allows the criteria each plugin uses to evolve without changing the Kubelet.
-