| author | Paul Morie <pmorie@redhat.com> | 2015-12-02 23:56:13 -0500 |
|---|---|---|
| committer | Paul Morie <pmorie@redhat.com> | 2015-12-14 12:38:44 -0500 |
| commit | cd70be19ddd8803c66bc237a443eae9ceead286b (patch) | |
| tree | 4270d7ab0128f2be9ac8ecfb24ff29a3e1e4cd1a | |
| parent | 3e67cf00c23020d0af3618a6566955fb44b85187 (diff) | |
Proposal: internalize ownership management of volumes into plugins
| -rw-r--r-- | volume-ownership-management.md | 141 |
1 files changed, 141 insertions, 0 deletions
diff --git a/volume-ownership-management.md b/volume-ownership-management.md
new file mode 100644
index 00000000..8dd4b8bb
--- /dev/null
+++ b/volume-ownership-management.md
@@ -0,0 +1,141 @@
+<!-- BEGIN MUNGE: UNVERSIONED_WARNING -->
+
+<!-- BEGIN STRIP_FOR_RELEASE -->
+
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+
+<h2>PLEASE NOTE: This document applies to the HEAD of the source tree</h2>
+
+If you are using a released version of Kubernetes, you should
+refer to the docs that go with that version.
+
+<strong>
+The latest release of this document can be found
+[here](http://releases.k8s.io/release-1.1/docs/proposals/volume-ownership-management.md).
+
+Documentation for other releases can be found at
+[releases.k8s.io](http://releases.k8s.io).
+</strong>
+--
+
+<!-- END STRIP_FOR_RELEASE -->
+
+<!-- END MUNGE: UNVERSIONED_WARNING -->
+
+## Volume plugins and idempotency
+
+Currently, volume plugins have a `SetUp` method which is called in the context of a higher-level
+workflow within the kubelet which has externalized the problem of managing the ownership of volumes.
+This design has a number of drawbacks that can be mitigated by completely internalizing all concerns
+of volume setup behind the volume plugin `SetUp` method.
+
+### Known issues with current externalized design
+
+1. Ownership management is currently applied repeatedly, which breaks packages that require
+   special permissions in order to work correctly
+2. There is a gap between files being mounted/created by volume plugins and when their ownership
+   is set correctly; race conditions exist around this
+3. Solving the correct application of ownership management in an externalized model is difficult
+   and makes it clear that a transaction boundary is being broken by the externalized design
+
+### Additional issues with externalization
+
+Fully externalizing any one concern of volumes is difficult for a number of reasons:
+
+1. Many types of idempotence checks exist, and are used in a variety of combinations and orders
+2. Workflow in the kubelet becomes much more complex to handle:
+   1. composition of plugins
+   2. correct timing of application of ownership management
+   3. callback to volume plugins when we know the whole `SetUp` flow is complete and correct
+   4. callback to touch sentinel files
+   5. etc.
+3. We want to support fully external volume plugins, which would require complex orchestration
+   or a chatty remote API
+
+## Proposed implementation
+
+Since all of the ownership information is known in advance of the call to the volume plugin `SetUp`
+method, we can easily internalize these concerns into the volume plugins and pass the ownership
+information to `SetUp`.
+
+The volume `Builder` interface's `SetUp` method changes to accept the group that should own the
+volume. Plugins become responsible for ensuring that the correct group is applied. The volume
+`Attributes` struct can be modified to remove the `SupportsOwnershipManagement` field.
+
+```go
+package volume
+
+type Builder interface {
+	// other methods omitted
+
+	// SetUp prepares and mounts/unpacks the volume to a self-determined
+	// directory path and returns an error. The group ID that should own the volume
+	// is passed as a parameter. Plugins may choose to ignore the group ID directive
+	// in the event that they do not support it (example: NFS). A group ID of -1
+	// indicates that the group ownership of the volume should not be modified by the plugin.
+	//
+	// SetUp will be called multiple times and should be idempotent.
+	SetUp(gid int64) error
+}
+```
+
+Each volume plugin will have to change to support the new `SetUp` signature. The existing
+ownership management code will be refactored into a library that volume plugins can use:
+
+```go
+package volume
+
+func ManageOwnership(path string, fsGroup int64) error {
+	// 1. recursive chown of path
+	// 2. make path +setgid
+}
+```
+
+The workflow from the Kubelet's perspective for handling volume setup and refresh becomes:
+
+```go
+// go-ish pseudocode
+func mountExternalVolumes(pod) error {
+	podVolumes := make(kubecontainer.VolumeMap)
+	for i := range pod.Spec.Volumes {
+		volSpec := &pod.Spec.Volumes[i]
+		var fsGroup int64 = 0
+		if pod.Spec.SecurityContext != nil &&
+			pod.Spec.SecurityContext.FSGroup != nil {
+			fsGroup = *pod.Spec.SecurityContext.FSGroup
+		} else {
+			fsGroup = -1
+		}
+
+		// Try to use a plugin for this volume.
+		plugin := volume.NewSpecFromVolume(volSpec)
+		builder, err := kl.newVolumeBuilderFromPlugins(plugin, pod)
+		if err != nil {
+			return err
+		}
+		if builder == nil {
+			return errUnsupportedVolumeType
+		}
+
+		err = builder.SetUp(fsGroup)
+		if err != nil {
+			return err
+		}
+	}
+
+	return nil
+}
+```
+
+<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
+[]()
+<!-- END MUNGE: GENERATED_ANALYTICS -->
