author  k8s-ci-robot <k8s-ci-robot@users.noreply.github.com>  2018-08-24 10:23:52 -0700
committer  GitHub <noreply@github.com>  2018-08-24 10:23:52 -0700
commit5bda15447e5238aa78e02265fef4004ee6d8a467 (patch)
tree2a7cb415f3b806a505c8f222d53dba88fbe1ea00
parentce3378e89806d897e827de75cdac3faa9cbee3fd (diff)
parentc234ca8c12cb66cbfe1d13a41e065d8329943fe3 (diff)
Merge pull request #2276 from nokia/k8s-sctp-kep-doc
New KEP for SCTP support feature
-rw-r--r--  keps/sig-network/0015-20180614-SCTP-support.md  292
1 file changed, 292 insertions, 0 deletions
diff --git a/keps/sig-network/0015-20180614-SCTP-support.md b/keps/sig-network/0015-20180614-SCTP-support.md
new file mode 100644
index 00000000..370ac92e
--- /dev/null
+++ b/keps/sig-network/0015-20180614-SCTP-support.md
@@ -0,0 +1,292 @@
+---
+kep-number: 15
+title: SCTP support
+authors:
+ - "@janosi"
+owning-sig: sig-network
+participating-sigs:
+ - sig-network
+reviewers:
+ - "@thockin"
+approvers:
+ - "@thockin"
+editor: TBD
+creation-date: 2018-06-14
+last-updated: 2018-06-22
+status: provisional
+see-also:
+ - PR64973
+replaces:
+superseded-by:
+---
+
+# SCTP support
+
+## Table of Contents
+
+* [Table of Contents](#table-of-contents)
+* [Summary](#summary)
+* [Motivation](#motivation)
+ * [Goals](#goals)
+ * [Non-Goals](#non-goals)
+* [Proposal](#proposal)
+ * [User Stories [optional]](#user-stories-optional)
+    * [Service with SCTP and Virtual IP](#service-with-sctp-and-virtual-ip)
+    * [Headless Service with SCTP](#headless-service-with-sctp)
+    * [Service with SCTP without selector](#service-with-sctp-without-selector)
+    * [SCTP as container port protocol in Pod definition](#sctp-as-container-port-protocol-in-pod-definition)
+    * [SCTP port accessible from outside the cluster](#sctp-port-accessible-from-outside-the-cluster)
+    * [NetworkPolicy with SCTP](#networkpolicy-with-sctp)
+    * [Userspace SCTP stack](#userspace-sctp-stack)
+ * [Implementation Details/Notes/Constraints [optional]](#implementation-detailsnotesconstraints-optional)
+ * [Risks and Mitigations](#risks-and-mitigations)
+* [Graduation Criteria](#graduation-criteria)
+* [Implementation History](#implementation-history)
+* [Drawbacks [optional]](#drawbacks-optional)
+* [Alternatives [optional]](#alternatives-optional)
+
+## Summary
+
+The goal of the SCTP support feature is to enable the usage of the SCTP protocol in Kubernetes [Service][], [NetworkPolicy][], and [ContainerPort][] as an additional protocol value option besides the current TCP and UDP options.
+SCTP is an IETF protocol specified in [RFC4960][], and it is used widely in telecommunications network stacks.
+Once SCTP support is added as a new protocol option, applications that require SCTP as the L4 protocol on their interfaces can be deployed on Kubernetes clusters in a more straightforward way. For example, they can use the native kube-dns based service discovery, and their communication can be controlled with native NetworkPolicies.
+
+[Service]: https://kubernetes.io/docs/concepts/services-networking/service/
+[NetworkPolicy]: https://kubernetes.io/docs/concepts/services-networking/network-policies/
+[ContainerPort]: https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/#exposing-pods-to-the-cluster
+[RFC4960]: https://tools.ietf.org/html/rfc4960
+
+
+## Motivation
+
+SCTP is a widely used protocol in telecommunications. It would ease the management and execution of telecommunication applications on Kubernetes if SCTP were added as a protocol option to Kubernetes.
+
+### Goals
+
+Add SCTP support to Kubernetes ContainerPort, Service, and NetworkPolicy, so applications running in pods can use the native kube-dns based service discovery for SCTP based services, and their communication can be controlled with native NetworkPolicies.
+
+It is also a goal to enable ingress SCTP connections from clients outside the Kubernetes cluster, and to enable egress SCTP connections to servers outside the Kubernetes cluster.
+
+### Non-Goals
+
+It is not a goal here to add SCTP support to load balancers that are provided by cloud providers. The Kubernetes-side implementation will not restrict the usage of SCTP as the protocol for Services with type=LoadBalancer, but we will not implement SCTP support in the cloud-specific load balancer implementations.
+
+It is not a goal to support multi-homed SCTP associations. Such support also depends on the ability to manage multiple IP addresses for a pod, and in the case of Services with ClusterIP or NodePort, multi-homed associations would also require NAT support for multi-homed associations in the SCTP related netfilter conntrack modules.
+
+## Proposal
+
+### User Stories [optional]
+
+#### Service with SCTP and Virtual IP
+As a user of Kubernetes I want to define Services with Virtual IPs for my applications that use SCTP as L4 protocol on their interfaces, so client applications can use the services of my applications on top of SCTP via that Virtual IP.
+
+Example:
+```
+kind: Service
+apiVersion: v1
+metadata:
+ name: my-service
+spec:
+ selector:
+ app: MyApp
+ ports:
+ - protocol: SCTP
+ port: 80
+ targetPort: 9376
+```
+
+#### Headless Service with SCTP
+As a user of Kubernetes I want to define headless Services for my applications that use SCTP as L4 protocol on their interfaces, so client applications can discover my applications in kube-dns, or via any other service discovery methods that get information about endpoints via the Kubernetes API.
+
+Example:
+```
+kind: Service
+apiVersion: v1
+metadata:
+ name: my-service
+spec:
+ selector:
+ app: MyApp
+ clusterIP: "None"
+ ports:
+ - protocol: SCTP
+ port: 80
+ targetPort: 9376
+```
+#### Service with SCTP without selector
+As a user of Kubernetes I want to define Services without selector for my applications that use SCTP as L4 protocol on their interfaces, so I can implement my own service controllers if I want to extend the basic functionality of Kubernetes.
+
+Example:
+```
+kind: Service
+apiVersion: v1
+metadata:
+ name: my-service
+spec:
+ clusterIP: "None"
+ ports:
+ - protocol: SCTP
+ port: 80
+ targetPort: 9376
+```
+
+#### SCTP as container port protocol in Pod definition
+As a user of Kubernetes I want to define a hostPort for the SCTP based interfaces of my applications.
+
+Example:
+```
+apiVersion: v1
+kind: Pod
+metadata:
+ name: mypod
+spec:
+ containers:
+ - name: container-1
+ image: mycontainerimg
+ ports:
+ - name: diameter
+ protocol: SCTP
+ containerPort: 80
+ hostPort: 80
+```
+
+#### SCTP port accessible from outside the cluster
+
+As a user of Kubernetes I want to have the option that client applications residing outside of the cluster can access my SCTP based services that run in the cluster.
+
+Example:
+```
+kind: Service
+apiVersion: v1
+metadata:
+ name: my-service
+spec:
+ type: NodePort
+ selector:
+ app: MyApp
+ ports:
+ - protocol: SCTP
+ port: 80
+ targetPort: 9376
+```
+
+Example:
+```
+kind: Service
+apiVersion: v1
+metadata:
+ name: my-service
+spec:
+ selector:
+ app: MyApp
+ ports:
+ - protocol: SCTP
+ port: 80
+ targetPort: 9376
+ externalIPs:
+ - 80.11.12.10
+```
+
+#### NetworkPolicy with SCTP
+As a user of Kubernetes I want to define NetworkPolicies for my applications that use SCTP as L4 protocol on their interfaces, so the network plugins that support SCTP can control the accessibility of my applications on the SCTP based interfaces, too.
+
+Example:
+```
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+ name: myservice-network-policy
+ namespace: myapp
+spec:
+ podSelector:
+ matchLabels:
+ role: myservice
+ policyTypes:
+ - Ingress
+ ingress:
+ - from:
+ - ipBlock:
+ cidr: 172.17.0.0/16
+ except:
+ - 172.17.1.0/24
+ - namespaceSelector:
+ matchLabels:
+ project: myproject
+ - podSelector:
+ matchLabels:
+ role: myclient
+ ports:
+ - protocol: SCTP
+ port: 7777
+```
+#### Userspace SCTP stack
+As a user of Kubernetes I want to deploy and run my applications that use a userspace SCTP stack, and at the same time I want to define SCTP Services in the same cluster. I use a userspace SCTP stack because of the limitations of the kernel's SCTP support. For example: it's not possible to write an SCTP server that proxies/filters arbitrary SCTP streams using the sockets APIs and kernel SCTP.
+
+### Implementation Details/Notes/Constraints [optional]
+
+#### SCTP in Services
+##### Kubernetes API modification
+The Kubernetes API modification for Services is straightforward: SCTP is added as an additional allowed value of the Service port protocol field, next to TCP and UDP.
+
+##### Services with host level ports
+
+The kube-proxy and the kubelet start listening on the defined TCP or UDP port in the case of Services with ClusterIP, NodePort, or externalIP, and in the case of containers with a hostPort defined. The goal of this is to reserve the port in question so that no other host level process can use it by accident. When it comes to SCTP, the agreement is that we do not follow this pattern. That is, Kubernetes will not listen on host level ports with SCTP as the protocol. The reason for this decision is that the current TCP and UDP related implementation is not perfect either: it has known gaps in some use cases, and in those cases this listening is not started. But no one has complained about those gaps, so most probably this port reservation via listening is not needed at all.
+
+##### Services with type=LoadBalancer
+
+For Services with type=LoadBalancer we expect that the cloud provider's load balancer API client in Kubernetes rejects requests with an unsupported protocol.
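+
+As an illustration only (a sketch; whether the load balancer is actually provisioned depends on the cloud provider's SCTP support), such a Service could look like:
+
+```
+kind: Service
+apiVersion: v1
+metadata:
+  name: my-service
+spec:
+  # The Kubernetes API accepts SCTP here; the cloud load balancer may still reject it
+  type: LoadBalancer
+  selector:
+    app: MyApp
+  ports:
+  - protocol: SCTP
+    port: 80
+    targetPort: 9376
+```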
+
+#### SCTP support in Kube DNS
+Kube DNS shall support SRV records with "_sctp" as the "proto" value. According to our investigations, the DNS controller is flexible in this respect and can create SRV records with any protocol name, so no additional implementation is needed to achieve this goal.
+
+Example:
+
+```
+_diameter._sctp.my-service.default.svc.cluster.local. 30 IN SRV 10 100 1234 my-service.default.svc.cluster.local.
+```
+#### SCTP in the Pod's ContainerPort
+The Kubernetes API modification for the Pod is straightforward: SCTP is added as an additional allowed value of the ContainerPort protocol field.
+
+We support SCTP as the protocol for any combination of containerPort and hostPort.
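+
+For example, a minimal sketch of a Pod that exposes an SCTP containerPort without a hostPort (reusing the names from the earlier Pod example):
+
+```
+apiVersion: v1
+kind: Pod
+metadata:
+  name: mypod
+spec:
+  containers:
+  - name: container-1
+    image: mycontainerimg
+    ports:
+    - name: diameter
+      protocol: SCTP
+      containerPort: 80
+```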
+
+#### SCTP in NetworkPolicy
+The Kubernetes API modification for NetworkPolicy is straightforward: SCTP is added as an additional allowed value of the port protocol field in NetworkPolicy rules.
+
+In order to utilize the new protocol value, the network plugin must support it.
+
+#### Interworking with applications that use a user space SCTP stack
+
+##### Problem definition
+A userspace SCTP stack usually creates raw sockets with IPPROTO_SCTP, and as it is clearly highlighted in the [documentation of raw sockets][]:
+>Raw sockets may tap all IP protocols in Linux, even protocols like ICMP or TCP which have a protocol module in the kernel. In this case, the packets are passed to both the kernel module and the raw socket(s).
+
+That is, if both the kernel module (lksctp) and a userspace SCTP stack are active on the same node, both receive the incoming SCTP packets according to the current [kernel][] logic.
+
+The SCTP kernel module, however, handles packets that are actually destined for the raw socket as Out of the Blue (OOTB) packets according to the rules defined in [RFC4960][]: it sends an SCTP ABORT to the sender, and in that way aborts the connections of the userspace SCTP stack.
+
+As we can see, a userspace SCTP stack cannot co-exist with the SCTP kernel module (lksctp) on the same node. That is, the loading of the SCTP kernel module must be avoided on nodes where applications that use a userspace SCTP stack are planned to run. The loading of the SCTP kernel module is triggered when an application starts managing SCTP sockets via the standard socket API or via syscalls.
+
+In the past this problem was resolved by dedicating nodes to userspace SCTP applications: applications that would trigger the loading of the SCTP kernel module were not deployed on those nodes.
+
+##### The solution in the Kubernetes SCTP support implementation
+Our main task here is to provide the same node level isolation that was used in the past: that is, the option to dedicate some nodes to userspace SCTP applications, and to ensure that the actions performed by Kubernetes (kubelet, kube-proxy) do not load the SCTP kernel module on those dedicated nodes.
+
+On the Kubernetes side we solve this problem by not starting to listen on the SCTP ports defined for Services with ClusterIP, NodePort, or externalIP, nor in the case when containers with an SCTP hostPort are defined. In this way we avoid loading the kernel module due to Kubernetes actions.
+
+On the application side it is easy to separate pods that use a userspace SCTP stack from pods that use the kernel space SCTP stack: the usual nodeSelector label based mechanism or taints serve this very purpose.
+
+NOTE! The handling of TCP and UDP Services does not change on those dedicated nodes.
+
+We propose the following solution:
+
+We describe in the Kubernetes documentation the mutually exclusive nature of userspace and kernel space SCTP stacks, and we highlight that the required separation of the userspace SCTP stack applications and the kernel module users shall be achieved with the usual nodeSelector or taint based mechanisms.
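+
+A minimal sketch of such a separation, assuming a hypothetical node label and taint `sctp-stack=userspace` that the cluster administrator applies to the dedicated nodes (the image name is also illustrative):
+
+```
+apiVersion: v1
+kind: Pod
+metadata:
+  name: userspace-sctp-app
+spec:
+  # Assumed label on nodes dedicated to userspace SCTP applications
+  nodeSelector:
+    sctp-stack: userspace
+  # Assumed taint that keeps kernel-SCTP workloads off these nodes
+  tolerations:
+  - key: "sctp-stack"
+    operator: "Equal"
+    value: "userspace"
+    effect: "NoSchedule"
+  containers:
+  - name: app
+    image: my-userspace-sctp-app   # illustrative image name
+```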
+
+
+[documentation of raw sockets]: http://man7.org/linux/man-pages/man7/raw.7.html
+[kernel]: https://github.com/torvalds/linux/blob/0fbc4aeabc91f2e39e0dffebe8f81a0eb3648d97/net/ipv4/ip_input.c#L191
+
+### Risks and Mitigations
+
+## Graduation Criteria
+
+## Implementation History
+
+## Drawbacks [optional]
+
+## Alternatives [optional]
+