From 39569087f2bcd774e5e3d8c83354add7a1885337 Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Wed, 17 Oct 2018 14:57:44 -0400 Subject: KEP: Make kube-proxy service abstraction optional --- keps/NEXT_KEP_NUMBER | 2 +- .../0031-20181017-kube-proxy-services-optional.md | 90 ++++++++++++++++++++++ 2 files changed, 91 insertions(+), 1 deletion(-) create mode 100644 keps/sig-network/0031-20181017-kube-proxy-services-optional.md diff --git a/keps/NEXT_KEP_NUMBER b/keps/NEXT_KEP_NUMBER index e85087af..f5c89552 100644 --- a/keps/NEXT_KEP_NUMBER +++ b/keps/NEXT_KEP_NUMBER @@ -1 +1 @@ -31 +32 diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md new file mode 100644 index 00000000..6e5e6e8b --- /dev/null +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -0,0 +1,90 @@ +--- +kep-number: 0 +title: Make kube-proxy service abstraction optional +authors: + - "@bradhoekstra" +owning-sig: sig-network +participating-sigs: +reviewers: + - TBD +approvers: + - TBD +editor: "@bradhoekstra" +creation-date: 2018-10-17 +last-updated: 2018-10-17 +status: provisional +see-also: +replaces: +superseded-by: +--- + +# Make kube-proxy service abstraction optional + +## Table of Contents + +* [Table of Contents](#table-of-contents) +* [Summary](#summary) +* [Motivation](#motivation) + * [Goals](#goals) + * [Non-Goals](#non-goals) +* [Proposal](#proposal) + * [User Stories](#user-stories) + * [Story 1](#story-1) + * [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints) + * [Risks and Mitigations](#risks-and-mitigations) +* [Graduation Criteria](#graduation-criteria) + +## Summary + +In a cluster that has a service mesh a lot of the work being done by kube-proxy is redundant and wasted. +Specifically, services that are only reached via other services in the mesh will never use the service abstaction implemented by kube-proxy in iptables (or ipvs). 
+By informing the kube-proxy of this, we can lighten the work it is doing and the burden on its proxy backend. + +## Motivation + +The motivation for the enhancement is to allow higher scalability in large clusters with lots of services that are making use of a service mesh. + +### Goals + +The goal is to reduce the load on: +* The apiserver sending all services and endpoints to all kube-proxy pods +* The kube-proxy having to deserialize and process all services and endpoints +* The backend system (e.g. iptables) for whichever proxy mode kube-proxy is using + +### Non-Goals + +* Making sure the service is still routable via the service mesh + +## Proposal + +### User Stories + +#### Story 1 + +As a cluster operator, operating a cluster using a service mesh I want to be able to disable the kube-proxy service implementation for services in that mesh to reduce overall load on the whole cluster + +### Implementation Details/Notes/Constraints + +It is important for overall scalability that kube-proxy does not watch for service/endpoint changes that it is not going to affect. This can save a lot of load on the apiserver, networking, and kube-proxy itself by never requesting the updates in the first place. As such, annotating the services directly is considered insufficient as the kube-proxy would still have to watch for changed to the service. + +The proposal is to make this feature available at the namespace level: + +We will support a new label for namespaces: networking.k8s.io/kube-proxy=disabled + +kube-proxy will be modified to watch all namespaces and stop watching for services/endpoints in namespaces with the above label. + +The following cases should be tested. 
In each case, make sure that services are added/removed from iptables (or other) as expected: +* Adding/removing services from namespaces with and without the above label +* Adding/removing the above label from namespaces with existing services + +### Risks and Mitigations + +We will keep kube-proxy enabled by default, and only disable it when the cluster operator specifically asks to do so. + +## Graduation Criteria + +N/A + +## Implementation History + +- 2018-10-17 - This KEP is created -- cgit v1.2.3 From d0b4d300b278c5bf0210c8a70397a19af697618b Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Mon, 29 Oct 2018 15:58:59 -0400 Subject: Fill in more details --- .../0031-20181017-kube-proxy-services-optional.md | 55 ++++++++++++++++++++-- 1 file changed, 50 insertions(+), 5 deletions(-) diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md index 6e5e6e8b..a1fef246 100644 --- a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -31,13 +31,16 @@ superseded-by: * [User Stories](#user-stories) * [Story 1](#story-1) * [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints) + * [Design](#design) + * [Considerations](#considerations) + * [Testing](#testing) * [Risks and Mitigations](#risks-and-mitigations) * [Graduation Criteria](#graduation-criteria) ## Summary In a cluster that has a service mesh a lot of the work being done by kube-proxy is redundant and wasted. -Specifically, services that are only reached via other services in the mesh will never use the service abstaction implemented by kube-proxy in iptables (or ipvs). +Specifically, services that are only reached via other services in the mesh will never use the service abstraction implemented by kube-proxy in iptables (or ipvs). 
By informing the kube-proxy of this, we can lighten the work it is doing and the burden on its proxy backend. ## Motivation @@ -54,6 +57,7 @@ The goal is to reduce the load on: ### Non-Goals * Making sure the service is still routable via the service mesh +* Preserving any kube-proxy functionality for any intentionally disabled Service, including but not limited to: externalIPs, external LB routing, nodePorts, externalTrafficPolicy, healthCheckNodePort, UDP, SCTP ## Proposal @@ -65,21 +69,61 @@ As a cluster operator, operating a cluster using a service mesh I want to be abl ### Implementation Details/Notes/Constraints +#### Overview + It is important for overall scalability that kube-proxy does not watch for service/endpoint changes that it is not going to affect. This can save a lot of load on the apiserver, networking, and kube-proxy itself by never requesting the updates in the first place. As such, annotating the services directly is considered insufficient as the kube-proxy would still have to watch for changed to the service. -The proposal is to make this feature available at the namespace level: +The proposal is to make this feature available at the namespace level. We will support a new label for namespaces: `networking.k8s.io/service-proxy=disabled` + +When this label is set, kube-proxy will behave as if services in that namespace do not exist. None of the functionality that kube-proxy provides will be available for services in that namespace. + +It is expected that this feature will mainly be used on large clusters with lots (>1000) of services. Any use of this feature in a smaller cluster will have negligible impact. 
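The namespace-level opt-out proposed at this stage of the KEP could be expressed directly in a namespace manifest; this is an illustrative sketch only (the namespace name `mesh-apps` is hypothetical, and this label form is the one proposed here, later revised):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mesh-apps          # hypothetical namespace holding mesh-managed services
  labels:
    # Per this proposal: kube-proxy ignores Services/Endpoints in this namespace
    networking.k8s.io/service-proxy: disabled
```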
+ +The envisioned cluster that will make use of this feature looks something like the following: +* Most/all traffic from outside the cluster is handled by gateways, such that each service in the cluster does not need a nodePort +* These small number of entry points into the cluster are a part of the service mesh +* There are many micro-services in the cluster, all a part of the service mesh, that are only accessed from inside the service mesh + * These services are in a separate namespace from the gateways + +#### Design + +Currently, when ProxyServer starts up it creates informers for all Service (ServiceConfig) and Endpoints (EndpointsConfig) objects using a single shared informer factory. The new design will make these previous objects be per-namespace, and only listen on namespaces that are not 'disabled'. + +The ProxyServer type will be updated with the following new methods: +* func (s *ProxyServer) StartWatchingNamespace(ns string) + * Check if namespace is currently watched, if it is then return + * Create a shared informer factory configured with the namespace + * Create a ServiceConfig and EndpointsConfig object using the shared informer factory +* func (s *ProxyServer) StopWatchingNamespace(ns string) + * Check if namespace is currently watched, if it is not then return + * Stop the ServiceConfig and EndpointsConfig for that namespace + * Send deletion events for all objects those configs knew about + * Delete the config objects + +At startup time, ProxyServer will create an informer for all Namespace objects. 
+* When a namespace objects is created or updated: + * Check for the above label, and if it is not set or is not 'disabled': + * StartWatchingNamespace() + * Else: + * StopWatchingNamespace() +* When a namespace object is deleted: + * StopWatchingNamespace() + +#### Considerations -We will support a new label for namespaces: networking.k8s.io/kube-proxy=disabled +kube-proxy has logic in it right now to not sync rules until the config objects have been synced. Care should be taken to make sure this logic still works, and that the data is only considered synced when the Namespace informer and all ServiceConfig and EndpointsConfig objects are synced. -kube-proxy will be modified to watch all namespaces and stop watching for services/endpoints in namespaces with the above label. +#### Testing The following cases should be tested. In each case, make sure that services are added/removed from iptables (or other) as expected: * Adding/removing services from namespaces with and without the above label * Adding/removing the above label from namespaces with existing services +* Deleting a namespace with services with and without the above label +* Having a label value other than 'disabled', which should behave as if the label is not set ### Risks and Mitigations -We will keep kube-proxy enabled by default, and only disable it when the cluster operator specifically asks to do so. +We will keep the existing behaviour enabled by default, and only disable it when the cluster operator specifically asks to do so. 
## Graduation Criteria @@ -88,3 +132,4 @@ N/A ## Implementation History - 2018-10-17 - This KEP is created +- 2018-10-28 - KEP updated -- cgit v1.2.3 From f339ea09a91d856c13d9004f3961038a8f3448b8 Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Wed, 31 Oct 2018 15:35:27 -0400 Subject: Switch to labelling Services and Endpoints instead of Namespaces --- .../0031-20181017-kube-proxy-services-optional.md | 48 +++++++--------------- 1 file changed, 15 insertions(+), 33 deletions(-) diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md index a1fef246..d45e6bac 100644 --- a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -32,7 +32,6 @@ superseded-by: * [Story 1](#story-1) * [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints) * [Design](#design) - * [Considerations](#considerations) * [Testing](#testing) * [Risks and Mitigations](#risks-and-mitigations) * [Graduation Criteria](#graduation-criteria) @@ -71,11 +70,11 @@ As a cluster operator, operating a cluster using a service mesh I want to be abl #### Overview -It is important for overall scalability that kube-proxy does not watch for service/endpoint changes that it is not going to affect. This can save a lot of load on the apiserver, networking, and kube-proxy itself by never requesting the updates in the first place. As such, annotating the services directly is considered insufficient as the kube-proxy would still have to watch for changed to the service. +It is important for overall scalability that kube-proxy does not receive data for Service/Endpoints objects that it is not going to affect. This can reduce load on the apiserver, networking, and kube-proxy itself by never receiving the updates in the first place. -The proposal is to make this feature available at the namespace level. 
We will support a new label for namespaces: `networking.k8s.io/service-proxy=disabled` +The proposal is to make this feature available by annotating the Service object with this label: `kube-proxy.kubernetes.io/disabled=true`. The associated Endpoints object will automatically inherit that label from the Service object as well. -When this label is set, kube-proxy will behave as if services in that namespace do not exist. None of the functionality that kube-proxy provides will be available for services in that namespace. +When this label is set, kube-proxy will behave as if that service does not exist. None of the functionality that kube-proxy provides will be available for that service. It is expected that this feature will mainly be used on large clusters with lots (>1000) of services. Any use of this feature in a smaller cluster will have negligible impact. @@ -83,43 +82,26 @@ The envisioned cluster that will make use of this feature looks something like t * Most/all traffic from outside the cluster is handled by gateways, such that each service in the cluster does not need a nodePort * These small number of entry points into the cluster are a part of the service mesh * There are many micro-services in the cluster, all a part of the service mesh, that are only accessed from inside the service mesh - * These services are in a separate namespace from the gateways #### Design -Currently, when ProxyServer starts up it creates informers for all Service (ServiceConfig) and Endpoints (EndpointsConfig) objects using a single shared informer factory. The new design will make these previous objects be per-namespace, and only listen on namespaces that are not 'disabled'. +Currently, when ProxyServer starts up it creates informers for all Service (ServiceConfig) and Endpoints (EndpointsConfig) objects using a single shared informer factory. 
-The ProxyServer type will be updated with the following new methods: -* func (s *ProxyServer) StartWatchingNamespace(ns string) - * Check if namespace is currently watched, if it is then return - * Create a shared informer factory configured with the namespace - * Create a ServiceConfig and EndpointsConfig object using the shared informer factory -* func (s *ProxyServer) StopWatchingNamespace(ns string) - * Check if namespace is currently watched, if it is not then return - * Stop the ServiceConfig and EndpointsConfig for that namespace - * Send deletion events for all objects those configs knew about - * Delete the config objects - -At startup time, ProxyServer will create an informer for all Namespace objects. -* When a namespace objects is created or updated: - * Check for the above label, and if it is not set or is not 'disabled': - * StartWatchingNamespace() - * Else: - * StopWatchingNamespace() -* When a namespace object is deleted: - * StopWatchingNamespace() - -#### Considerations - -kube-proxy has logic in it right now to not sync rules until the config objects have been synced. Care should be taken to make sure this logic still works, and that the data is only considered synced when the Namespace informer and all ServiceConfig and EndpointsConfig objects are synced. +The new design will simply add a LabelSelector filter to the shared informer factory, such that objects with the above label are filtered out by the API server: +```diff +- informerFactory := informers.NewSharedInformerFactory(s.Client, s.ConfigSyncPeriod) ++ informerFactory := informers.NewSharedInformerFactoryWithOptions(s.Client, s.ConfigSyncPeriod, ++ informers.WithTweakListOptions(func(options *v1meta.ListOptions) { ++ options.LabelSelector = "kube-proxy.kubernetes.io/disabled!=true" ++ })) +``` #### Testing The following cases should be tested. 
In each case, make sure that services are added/removed from iptables (or other) as expected: -* Adding/removing services from namespaces with and without the above label -* Adding/removing the above label from namespaces with existing services -* Deleting a namespace with services with and without the above label -* Having a label value other than 'disabled', which should behave as if the label is not set +* Adding/removing services/endpoints with and without the above label +* Adding/removing the above label from existing services/endpoints +* Having a label value other than 'true', which should behave as if the label is not set ### Risks and Mitigations -- cgit v1.2.3 From f3d8aedcf44ba4a425b32744cea7df4e57e7dbf4 Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Mon, 5 Nov 2018 14:27:39 -0500 Subject: Remove apiserver goal, be explicit about handling dynamic label changes --- keps/sig-network/0031-20181017-kube-proxy-services-optional.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md index d45e6bac..ce297523 100644 --- a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -49,7 +49,6 @@ The motivation for the enhancement is to allow higher scalability in large clust ### Goals The goal is to reduce the load on: -* The apiserver sending all services and endpoints to all kube-proxy pods * The kube-proxy having to deserialize and process all services and endpoints * The backend system (e.g. iptables) for whichever proxy mode kube-proxy is using @@ -70,12 +69,14 @@ As a cluster operator, operating a cluster using a service mesh I want to be abl #### Overview -It is important for overall scalability that kube-proxy does not receive data for Service/Endpoints objects that it is not going to affect. 
This can reduce load on the apiserver, networking, and kube-proxy itself by never receiving the updates in the first place. +It is important for overall scalability that kube-proxy does not receive data for Service/Endpoints objects that it is not going to affect. This can reduce load on the kube-proxy and the network by never receiving the updates in the first place. The proposal is to make this feature available by annotating the Service object with this label: `kube-proxy.kubernetes.io/disabled=true`. The associated Endpoints object will automatically inherit that label from the Service object as well. When this label is set, kube-proxy will behave as if that service does not exist. None of the functionality that kube-proxy provides will be available for that service. +kube-proxy will properly implement this label both as object creation and on dynamic addition/removal/updates of this label, either providing functionality or not for the service based on the latest version on the object. + It is expected that this feature will mainly be used on large clusters with lots (>1000) of services. Any use of this feature in a smaller cluster will have negligible impact. The envisioned cluster that will make use of this feature looks something like the following: @@ -96,6 +97,8 @@ The new design will simply add a LabelSelector filter to the shared informer fac + })) ``` +This code will also handle the dynamic label update case. When the label selector is matched (service is enabled) an 'add' event will be generated by the informer. When the label selector is not matched (service is disabled) a 'delete' event will be generated by the informer. + #### Testing The following cases should be tested. 
In each case, make sure that services are added/removed from iptables (or other) as expected: -- cgit v1.2.3 From 916672b0a4ec5f144db239b804eb9bfe8ce28f87 Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Mon, 5 Nov 2018 14:29:50 -0500 Subject: Spelling --- keps/sig-network/0031-20181017-kube-proxy-services-optional.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md index ce297523..646aeaa6 100644 --- a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -75,7 +75,7 @@ The proposal is to make this feature available by annotating the Service object When this label is set, kube-proxy will behave as if that service does not exist. None of the functionality that kube-proxy provides will be available for that service. -kube-proxy will properly implement this label both as object creation and on dynamic addition/removal/updates of this label, either providing functionality or not for the service based on the latest version on the object. +kube-proxy will properly implement this label both at object creation and on dynamic addition/removal/updates of this label, either providing functionality or not for the service based on the latest version on the object. It is expected that this feature will mainly be used on large clusters with lots (>1000) of services. Any use of this feature in a smaller cluster will have negligible impact. 
-- cgit v1.2.3 From 68029a440d1f48f257a948ad7cbe3d46bc14bfc3 Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Tue, 6 Nov 2018 10:12:16 -0500 Subject: Add more detail to the Overview section --- keps/sig-network/0031-20181017-kube-proxy-services-optional.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md index 646aeaa6..e7112ec0 100644 --- a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -69,6 +69,8 @@ As a cluster operator, operating a cluster using a service mesh I want to be abl #### Overview +In a cluster where a service is only accessed via other applications in the service mesh the work that kube-proxy does to program the proxy (e.g. iptables) for that service is duplicated and unused. The service mesh itself handles load balancing for the service VIP. This case is often true in the standard service mesh setup of utilizing ingress/egress gateways, such that services are not directly exposed outside the cluster. In this setup, application services rarely make use of other Service features such as externalIPs, external LB routing, nodePorts, externalTrafficPolicy, healthCheckNodePort, UDP, SCTP. We can optimize this cluster by giving kube-proxy a way to not have to perform the duplicate work for these services. + It is important for overall scalability that kube-proxy does not receive data for Service/Endpoints objects that it is not going to affect. This can reduce load on the kube-proxy and the network by never receiving the updates in the first place. The proposal is to make this feature available by annotating the Service object with this label: `kube-proxy.kubernetes.io/disabled=true`. The associated Endpoints object will automatically inherit that label from the Service object as well. 
-- cgit v1.2.3 From 74798ecb09d059c1aae64dfdb55f5395f7b2024f Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Tue, 6 Nov 2018 10:20:37 -0500 Subject: Add comment about frameworks that could use this feature --- keps/sig-network/0031-20181017-kube-proxy-services-optional.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md index e7112ec0..ec4eb942 100644 --- a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -86,6 +86,8 @@ The envisioned cluster that will make use of this feature looks something like t * These small number of entry points into the cluster are a part of the service mesh * There are many micro-services in the cluster, all a part of the service mesh, that are only accessed from inside the service mesh +Higher level frameworks built on top of service meshes, such as [Knative](https://github.com/knative/docs), will be able to enable this feature by default due to having a more controlled application/service model and being reliant on the service mesh. + #### Design Currently, when ProxyServer starts up it creates informers for all Service (ServiceConfig) and Endpoints (EndpointsConfig) objects using a single shared informer factory. 
-- cgit v1.2.3 From e8a7362d561cee83210a2586042491126fa4cefb Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Mon, 12 Nov 2018 11:08:29 -0500 Subject: Change feature label, add approver and reviewer --- .../0031-20181017-kube-proxy-services-optional.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md index ec4eb942..f36c32ff 100644 --- a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -6,12 +6,12 @@ authors: owning-sig: sig-network participating-sigs: reviewers: - - TBD + - "@freehan" approvers: - - TBD + - "@thockin" editor: "@bradhoekstra" creation-date: 2018-10-17 -last-updated: 2018-10-17 +last-updated: 2018-11-12 status: provisional see-also: replaces: @@ -73,7 +73,7 @@ In a cluster where a service is only accessed via other applications in the serv It is important for overall scalability that kube-proxy does not receive data for Service/Endpoints objects that it is not going to affect. This can reduce load on the kube-proxy and the network by never receiving the updates in the first place. -The proposal is to make this feature available by annotating the Service object with this label: `kube-proxy.kubernetes.io/disabled=true`. The associated Endpoints object will automatically inherit that label from the Service object as well. +The proposal is to make this feature available by annotating the Service object with this label: `service.kubernetes.io/alternative-service-proxy`. If this label key is set, with any value, the associated Endpoints object will automatically inherit that label from the Service object as well. When this label is set, kube-proxy will behave as if that service does not exist. None of the functionality that kube-proxy provides will be available for that service. 
@@ -97,7 +97,7 @@ The new design will simply add a LabelSelector filter to the shared informer fac - informerFactory := informers.NewSharedInformerFactory(s.Client, s.ConfigSyncPeriod) + informerFactory := informers.NewSharedInformerFactoryWithOptions(s.Client, s.ConfigSyncPeriod, + informers.WithTweakListOptions(func(options *v1meta.ListOptions) { -+ options.LabelSelector = "kube-proxy.kubernetes.io/disabled!=true" ++ options.LabelSelector = "!service.kubernetes.io/alternative-service-proxy" + })) ``` @@ -121,4 +121,4 @@ N/A ## Implementation History - 2018-10-17 - This KEP is created -- 2018-10-28 - KEP updated +- 2018-11-12 - KEP updated, including approver/reviewer -- cgit v1.2.3 From 8a3be45708c2b06c628e12ba2717327ebf31dac0 Mon Sep 17 00:00:00 2001 From: Brad Hoekstra Date: Tue, 20 Nov 2018 16:36:47 -0500 Subject: Change label key, add comments about alternate service proxy implementations. --- .../0031-20181017-kube-proxy-services-optional.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md index f36c32ff..9e77d4a2 100644 --- a/keps/sig-network/0031-20181017-kube-proxy-services-optional.md +++ b/keps/sig-network/0031-20181017-kube-proxy-services-optional.md @@ -1,5 +1,5 @@ --- -kep-number: 0 +kep-number: 31 title: Make kube-proxy service abstraction optional authors: - "@bradhoekstra" @@ -73,12 +73,14 @@ In a cluster where a service is only accessed via other applications in the serv It is important for overall scalability that kube-proxy does not receive data for Service/Endpoints objects that it is not going to affect. This can reduce load on the kube-proxy and the network by never receiving the updates in the first place. -The proposal is to make this feature available by annotating the Service object with this label: `service.kubernetes.io/alternative-service-proxy`. 
If this label key is set, with any value, the associated Endpoints object will automatically inherit that label from the Service object as well. +The proposal is to make this feature available by annotating the Service object with this label: `service.kubernetes.io/service-proxy-name`. If this label key is set, with any value, the associated Endpoints object will automatically inherit that label from the Service object as well. When this label is set, kube-proxy will behave as if that service does not exist. None of the functionality that kube-proxy provides will be available for that service. kube-proxy will properly implement this label both at object creation and on dynamic addition/removal/updates of this label, either providing functionality or not for the service based on the latest version on the object. +It is optional for other service proxy implementations (besides kube-proxy) to implement this feature. They may ignore this value and still remain conformant with kubernetes services. + It is expected that this feature will mainly be used on large clusters with lots (>1000) of services. Any use of this feature in a smaller cluster will have negligible impact. The envisioned cluster that will make use of this feature looks something like the following: @@ -97,7 +99,7 @@ The new design will simply add a LabelSelector filter to the shared informer fac - informerFactory := informers.NewSharedInformerFactory(s.Client, s.ConfigSyncPeriod) + informerFactory := informers.NewSharedInformerFactoryWithOptions(s.Client, s.ConfigSyncPeriod, + informers.WithTweakListOptions(func(options *v1meta.ListOptions) { -+ options.LabelSelector = "!service.kubernetes.io/alternative-service-proxy" ++ options.LabelSelector = "!service.kubernetes.io/service-proxy-name" + })) ``` @@ -108,11 +110,12 @@ This code will also handle the dynamic label update case. When the label selecto The following cases should be tested. 
In each case, make sure that services are added/removed from iptables (or other) as expected: * Adding/removing services/endpoints with and without the above label * Adding/removing the above label from existing services/endpoints -* Having a label value other than 'true', which should behave as if the label is not set ### Risks and Mitigations -We will keep the existing behaviour enabled by default, and only disable it when the cluster operator specifically asks to do so. +We will keep the existing behaviour enabled by default, and only disable the kube-proxy service proxy when the service contains this new label. + +This will have no effect on alternate service proxy implementations since they will not handle this label. ## Graduation Criteria -- cgit v1.2.3