author Marc Tamsky <tamsky@users.noreply.github.com> 2015-04-30 22:58:33 -0700
committer Marc Tamsky <tamsky@users.noreply.github.com> 2015-04-30 22:58:33 -0700
commit 915f099020547e911ee8b7badfc4e130d5cf8da3
tree 2c8212e18e9140f4db7d2f079757ce7a2d5f56cc
parent 35bb6a1e9891657621ab3ae359e1b2518de98356
React to failure by growing the remaining clusters
-rw-r--r-- federation.md 6
1 files changed, 5 insertions, 1 deletions
diff --git a/federation.md b/federation.md
index df9f37eb..e261833e 100644
--- a/federation.md
+++ b/federation.md
@@ -222,10 +222,14 @@ initial implementation targeting single cloud provider only.
1. Auto-scaling (not yet available) in the remaining clusters takes
care of it for me automagically as the additional failed-over
traffic arrives (with some latency).
+1. I manually specify "additional resources to be provisioned" per
+ remaining cluster, possibly proportional to both the remaining functioning resources
+ and the unavailable resources in the failed cluster(s).
+ (All the benefits of over-provisioning, without expensive idle resources.)
Doing nothing (i.e. forcing users to choose between 1 and 2 on their
own) is probably an OK starting point. Kubernetes autoscaling can get
-us to three at some later date.
+us to 3 at some later date.
Up to this point, this use case ("Unavailability Zones") seems materially different from all the others above. It does not require dynamic cross-cluster service migration (we assume that the service is already running in more than one cluster when the failure occurs). Nor does it necessarily involve cross-cluster service discovery or location affinity. As a result, I propose that we address this use case somewhat independently of the others (although I strongly suspect that it will become substantially easier once we've solved the others).
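
As a rough illustration of the manual provisioning option this commit adds to the list, one possible reading of "proportional" is to split the capacity lost in the failed cluster(s) across the surviving clusters according to each survivor's share of the functioning capacity. The sketch below assumes that reading. The function name `extra_capacity`, the cluster names, and the abstract capacity units are hypothetical and do not come from the federation proposal.

```python
def extra_capacity(remaining, failed):
    """Split lost capacity across surviving clusters (illustrative only).

    remaining: dict mapping a (hypothetical) cluster name to its currently
               functioning capacity, in arbitrary units.
    failed:    total capacity that became unavailable in the failed cluster(s).

    Each surviving cluster receives a share of `failed` proportional to its
    share of the total functioning capacity.
    """
    total = sum(remaining.values())
    if total <= 0:
        raise ValueError("no functioning capacity remains")
    return {name: failed * cap / total for name, cap in remaining.items()}


# Example: a cluster worth 20 units fails; two clusters remain.
plan = extra_capacity({"cluster-a": 30, "cluster-b": 10}, failed=20)
print(plan)
```

Under this assumption the extra resources always sum to the lost capacity, so nothing sits idle before a failure occurs, which matches the "without expensive idle resources" remark in the added list item. Other weightings (for example, by spare headroom rather than by current size) would be equally consistent with the text.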