corrected, extended and improved the documentation of current resources regarding SIG activities.

author: Bernd Fondermann <bernd@brainlounge.de> 2018-10-18 10:16:26 +0200
committer: Bernd Fondermann <bernd@brainlounge.de> 2018-10-18 10:16:26 +0200
commit: dd9b87628aa1b3af385cea53e8800aaec6ac1c4b (patch)
tree: 12b49c9937b909aa8e51728f677ea041d90ba807
parent: 29d48a40def6dc081463bf3e5887509a0d180300 (diff)
1 files changed, 15 insertions, 38 deletions
diff --git a/sig-big-data/resources.md b/sig-big-data/resources.md
index cdac6324..a42b630e 100644
--- a/sig-big-data/resources.md
+++ b/sig-big-data/resources.md
@@ -8,20 +8,23 @@
 
 ##### Status
 
-Kubernetes is supported as a mainline Spark scheduler since [release 2.3][https://spark.apache.org/releases/spark-release-2-3-0.html], see [the detailed documentation][https://spark.apache.org/docs/latest/running-on-kubernetes.html].
-
-* [Spark on Kubernetes original Design Proposal](https://docs.google.com/document/d/1_bBzOZ8rKiOSjQg78DXOA3ZBIo_KkDJjqxVuq0yXdew/edit#)
-* [External Repository](https://github.com/apache-spark-on-k8s/spark)
+Kubernetes is supported as a mainline Spark scheduler since [release 2.3](https://spark.apache.org/releases/spark-release-2-3-0.html), see [the detailed documentation](https://spark.apache.org/docs/latest/running-on-kubernetes.html).
+That work was done after the [Spark on Kubernetes original Design Proposal](https://docs.google.com/document/d/1_bBzOZ8rKiOSjQg78DXOA3ZBIo_KkDJjqxVuq0yXdew/edit#)
+in the [apache-spark-on-k8s git repo](https://github.com/apache-spark-on-k8s/spark).
 
 ##### Activities 
 
+Enhancements are underway, with a good overview given [in this blog post](https://databricks.com/blog/2018/09/26/whats-new-for-apache-spark-on-kubernetes-in-the-upcoming-apache-spark-2-4-release.html).
+
 * Work is underway for Spark 2.4 to improve support and integration with HDFS.
-  * Design Document [How Spark on Kubernetes will access Secure HDFS][https://docs.google.com/document/d/1RBnXD9jMDjGonOdKJ2bA1lN4AAV_1RwpU_ewFuCNWKg/edit#heading=h.verdza2f4fyd]  
+  * Design Document: [How Spark on Kubernetes will access Secure HDFS](https://docs.google.com/document/d/1RBnXD9jMDjGonOdKJ2bA1lN4AAV_1RwpU_ewFuCNWKg/edit#heading=h.verdza2f4fyd)
 * shuffle service design
+  * Design Document [Improving Spark Shuffle Reliability](https://docs.google.com/document/d/1uCkzGGVG17oGC6BJ75TpzLAZNorvrAU3FRd2X-rVHSM/edit)
+  * JIRA issue [SPARK-25299: Use remote storage for persisting shuffle data](https://issues.apache.org/jira/browse/SPARK-25299)
 
 ### HDFS
 
-[Apache Hadoop HDFS][https://hadoop.apache.org/hdfs] is a distributed file system, the persistence layer for Hadoop.
+[Apache Hadoop HDFS](https://hadoop.apache.org/hdfs) is a distributed file system, the persistence layer for Hadoop.
 
 ##### Status
 
@@ -34,11 +37,11 @@ TODO, e.g. "No release yet."
 
 ### Airflow
 
-[Apache Airflow][https://airflow.apache.org] is a platform to programmatically author, schedule and monitor workflows.
+[Apache Airflow](https://airflow.apache.org) is a platform to programmatically author, schedule and monitor workflows.
 
 ##### Status
 
-The [Kubernetes executor][https://airflow.apache.org/kubernetes.html]  has been introduced with Airflow [release 1.10.0][https://github.com/apache/incubator-airflow/blob/master/CHANGELOG.txt]  with support of Kubernetes 1.10. 
+The [Kubernetes executor](https://airflow.apache.org/kubernetes.html)  has been introduced with Airflow [release 1.10.0](https://github.com/apache/incubator-airflow/blob/master/CHANGELOG.txt)  with support of Kubernetes 1.10. 
 
 ##### Activities
 
@@ -46,39 +49,13 @@ The [Kubernetes executor][https://airflow.apache.org/kubernetes.html]  has been
 
 ### Flink
 
-[Apache Flink][https://flink.apache.org] is a distributed data processing framework.
-
-##### Status
-
-Flink 1.6 supports [running a session or job cluster on Kubernetes][https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html].
-
-##### Activities
-
-* [Native support for Kubernetes as a Flink runtime][https://issues.apache.org/jira/browse/FLINK-9953] 
-* [Lyft is working on an operator][https://lists.apache.org/thread.html/aa941030440c1d9e34c35c0caf5ddd2456755337fc34a4edebb32929@%3Cdev.flink.apache.org%3E] 
-
-### Kafka
-
-[Apache Kafka][https://kafka.apache.org/] is a distributed streaming platform.
+[Apache Flink](https://flink.apache.org) is a distributed data processing framework.
 
 ##### Status
 
-Confluent is working on an operator for Kafka.
-
-##### Activities   
-
-* [Confluent blog post][https://www.confluent.io/blog/getting-started-apache-kafka-kubernetes/] 
-* [Confluent operator landing page][https://www.confluent.io/confluent-operator/] 
-
-### Pulsar
-
-[Apache Pulsar][https://pulsar.apache.org] is an open-source distributed pub-sub messaging system.
-
-##### Status
-
-Pulsar supports [running on Kubernetes][https://pulsar.apache.org/docs/latest/deployment/Kubernetes/]. 
+Flink 1.6 supports [running a session or job cluster on Kubernetes](https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html).
 
 ##### Activities
 
-No current activities known.
-
+* [Native support for Kubernetes as a Flink runtime](https://issues.apache.org/jira/browse/FLINK-9953) 
+* [Lyft is working on an operator](https://lists.apache.org/thread.html/aa941030440c1d9e34c35c0caf5ddd2456755337fc34a4edebb32929@%3Cdev.flink.apache.org%3E)