{"id":306,"date":"2018-10-16T08:36:08","date_gmt":"2018-10-16T08:36:08","guid":{"rendered":"https:\/\/www.appservgrid.com\/paw93\/?p=306"},"modified":"2018-10-16T20:50:08","modified_gmt":"2018-10-16T20:50:08","slug":"kubedirector-the-easy-way-to-run-complex-stateful-applications-on-kubernetes","status":"publish","type":"post","link":"https:\/\/www.appservgrid.com\/paw93\/index.php\/2018\/10\/16\/kubedirector-the-easy-way-to-run-complex-stateful-applications-on-kubernetes\/","title":{"rendered":"KubeDirector: The easy way to run complex stateful applications on Kubernetes"},"content":{"rendered":"<p>&nbsp;<\/p>\n<h3><a href=\"https:\/\/kubernetes.io\/blog\/2018\/10\/03\/kubedirector-the-easy-way-to-run-complex-stateful-applications-on-kubernetes\/\">KubeDirector: The easy way to run complex stateful applications on Kubernetes<\/a><\/h3>\n<p>Author: Thomas Phelan (BlueData)<\/p>\n<p>KubeDirector is an open source project designed to make it easy to run complex stateful scale-out application clusters on Kubernetes. KubeDirector is built using the custom resource definition (CRD) framework and leverages the native Kubernetes API extensions and design philosophy. This enables transparent integration with Kubernetes user\/resource management as well as existing clients and tools.<\/p>\n<p>We recently <a href=\"https:\/\/medium.com\/@thomas_phelan\/operation-stateful-introducing-bluek8s-and-kubernetes-director-aa204952f619\/\" target=\"_blank\" rel=\"noopener\">introduced the KubeDirector project<\/a>, as part of a broader open source Kubernetes initiative we call BlueK8s. I\u2019m happy to announce that the pre-alpha<br \/>\ncode for <a href=\"https:\/\/github.com\/bluek8s\/kubedirector\/\" target=\"_blank\" rel=\"noopener\">KubeDirector<\/a> is now available. 
And in this blog post, I\u2019ll show how it works.<\/p>\n<p>KubeDirector provides the following capabilities:<\/p>\n<ul>\n<li>The ability to run non-cloud native stateful applications on Kubernetes without modifying the code. In other words, it\u2019s not necessary to decompose these existing applications to fit a microservices design pattern.<\/li>\n<li>Native support for preserving application-specific configuration and state.<\/li>\n<li>An application-agnostic deployment pattern, minimizing the time to onboard new stateful applications to Kubernetes.<\/li>\n<\/ul>\n<p>KubeDirector enables data scientists familiar with data-intensive distributed applications such as Hadoop, Spark, Cassandra, TensorFlow, Caffe2, etc. to run these applications on Kubernetes \u2013 with a minimal learning curve and no need to write Go code. The applications controlled by KubeDirector are defined by some basic metadata and an associated package of configuration artifacts. The application metadata is referred to as a KubeDirectorApp resource.<\/p>\n<p>To understand the components of KubeDirector, clone the repository on <a href=\"https:\/\/github.com\/bluek8s\/kubedirector\/\" target=\"_blank\" rel=\"noopener\">GitHub<\/a> using a command similar to:<\/p>\n<p>git clone https:\/\/&lt;userid&gt;@github.com\/bluek8s\/kubedirector<\/p>\n<p>The KubeDirectorApp definition for the Spark 2.2.1 application is located in the file kubedirector\/deploy\/example_catalog\/cr-app-spark221e2.json.<\/p>\n<p>~&gt; cat kubedirector\/deploy\/example_catalog\/cr-app-spark221e2.json<br \/>\n{<br \/>\n&quot;apiVersion&quot;: &quot;kubedirector.bluedata.io\/v1alpha1&quot;,<br \/>\n&quot;kind&quot;: &quot;KubeDirectorApp&quot;,<br \/>\n&quot;metadata&quot;: {<br \/>\n&quot;name&quot;: &quot;spark221e2&quot;<br \/>\n},<br \/>\n&quot;spec&quot;: {<br \/>\n&quot;systemctlMounts&quot;: true,<br \/>\n&quot;config&quot;: {<br \/>\n&quot;node_services&quot;: [<br \/>\n{<br
\/>\n&quot;service_ids&quot;: [<br \/>\n&quot;ssh&quot;,<br \/>\n&quot;spark&quot;,<br \/>\n&quot;spark_master&quot;,<br \/>\n&quot;spark_worker&quot;<br \/>\n],<br \/>\n\u2026<\/p>\n<p>The configuration of an application cluster is referred to as a KubeDirectorCluster resource. The KubeDirectorCluster definition for a sample Spark 2.2.1 cluster is located in the file kubedirector\/deploy\/example_clusters\/cr-cluster-spark221.e1.yaml.<\/p>\n<p>~&gt; cat kubedirector\/deploy\/example_clusters\/cr-cluster-spark221.e1.yaml<br \/>\napiVersion: &quot;kubedirector.bluedata.io\/v1alpha1&quot;<br \/>\nkind: &quot;KubeDirectorCluster&quot;<br \/>\nmetadata:<br \/>\n&nbsp;&nbsp;name: &quot;spark221e2&quot;<br \/>\nspec:<br \/>\n&nbsp;&nbsp;app: spark221e2<br \/>\n&nbsp;&nbsp;roles:<br \/>\n&nbsp;&nbsp;- name: controller<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;replicas: 1<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;resources:<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;requests:<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;memory: &quot;4Gi&quot;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cpu: &quot;2&quot;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;limits:<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;memory: &quot;4Gi&quot;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cpu: &quot;2&quot;<br \/>\n&nbsp;&nbsp;- name: worker<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;replicas: 2<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;resources:<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;requests:<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;memory: &quot;4Gi&quot;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cpu: &quot;2&quot;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;limits:<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;memory: &quot;4Gi&quot;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cpu: &quot;2&quot;<br \/>\n&nbsp;&nbsp;- name: jupyter<br \/>\n\u2026<\/p>\n<h2>Running Spark on Kubernetes with KubeDirector<\/h2>\n<p>With KubeDirector, it\u2019s easy to run Spark clusters on Kubernetes.<\/p>\n<p>First, verify that Kubernetes (version 1.9 or later) is running, using the command kubectl version<\/p>\n<p>~&gt; kubectl version<br \/>\nClient Version: version.Info<br \/>\nServer Version: version.Info<\/p>\n<p>Deploy the KubeDirector service and the example KubeDirectorApp resource definitions with the commands:<\/p>\n<p>cd kubedirector<br \/>\nmake deploy<\/p>\n<p>These will start the KubeDirector pod:<\/p>\n<p>~&gt; kubectl get pods<br \/>\nNAME READY STATUS RESTARTS AGE<br \/>\nkubedirector-58cf59869-qd9hb 1\/1 Running 0 1m<\/p>\n<p>List the installed 
KubeDirector applications with kubectl get KubeDirectorApp<\/p>\n<p>~&gt; kubectl get KubeDirectorApp<br \/>\nNAME AGE<br \/>\ncassandra311 30m<br \/>\nspark211up 30m<br \/>\nspark221e2 30m<\/p>\n<p>Now you can launch a Spark 2.2.1 cluster using the example KubeDirectorCluster file and the<br \/>\nkubectl create -f deploy\/example_clusters\/cr-cluster-spark211up.yaml command.<br \/>\nVerify that the Spark cluster has been started:<\/p>\n<p>~&gt; kubectl get pods<br \/>\nNAME READY STATUS RESTARTS AGE<br \/>\nkubedirector-58cf59869-djdwl 1\/1 Running 0 19m<br \/>\nspark221e2-controller-zbg4d-0 1\/1 Running 0 23m<br \/>\nspark221e2-jupyter-2km7q-0 1\/1 Running 0 23m<br \/>\nspark221e2-worker-4gzbz-0 1\/1 Running 0 23m<br \/>\nspark221e2-worker-4gzbz-1 1\/1 Running 0 23m<\/p>\n<p>The running services now include the Spark services:<\/p>\n<p>~&gt; kubectl get service<br \/>\nNAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE<br \/>\nkubedirector ClusterIP 10.98.234.194 &lt;none&gt; 60000\/TCP 1d<br \/>\nkubernetes ClusterIP 10.96.0.1 &lt;none&gt; 443\/TCP 1d<br \/>\nsvc-spark221e2-5tg48 ClusterIP None &lt;none&gt; 8888\/TCP 21s<br \/>\nsvc-spark221e2-controller-tq8d6-0 NodePort 10.104.181.123 &lt;none&gt; 22:30534\/TCP,8080:31533\/TCP,7077:32506\/TCP,8081:32099\/TCP 20s<br \/>\nsvc-spark221e2-jupyter-6989v-0 NodePort 10.105.227.249 &lt;none&gt; 22:30632\/TCP,8888:30355\/TCP 20s<br \/>\nsvc-spark221e2-worker-d9892-0 NodePort 10.107.131.165 &lt;none&gt; 22:30358\/TCP,8081:32144\/TCP 20s<br \/>\nsvc-spark221e2-worker-d9892-1 NodePort 10.110.88.221 &lt;none&gt; 22:30294\/TCP,8081:31436\/TCP 20s<\/p>\n<p>Pointing the browser at port 31533 connects to the Spark Master UI:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/d33wubrfki0l68.cloudfront.net\/5410ad39a3205e8470dac3bd0f36aa4c704713f9\/af429\/images\/blog\/2018-10-03-kubedirector\/kubedirector.png\" alt=\"kubedirector\" \/><\/p>\n<p>That\u2019s all there is to it!<br \/>\nIn fact, in the example above we also deployed a 
Jupyter notebook along with the Spark cluster.<\/p>\n<p>To start another application (e.g. Cassandra), just specify another KubeDirectorCluster file:<\/p>\n<p>kubectl create -f deploy\/example_clusters\/cr-cluster-cassandra311.yaml<\/p>\n<p>See the running Cassandra cluster:<\/p>\n<p>~&gt; kubectl get pods<br \/>\nNAME READY STATUS RESTARTS AGE<br \/>\ncassandra311-seed-v24r6-0 1\/1 Running 0 1m<br \/>\ncassandra311-seed-v24r6-1 1\/1 Running 0 1m<br \/>\ncassandra311-worker-rqrhl-0 1\/1 Running 0 1m<br \/>\ncassandra311-worker-rqrhl-1 1\/1 Running 0 1m<br \/>\nkubedirector-58cf59869-djdwl 1\/1 Running 0 1d<br \/>\nspark221e2-controller-tq8d6-0 1\/1 Running 0 22m<br \/>\nspark221e2-jupyter-6989v-0 1\/1 Running 0 22m<br \/>\nspark221e2-worker-d9892-0 1\/1 Running 0 22m<br \/>\nspark221e2-worker-d9892-1 1\/1 Running 0 22m<\/p>\n<p>Now you have a Spark cluster (with a Jupyter notebook) and a Cassandra cluster running on Kubernetes.<br \/>\nUse kubectl get service to see the set of services.<\/p>\n<p>~&gt; kubectl get service<br \/>\nNAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE<br \/>\nkubedirector ClusterIP 10.98.234.194 &lt;none&gt; 60000\/TCP 1d<br \/>\nkubernetes ClusterIP 10.96.0.1 &lt;none&gt; 443\/TCP 1d<br \/>\nsvc-cassandra311-seed-v24r6-0 NodePort 10.96.94.204 &lt;none&gt; 22:31131\/TCP,9042:30739\/TCP 3m<br \/>\nsvc-cassandra311-seed-v24r6-1 NodePort 10.106.144.52 &lt;none&gt; 22:30373\/TCP,9042:32662\/TCP 3m<br \/>\nsvc-cassandra311-vhh29 ClusterIP None &lt;none&gt; 8888\/TCP 3m<br \/>\nsvc-cassandra311-worker-rqrhl-0 NodePort 10.109.61.194 &lt;none&gt; 22:31832\/TCP,9042:31962\/TCP 3m<br \/>\nsvc-cassandra311-worker-rqrhl-1 NodePort 10.97.147.131 &lt;none&gt; 22:31454\/TCP,9042:31170\/TCP 3m<br \/>\nsvc-spark221e2-5tg48 ClusterIP None &lt;none&gt; 8888\/TCP 24m<br \/>\nsvc-spark221e2-controller-tq8d6-0 NodePort 10.104.181.123 &lt;none&gt; 22:30534\/TCP,8080:31533\/TCP,7077:32506\/TCP,8081:32099\/TCP 24m<br \/>\nsvc-spark221e2-jupyter-6989v-0 NodePort 
10.105.227.249 &lt;none&gt; 22:30632\/TCP,8888:30355\/TCP 24m<br \/>\nsvc-spark221e2-worker-d9892-0 NodePort 10.107.131.165 &lt;none&gt; 22:30358\/TCP,8081:32144\/TCP 24m<br \/>\nsvc-spark221e2-worker-d9892-1 NodePort 10.110.88.221 &lt;none&gt; 22:30294\/TCP,8081:31436\/TCP 24m<\/p>\n<h2>Get Involved<\/h2>\n<p>KubeDirector is a fully open source, Apache v2 licensed project \u2013 the first of multiple open source projects within a broader initiative we call BlueK8s.<br \/>\nThe pre-alpha code for KubeDirector has just been released, and we would love for you to join the growing community of developers, contributors, and adopters.<br \/>\nFollow <a href=\"https:\/\/twitter.com\/BlueK8s\/\" target=\"_blank\" rel=\"noopener\">@BlueK8s<\/a> on Twitter and get involved through these channels:<\/p>\n<ul>\n<li>KubeDirector <a href=\"https:\/\/join.slack.com\/t\/bluek8s\/shared_invite\/enQtNDUwMzkwODY5OTM4LTRhYmRmZmE4YzY3OGUzMjA1NDg0MDVhNDQ2MGNkYjRhM2RlMDNjMTI1NDQyMjAzZGVlMDFkNThkNGFjZGZjMGY\/\" target=\"_blank\" rel=\"noopener\">chat room on Slack<\/a><\/li>\n<li>KubeDirector <a href=\"https:\/\/github.com\/bluek8s\/kubedirector\/\" target=\"_blank\" rel=\"noopener\">GitHub repo<\/a><\/li>\n<\/ul>\n<p><a href=\"https:\/\/kubernetes.io\/blog\/2018\/10\/03\/kubedirector-the-easy-way-to-run-complex-stateful-applications-on-kubernetes\/\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&nbsp; KubeDirector: The easy way to run complex stateful applications on Kubernetes Author: Thomas Phelan (BlueData) KubeDirector is an open source project designed to make it easy to run complex stateful scale-out application clusters on Kubernetes. 
KubeDirector is built using the custom resource definition (CRD) framework and leverages the native Kubernetes API extensions and design &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.appservgrid.com\/paw93\/index.php\/2018\/10\/16\/kubedirector-the-easy-way-to-run-complex-stateful-applications-on-kubernetes\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;KubeDirector: The easy way to run complex stateful applications on Kubernetes&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-306","post","type-post","status-publish","format-standard","hentry","category-kubernetes"],"_links":{"self":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/306","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/comments?post=306"}],"version-history":[{"count":1,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/306\/revisions"}],"predecessor-version":[{"id":412,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/306\/revisions\/412"}],"wp:attachment":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/media?parent=306"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/categories?post=306"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.appservgrid.com
\/paw93\/index.php\/wp-json\/wp\/v2\/tags?post=306"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}