{"id":1403,"date":"2019-03-05T14:35:40","date_gmt":"2019-03-05T14:35:40","guid":{"rendered":"https:\/\/www.appservgrid.com\/paw93\/?p=1403"},"modified":"2019-03-07T21:00:32","modified_gmt":"2019-03-07T21:00:32","slug":"deploying-elasticsearch-within-kubernetes","status":"publish","type":"post","link":"https:\/\/www.appservgrid.com\/paw93\/index.php\/2019\/03\/05\/deploying-elasticsearch-within-kubernetes\/","title":{"rendered":"Deploying Elasticsearch Within Kubernetes"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>Elasticsearch is an open-source search engine based on <a href=\"http:\/\/lucene.apache.org\">Apache Lucene<\/a> and developed by <a href=\"https:\/\/elastic.co\">Elastic<\/a>. It focuses on features like scalability, resilience, and performance, and companies all around the world, including Mozilla, Facebook, GitHub, Netflix, eBay, the New York Times, and others, use it every day. Elasticsearch is one of the most popular analytics platforms for large datasets and is present almost everywhere that you find a search engine. It uses a document-oriented approach when manipulating data, and it can process that data in near real time while a user is performing a search. It stores data in JSON and organizes data by index and type.<\/p>\n<p>If we draw analogies between the components of a traditional relational database and those of Elasticsearch, they look like this:<\/p>\n<ul>\n<li>Database or Table -&gt; Index<\/li>\n<li>Row\/Column -&gt; Document with properties<\/li>\n<\/ul>\n<h3>Elasticsearch Advantages<\/h3>\n<ul>\n<li>It originates from Apache Lucene, which provides the most robust full-text search capabilities of any open-source product.<\/li>\n<li>It uses a document-oriented architecture to store complex real-world entities as structured JSON documents. By default, it indexes all fields, which provides tremendous performance when searching.<\/li>\n<li>It doesn\u2019t require a predefined schema for its indices.
Documents add new fields by including them, which gives the freedom to add, remove, or change relevant fields without the downtime associated with a traditional database schema upgrade.<\/li>\n<li>It performs linguistic searches against documents, returning those that match the search condition. It scores the results using a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Tf\u2013idf\">TF-IDF<\/a>-based relevance algorithm (BM25 by default since version 5.0), bringing more relevant documents higher up in the list of results.<\/li>\n<li>It allows <em>fuzzy searching<\/em>, which helps find results even with misspelled search terms.<\/li>\n<li>It supports real-time search autocompletion, returning results while the user types their search query.<\/li>\n<li>It uses a RESTful API, exposing its power via a simple, lightweight interface.<\/li>\n<li>Elasticsearch executes complex queries with tremendous speed. It also caches queries, returning cached results for other requests that match a cached filter.<\/li>\n<li>It scales horizontally, making it possible to extend resources and balance the load between cluster nodes.<\/li>\n<li>It breaks indices into shards, and each shard can have any number of replicas.
Each node knows the location of every document in the cluster and routes requests internally as necessary to retrieve the data.<\/li>\n<\/ul>\n<h3>Terminology<\/h3>\n<p>Elasticsearch uses specific terms to define its components.<\/p>\n<ul>\n<li>Cluster: A collection of nodes that work together.<\/li>\n<li>Node: A single server that acts as part of the cluster, stores the data, and participates in the cluster\u2019s indexing and search capabilities.<\/li>\n<li>Index: A collection of documents with similar characteristics.<\/li>\n<li>Document: The basic unit of information that can be indexed.<\/li>\n<li>Shards: Indices are divided into multiple pieces called shards, which allows the index to scale horizontally.<\/li>\n<li>Replicas: Copies of index shards that provide redundancy and additional search capacity.<\/li>\n<\/ul>\n<h2>Prerequisites<\/h2>\n<p>To perform this demo, you need one of the following:<\/p>\n<ul>\n<li>An existing Rancher deployment and Kubernetes cluster, or<\/li>\n<li>Two nodes on which to deploy Rancher and Kubernetes, or<\/li>\n<li>A node on which to deploy Rancher and a Kubernetes cluster running on a hosted provider such as GKE.<\/li>\n<\/ul>\n<p>This article uses the Google Cloud Platform, but you may use any other provider or infrastructure.<\/p>\n<h2>Launch Rancher<\/h2>\n<p>If you don\u2019t already have a Rancher deployment, begin by launching one. The <a href=\"https:\/\/rancher.com\/quick-start\/\">quick start guide<\/a> covers the steps for doing so.<\/p>\n<h2>Launch a Cluster<\/h2>\n<p>Use Rancher to set up and configure your cluster according to the <a href=\"https:\/\/rancher.com\/docs\/rancher\/v2.x\/en\/cluster-provisioning\/\">guide most suited to your environment<\/a>.<\/p>\n<h2>Deploy Elasticsearch<\/h2>\n<p>If you are already comfortable with kubectl, you can apply the manifests directly.
If you prefer to use the Rancher user interface, scroll down for those instructions.<\/p>\n<p>We will deploy Elasticsearch as a <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/workloads\/controllers\/statefulset\/\">StatefulSet<\/a> with two Services: a headless service for communicating with the pods and another for interacting with Elasticsearch from outside of the Kubernetes cluster.<\/p>\n<h3>svc-cluster.yaml<\/h3>\n<pre><code># Headless service (clusterIP: None) used by the pods to discover each other\napiVersion: v1\nkind: Service\nmetadata:\n  name: elasticsearch-cluster\nspec:\n  clusterIP: None\n  selector:\n    app: es-cluster\n  ports:\n  - name: transport\n    port: 9300<\/code><\/pre>\n<pre><code>$ kubectl apply -f svc-cluster.yaml\nservice\/elasticsearch-cluster created<\/code><\/pre>\n<h3>svc-loadbalancer.yaml<\/h3>\n<pre><code># External load balancer that forwards port 80 to the Elasticsearch HTTP port\napiVersion: v1\nkind: Service\nmetadata:\n  name: elasticsearch-loadbalancer\nspec:\n  selector:\n    app: es-cluster\n  ports:\n  - name: http\n    port: 80\n    targetPort: 9200\n  type: LoadBalancer<\/code><\/pre>\n<pre><code>$ kubectl apply -f svc-loadbalancer.yaml\nservice\/elasticsearch-loadbalancer created<\/code><\/pre>\n<h3>es-sts-deployment.yaml<\/h3>\n<pre><code># Elasticsearch settings and JVM options shared by every pod\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: es-config\ndata:\n  elasticsearch.yml: |\n    cluster.name: my-elastic-cluster\n    network.host: &quot;0.0.0.0&quot;\n    bootstrap.memory_lock: false\n    discovery.zen.ping.unicast.hosts: elasticsearch-cluster\n    discovery.zen.minimum_master_nodes: 1\n    xpack.security.enabled: false\n    xpack.monitoring.enabled: false\n  ES_JAVA_OPTS: -Xms512m -Xmx512m\n---\napiVersion: apps\/v1\nkind: StatefulSet\nmetadata:\n  name: esnode\nspec:\n  # Must name the headless service defined in svc-cluster.yaml\n  serviceName: elasticsearch-cluster\n  replicas: 2\n  updateStrategy:\n    type: RollingUpdate\n  # apps\/v1 requires a selector that matches the pod template labels\n  selector:\n    matchLabels:\n      app: es-cluster\n  template:\n    metadata:\n      labels:\n        app: es-cluster\n    spec:\n      securityContext:\n        fsGroup: 1000\n      initContainers:\n      # Raise the mmap limit on the host, as Elasticsearch requires\n      - name: init-sysctl\n        image: busybox\n        imagePullPolicy: IfNotPresent\n        securityContext:\n          privileged: true\n        command: [&quot;sysctl&quot;, &quot;-w&quot;, &quot;vm.max_map_count=262144&quot;]\n      containers:\n      - name: elasticsearch\n        resources:\n          requests:\n            memory: 1Gi\n        securityContext:\n          privileged: true\n          runAsUser: 1000\n          capabilities:\n            add:\n            - IPC_LOCK\n            - SYS_RESOURCE\n        image: docker.elastic.co\/elasticsearch\/elasticsearch:6.5.0\n        env:\n        - name: ES_JAVA_OPTS\n          valueFrom:\n            configMapKeyRef:\n              name: es-config\n              key: ES_JAVA_OPTS\n        readinessProbe:\n          httpGet:\n            scheme: HTTP\n            path: \/_cluster\/health?local=true\n            port: 9200\n          initialDelaySeconds: 5\n        ports:\n        - containerPort: 9200\n          name: es-http\n        - containerPort: 9300\n          name: es-transport\n        volumeMounts:\n        - name: es-data\n          mountPath: \/usr\/share\/elasticsearch\/data\n        - name: elasticsearch-config\n          mountPath: \/usr\/share\/elasticsearch\/config\/elasticsearch.yml\n          subPath: elasticsearch.yml\n      volumes:\n      - name: elasticsearch-config\n        configMap:\n          name: es-config\n          items:\n          - key: elasticsearch.yml\n            path: elasticsearch.yml\n  # One 5Gi persistent volume is provisioned per pod\n  volumeClaimTemplates:\n  - metadata:\n      name: es-data\n    spec:\n      accessModes: [ &quot;ReadWriteOnce&quot; ]\n      resources:\n        requests:\n          storage: 5Gi<\/code><\/pre>\n<pre><code>$ kubectl apply -f es-sts-deployment.yaml\nconfigmap\/es-config created\nstatefulset.apps\/esnode created<\/code><\/pre>\n<h2>Deploy Elasticsearch via the Rancher UI<\/h2>\n<p>If you prefer, import each of the manifests above into your cluster via the
Rancher UI. The screenshots below show the process for each of them.<\/p>\n<h3>Import svc-cluster.yaml<\/h3>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/01-rancher-import-svc-cluster.png\" alt=\"01\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/02-rancher-select-yaml.png\" alt=\"02\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/03-rancher-hit-import1.png\" alt=\"03\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/04-rancher-service-discovery1.png\" alt=\"04\" \/><\/p>\n<h3>Import svc-loadbalancer.yaml<\/h3>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/05-rancher-hit-import2.png\" alt=\"05\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/06-rancher-service-discovery2.png\" alt=\"06\" \/><\/p>\n<h3>Import es-sts-deployment.yaml<\/h3>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/07-rancher-import-workload.png\" alt=\"07\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/08-rancher-hit-import3.png\" alt=\"08\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/09-rancher-workload.png\" alt=\"09\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/10-rancher-volumes.png\" alt=\"10\" \/><\/p>\n<h2>Retrieve the Load Balancer IP<\/h2>\n<p>You\u2019ll need the address of the load balancer that we deployed.
You can retrieve this via kubectl or the UI.<\/p>\n<h3>Use the CLI<\/h3>\n<pre><code>$ kubectl get svc elasticsearch-loadbalancer\nNAME                         TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE\nelasticsearch-loadbalancer   LoadBalancer   10.59.246.186   35.204.239.246   80:30604\/TCP   33m<\/code><\/pre>\n<h3>Use the UI<\/h3>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/11-rancher-loadbalancer-IP.png\" alt=\"11\" \/><\/p>\n<h2>Test the Cluster<\/h2>\n<p>Use the address we retrieved in the previous step to query the cluster for basic information.<\/p>\n<pre><code>$ curl 35.204.239.246\n{\n  &quot;name&quot; : &quot;d7bDQcH&quot;,\n  &quot;cluster_name&quot; : &quot;my-elastic-cluster&quot;,\n  &quot;cluster_uuid&quot; : &quot;e3JVAkPQTCWxg2vA3Xywgg&quot;,\n  &quot;version&quot; : {\n    &quot;number&quot; : &quot;6.5.0&quot;,\n    &quot;build_flavor&quot; : &quot;default&quot;,\n    &quot;build_type&quot; : &quot;tar&quot;,\n    &quot;build_hash&quot; : &quot;816e6f6&quot;,\n    &quot;build_date&quot; : &quot;2018-11-09T18:58:36.352602Z&quot;,\n    &quot;build_snapshot&quot; : false,\n    &quot;lucene_version&quot; : &quot;7.5.0&quot;,\n    &quot;minimum_wire_compatibility_version&quot; : &quot;5.6.0&quot;,\n    &quot;minimum_index_compatibility_version&quot; : &quot;5.0.0&quot;\n  },\n  &quot;tagline&quot; : &quot;You Know, for Search&quot;\n}<\/code><\/pre>\n<p>Query the cluster for information about its nodes.
The asterisk in the master column highlights the current master node.<\/p>\n<pre><code>$ curl '35.204.239.246\/_cat\/nodes?v'\nip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name\n10.56.2.8           24          97   5    0.05    0.12     0.13 mdi       -      d7bDQcH\n10.56.0.6           28          96   4    0.01    0.05     0.04 mdi       *      WEOeEqC<\/code><\/pre>\n<p>Check the available indices:<\/p>\n<pre><code>$ curl '35.204.239.246\/_cat\/indices?v'\nhealth status index uuid pri rep docs.count docs.deleted store.size pri.store.size<\/code><\/pre>\n<p>Because this is a fresh install, it doesn\u2019t have any indices or data. To continue this tutorial, we\u2019ll load some sample data that we can use later. The files that we\u2019ll use are available from the <a href=\"https:\/\/www.elastic.co\/guide\/en\/kibana\/current\/tutorial-load-dataset.html\">Elastic website<\/a>. Download them and then load them with the following commands:<\/p>\n<pre><code>$ curl -H 'Content-Type: application\/x-ndjson' -XPOST \\\n    'http:\/\/35.204.239.246\/shakespeare\/doc\/_bulk?pretty' --data-binary @shakespeare_6.0.json\n$ curl -H 'Content-Type: application\/x-ndjson' -XPOST \\\n    'http:\/\/35.204.239.246\/bank\/account\/_bulk?pretty' --data-binary @accounts.json\n$ curl -H 'Content-Type: application\/x-ndjson' -XPOST \\\n    'http:\/\/35.204.239.246\/_bulk?pretty' --data-binary @logs.json<\/code><\/pre>\n<p>When we recheck the indices, we see that we have five new indices with data.<\/p>\n<pre><code>$ curl '35.204.239.246\/_cat\/indices?v'\nhealth status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size\ngreen  open   logstash-2015.05.20 MFdWJxnsTISH0Z9Vr0aT3g   5   1       4750            0     49.9mb         25.2mb\ngreen  open   logstash-2015.05.18 lLHV2nzvTOG9mzlpKaG9sg   5   1       4631            0     46.5mb         23.5mb\ngreen  open   logstash-2015.05.19 PqNnVUgXTyaDSfmCQZwbLQ   5   1       4624            0     48.2mb         24.2mb\ngreen  open   shakespeare         rwl3xBgmQtm8B3V7GFeTZQ   5   1     111396            0       46mb         23.1mb\ngreen  open   bank                z0wVGsbrSiG2cQwRXwaCOg   5   1       1000            0    949.2kb        474.6kb<\/code><\/pre>\n<p>Each of these contains a different type of document. For the shakespeare index, we can search for the name of a play. For the logstash-2015.05.19 index, we can query and filter data based on an IP address, and for the bank index, we can search for information about a particular account.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/12-rancher-query1.png\" alt=\"12\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/13-rancher-query2.png\" alt=\"13\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/blog\/2018\/deploying-elasticsearch\/14-rancher-query3.png\" alt=\"14\" \/><\/p>\n<h2>Conclusion<\/h2>\n<p>Elasticsearch is extremely powerful. It is both simple and complex \u2013 simple to deploy and use, and complex in the way that it interacts with its data.<\/p>\n<p>This article has shown you the basics of how to deploy it with <a href=\"https:\/\/www.rancher.com\">Rancher<\/a> and Kubernetes and how to query it via the RESTful API.<\/p>\n<p>If you wish to explore ways to use Elasticsearch in everyday situations, we encourage you to explore the other parts of the ELK stack: <a href=\"https:\/\/www.elastic.co\/products\/kibana\">Kibana<\/a>, <a href=\"https:\/\/www.elastic.co\/products\/logstash\">Logstash<\/a>, and <a href=\"https:\/\/www.elastic.co\/products\/beats\">Beats<\/a>.
These tools round out an Elasticsearch deployment and make it useful for storing, retrieving, and visualizing a broad range of data from systems and applications.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/rancher.com\/img\/bio\/calin-rus.jpg\" alt=\"Calin Rus\" width=\"100\" height=\"100\" \/><\/p>\n<p>Calin Rus<\/p>\n<p><a href=\"https:\/\/github.com\/rustudorcalin\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/rancher.com\/img\/icon-github.svg\" alt=\"github\" width=\"30\" \/><\/a><\/p>\n<p><a href=\"https:\/\/rancher.com\/blog\/2018\/2018-11-22-deploying-elasticsearch\/\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Elasticsearch is an open-source search engine based on Apache Lucene and developed by Elastic. It focuses on features like scalability, resilience, and performance, and companies all around the world, including Mozilla, Facebook, Github, Netflix, eBay, the New York Times, and others, use it every day. 
Elasticsearch is one of the most popular analytics platforms &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.appservgrid.com\/paw93\/index.php\/2019\/03\/05\/deploying-elasticsearch-within-kubernetes\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Deploying Elasticsearch Within Kubernetes&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-1403","post","type-post","status-publish","format-standard","hentry","category-kubernetes"],"_links":{"self":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/1403","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/comments?post=1403"}],"version-history":[{"count":1,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/1403\/revisions"}],"predecessor-version":[{"id":1477,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/1403\/revisions\/1477"}],"wp:attachment":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/media?parent=1403"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/categories?post=1403"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/tags?post=1403"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}