Deploying Elasticsearch Within Kubernetes

Introduction

Elasticsearch is an open-source search engine based on Apache Lucene and developed by Elastic. It focuses on scalability, resilience, and performance, and companies all around the world, including Mozilla, Facebook, GitHub, Netflix, eBay, the New York Times, and others, use it every day. Elasticsearch is one of the most popular analytics platforms for large datasets and is present almost everywhere that you find a search engine. It takes a document-oriented approach to manipulating data, and it can index and search that data in near real time. It stores data as JSON and organizes it by index and type.

If we draw analogies between the components of a traditional relational database and those of Elasticsearch, they look like this:

  • Database or Table -> Index
  • Row/Column -> Document with properties

Elasticsearch Advantages

  • It builds on Apache Lucene, which provides some of the most robust full-text search capabilities of any open-source product.
  • It uses a document-oriented architecture to store complex real-world entities as structured JSON documents. By default, it indexes all fields, which provides tremendous performance when searching.
  • It doesn’t require a predefined schema for its indices. Documents add new fields simply by including them, which gives you the freedom to add, remove, or change relevant fields without the downtime associated with a traditional database schema migration.
  • It performs linguistic searches against documents, returning those that match the search condition. It scores the results using the TF-IDF algorithm, ranking more relevant documents higher in the list of results.
  • It allows fuzzy searching, which helps find results even with misspelled search terms.
  • It supports real-time search autocompletion, returning results while the user types their search query.
  • It uses a RESTful API, exposing its power via a simple, lightweight interface.
  • Elasticsearch executes complex queries with tremendous speed. It also caches queries, returning cached results for other requests that match a cached filter.
  • It scales horizontally, making it possible to extend resources and balance the load between cluster nodes.
  • It breaks indices into shards, and each shard has any number of replicas. Each node knows the location of every document in the cluster and routes requests internally as necessary to retrieve the data.
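To illustrate the last point, the number of shards and replicas can be set per index at creation time. A minimal sketch, assuming an Elasticsearch 6.x endpoint is reachable at localhost:9200 (the index name my-index is purely illustrative):

```shell
# Create an index with 5 primary shards, each with 1 replica
curl -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/my-index' -d '
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}'

# See how the shards and replicas are distributed across nodes
curl 'http://localhost:9200/_cat/shards/my-index?v'
```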

Terminology

Elasticsearch uses specific terms to define its components.

  • Cluster: A collection of nodes that work together.
  • Node: A single server that acts as part of the cluster, stores the data, and participates in the cluster’s indexing and search capabilities.
  • Index: A collection of documents with similar characteristics.
  • Document: The basic unit of information that can be indexed.
  • Shards: Pieces of an index. Indices are divided into multiple shards, which allows an index to scale horizontally.
  • Replicas: Copies of index shards.

Prerequisites

To perform this demo, you need one of the following:

  • An existing Rancher deployment and Kubernetes cluster, or
  • Two nodes in which to deploy Rancher and Kubernetes, or
  • A node in which to deploy Rancher and a Kubernetes cluster running in a hosted provider such as GKE.

This article uses the Google Cloud Platform, but you may use any other provider or infrastructure.

Launch Rancher

If you don’t already have a Rancher deployment, begin by launching one. The quick start guide covers the steps for doing so.

Launch a Cluster

Use Rancher to set up and configure your cluster according to the guide most suited to your environment.

Deploy Elasticsearch

If you are already comfortable with kubectl, you can apply the manifests directly. If you prefer to use the Rancher user interface, scroll down for those instructions.

We will deploy Elasticsearch as a StatefulSet with two Services: a headless service for communicating with the pods and another for interacting with Elasticsearch from outside of the Kubernetes cluster.

svc-cluster.yaml

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-cluster
spec:
  clusterIP: None
  selector:
    app: es-cluster
  ports:
    - name: transport
      port: 9300

$ kubectl apply -f svc-cluster.yaml
service/elasticsearch-cluster created

svc-loadbalancer.yaml

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-loadbalancer
spec:
  selector:
    app: es-cluster
  ports:
    - name: http
      port: 80
      targetPort: 9200
  type: LoadBalancer

$ kubectl apply -f svc-loadbalancer.yaml
service/elasticsearch-loadbalancer created

es-sts-deployment.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: es-config
data:
  elasticsearch.yml: |
    cluster.name: my-elastic-cluster
    network.host: "0.0.0.0"
    bootstrap.memory_lock: false
    discovery.zen.ping.unicast.hosts: elasticsearch-cluster
    discovery.zen.minimum_master_nodes: 1
    xpack.security.enabled: false
    xpack.monitoring.enabled: false
  ES_JAVA_OPTS: -Xms512m -Xmx512m
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: esnode
spec:
  serviceName: elasticsearch
  replicas: 2
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: es-cluster
    spec:
      securityContext:
        fsGroup: 1000
      initContainers:
        - name: init-sysctl
          image: busybox
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
      containers:
        - name: elasticsearch
          resources:
            requests:
              memory: 1Gi
          securityContext:
            privileged: true
            runAsUser: 1000
            capabilities:
              add:
                - IPC_LOCK
                - SYS_RESOURCE
          image: docker.elastic.co/elasticsearch/elasticsearch:6.5.0
          env:
            - name: ES_JAVA_OPTS
              valueFrom:
                configMapKeyRef:
                  name: es-config
                  key: ES_JAVA_OPTS
          readinessProbe:
            httpGet:
              scheme: HTTP
              path: /_cluster/health?local=true
              port: 9200
            initialDelaySeconds: 5
          ports:
            - containerPort: 9200
              name: es-http
            - containerPort: 9300
              name: es-transport
          volumeMounts:
            - name: es-data
              mountPath: /usr/share/elasticsearch/data
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              subPath: elasticsearch.yml
      volumes:
        - name: elasticsearch-config
          configMap:
            name: es-config
            items:
              - key: elasticsearch.yml
                path: elasticsearch.yml
  volumeClaimTemplates:
    - metadata:
        name: es-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 5Gi

$ kubectl apply -f es-sts-deployment.yaml
configmap/es-config created
statefulset.apps/esnode created
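Before moving on, you can watch the StatefulSet come up. A quick sketch; the pod names follow the StatefulSet naming convention (esnode-0, esnode-1):

```shell
# Watch the pods start; each must pass its readiness probe before Ready
kubectl get pods -l app=es-cluster -w

# Or block until the rollout finishes
kubectl rollout status statefulset/esnode
```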

Deploy Elasticsearch via the Rancher UI

If you prefer, import each of the manifests above into your cluster via the Rancher UI. The screenshots below show the process for each of them.

Import svc-cluster.yaml


Import svc-loadbalancer.yaml


Import es-sts-deployment.yaml


Retrieve the Load Balancer IP

You’ll need the address of the load balancer that we deployed. You can retrieve this via kubectl or the UI.

Use the CLI

$ kubectl get svc elasticsearch-loadbalancer
NAME                         TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE
elasticsearch-loadbalancer   LoadBalancer   10.59.246.186   35.204.239.246   80:30604/TCP   33m
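If you plan to script the steps that follow, you can capture the external IP directly with a jsonpath query (a small convenience sketch; the address will differ in your environment):

```shell
# Store the load balancer's external IP in a variable for later use
ES_IP=$(kubectl get svc elasticsearch-loadbalancer \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "$ES_IP"
```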

Use the UI


Test the Cluster

Use the address we retrieved in the previous step to query the cluster for basic information.

$ curl 35.204.239.246
{
  "name" : "d7bDQcH",
  "cluster_name" : "my-elastic-cluster",
  "cluster_uuid" : "e3JVAkPQTCWxg2vA3Xywgg",
  "version" : {
    "number" : "6.5.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "816e6f6",
    "build_date" : "2018-11-09T18:58:36.352602Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Query the cluster for information about its nodes. The asterisk in the master column highlights the current master node.

$ curl 35.204.239.246/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.56.2.8 24 97 5 0.05 0.12 0.13 mdi – d7bDQcH
10.56.0.6 28 96 4 0.01 0.05 0.04 mdi * WEOeEqC

Check the available indices:

$ curl 35.204.239.246/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size

Because this is a fresh install, it doesn’t have any indices or data. To continue this tutorial, we’ll inject some sample data that we can use later. The files that we’ll use are available from the Elastic website. Download them and then load them with the following commands:

$ curl -H 'Content-Type: application/x-ndjson' -XPOST \
  'http://35.204.239.246/shakespeare/doc/_bulk?pretty' --data-binary @shakespeare_6.0.json
$ curl -H 'Content-Type: application/x-ndjson' -XPOST \
  'http://35.204.239.246/bank/account/_bulk?pretty' --data-binary @accounts.json
$ curl -H 'Content-Type: application/x-ndjson' -XPOST \
  'http://35.204.239.246/_bulk?pretty' --data-binary @logs.json

When we recheck the indices, we see that we have five new indices with data.

$ curl 35.204.239.246/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open logstash-2015.05.20 MFdWJxnsTISH0Z9Vr0aT3g 5 1 4750 0 49.9mb 25.2mb
green open logstash-2015.05.18 lLHV2nzvTOG9mzlpKaG9sg 5 1 4631 0 46.5mb 23.5mb
green open logstash-2015.05.19 PqNnVUgXTyaDSfmCQZwbLQ 5 1 4624 0 48.2mb 24.2mb
green open shakespeare rwl3xBgmQtm8B3V7GFeTZQ 5 1 111396 0 46mb 23.1mb
green open bank z0wVGsbrSiG2cQwRXwaCOg 5 1 1000 0 949.2kb 474.6kb

Each of these contains a different type of document. For the shakespeare index, we can search for the name of a play. For the logstash-2015.05.19 index we can query and filter data based on an IP address, and for the bank index we can search for information about a particular account.
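The following queries sketch what each of those searches looks like with the standard query DSL. The field names (play_name, clientip, account_number) match the Elastic sample datasets at the time of writing; substitute your own load balancer address:

```shell
# Find documents from a particular play in the shakespeare index
curl -H 'Content-Type: application/json' \
  'http://35.204.239.246/shakespeare/_search?pretty' -d '
{ "query": { "match": { "play_name": "Hamlet" } } }'

# Filter log entries by client IP in the logstash index
curl -H 'Content-Type: application/json' \
  'http://35.204.239.246/logstash-2015.05.19/_search?pretty' -d '
{ "query": { "match": { "clientip": "190.154.27.54" } } }'

# Look up a specific account in the bank index
curl -H 'Content-Type: application/json' \
  'http://35.204.239.246/bank/_search?pretty' -d '
{ "query": { "match": { "account_number": 99 } } }'
```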


Conclusion

Elasticsearch is extremely powerful. It is both simple and complex: simple to deploy and use, and complex in the way that it interacts with its data.

This article has shown you the basics of how to deploy it with Rancher and Kubernetes and how to query it via the RESTful API.

If you wish to explore ways to use Elasticsearch in everyday situations, we encourage you to explore the other parts of the ELK stack: Kibana, Logstash, and Beats. These tools round out an Elasticsearch deployment and make it useful for storing, retrieving, and visualizing a broad range of data from systems and applications.

Calin Rus


Herd your Rancher Labs multi-cloud strategy with Artifactory

DevOps engineers have grown so reliant on the power and scalability of Kubernetes (K8s) clusters that one server platform can seldom accommodate them all. More and more enterprises now run their containerized applications in clusters across multiple platforms at once, in public clouds and on-prem servers.

That can fuel a chaotic stampede in an enterprise-class system: who has control, and which builds do you trust?

Rancher offers a solution for managing multiple K8s clusters, and an enhanced Kubernetes distribution with additional features for central control of those clusters. Rancher’s multi-cluster operations management features provide a unified experience across public and private providers, VMware clusters, and bare metal servers that run in production across your organization, with common policies for provisioning and upgrades.

Kubernetes registry enables trust

While containerized applications help provide great stability through features like immutability and declarative configuration, they don’t guarantee that the software they contain is trusted. Without full control of and visibility into the source and dependencies that go into your containers, elements you don’t want or need can sneak into your builds.

JFrog Artifactory can provide the hybrid Kubernetes registry you need that gives you full visibility into your containers. Artifactory enables trust by giving you insight into your code-to-cluster process while providing visibility into each layer of each application. Moreover, a hybrid K8s registry will help you run applications effectively and safely across all clusters in all of the infrastructure environments you use.

Installing Artifactory with Rancher

Rancher makes it easy for you to install a high-availability instance of Artifactory through its catalog of applications directly into a Kubernetes cluster that you create for Artifactory. In this way, Artifactory instances can run in any of the infrastructure types you use, either on a public cloud or an on-prem server.

To start, install the Rancher Kubernetes Engine (RKE) onto a server and set up an admin account.

Step 1: Add a Cluster

From Rancher’s UI, add a new K8s cluster in the platform where your Artifactory instance will run.

  • You can use a node template for nodes hosted by an infrastructure provider such as Google Cloud Platform (GCP), Amazon Web Services (AWS) or Azure, or set up a custom node for a local on-prem server.
  • For a cluster on a hosted service like GKE, you may need to have a service account created by your support team that provides the privileges that you need.
  • When you create the cluster, select a machine type powerful enough to support Artifactory (the recommended minimum is 2 vCPUs and 7.5 GB of RAM).
  • When you have completed your settings, provision the cluster. This may take several minutes to complete.

Step 2: Create a Project and Namespace

You can install Artifactory into the Default Rancher project that is automatically created when adding a cluster. However, it’s a good practice to create a dedicated Rancher project and namespace for Artifactory to run in.

For example, a project my-project and a namespace my-project-artifactory:

![Rancher Namespaces](https://rancher.com/img/blog/2018/Jfrog-Rancher-Namespaces.jpg)

Step 3: Create a Certificate

The NGINX server used by Artifactory requires a certificate to run.
From the main menu, select Resources > Certificates. In the resulting page, supply the Private Key and Certificate, and assign the Name as artifactory-ha-tls.

![Rancher Certificate](https://rancher.com/img/blog/2018/Jfrog-Rancher-Certificate.png)

When complete, click Save.
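If you prefer the command line, the equivalent is a Kubernetes TLS secret (a sketch; cert.pem and key.pem are placeholder file names, and my-project-artifactory is the namespace created in the previous step):

```shell
# Create a TLS secret named artifactory-ha-tls from an existing cert/key pair
kubectl -n my-project-artifactory create secret tls artifactory-ha-tls \
  --cert=cert.pem --key=key.pem
```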

Step 4: Add a ConfigMap

Artifactory will require a ConfigMap for general configuration information needed by its load balancer.

The following example ConfigMap should be used for a standard setup:

## add HA entries when ha is configured
upstream artifactory {
    server artifactory-ha-artifactory-ha-primary:8081;
    server artifactory-ha:8081;
}
## add ssl entries when https has been set in config
ssl_certificate /var/opt/jfrog/nginx/ssl/tls.crt;
ssl_certificate_key /var/opt/jfrog/nginx/ssl/tls.key;
ssl_session_cache shared:SSL:1m;
ssl_prefer_server_ciphers on;
## server configuration
server {
    listen 443 ssl;
    listen 80;
    server_name ~(?<repo>.+).jfrog.team jfrog.team;

    if ($http_x_forwarded_proto = '') {
        set $http_x_forwarded_proto $scheme;
    }
    ## Application specific logs
    ## access_log /var/log/nginx/jfrog.team-access.log timing;
    ## error_log /var/log/nginx/jfrog.team-error.log;
    rewrite ^/$ /artifactory/webapp/ redirect;
    rewrite ^/artifactory/?(/webapp)?$ /artifactory/webapp/ redirect;
    rewrite ^/(v1|v2)/(.*) /artifactory/api/docker/$repo/$1/$2;
    chunked_transfer_encoding on;
    client_max_body_size 0;
    location /artifactory/ {
        proxy_read_timeout 2400s;
        proxy_pass_header Server;
        proxy_cookie_path ~*^/.* /;
        if ( $request_uri ~ ^/artifactory/(.*)$ ) {
            proxy_pass http://artifactory/artifactory/$1;
        }
        proxy_pass http://artifactory/artifactory/;
        proxy_next_upstream http_503 non_idempotent;
        proxy_set_header X-Artifactory-Override-Base-Url $http_x_forwarded_proto://$host:$server_port/artifactory;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_set_header X-Forwarded-Proto $http_x_forwarded_proto;
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

From the main menu, select Resources > Config Maps, then click Add Config Map.

![Rancher ConfigMap](https://rancher.com/img/blog/2018/Jfrog-Rancher-ConfigMap.jpg)

  1. In the Name field, enter art-nginx-conf.
  2. In the Namespace field, enter the name of the namespace you created.
  3. In the Key field, enter artifactory-ha.conf.
  4. Copy the example ConfigMap and paste it into the Value field.
  5. Click Save.

The ConfigMap will be used when Artifactory is installed.
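Alternatively, the same ConfigMap can be created from the command line. A sketch, assuming you saved the example configuration above to a local file named artifactory-ha.conf:

```shell
# The file name becomes the key inside the ConfigMap (artifactory-ha.conf)
kubectl -n my-project-artifactory create configmap art-nginx-conf \
  --from-file=artifactory-ha.conf
```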

Step 5: Install Artifactory

Once you have a cluster, project, and namespace that Artifactory can run in, you can install it easily through Rancher’s catalog of applications.

  1. In the Rancher UI, click on Catalog Apps, then click the Launch button.

![Launch Rancher Catalog](https://rancher.com/img/blog/2018/Jfrog-Rancher-Catalog_Launch.jpg)

  2. In the catalog, find the JFrog artifactory-ha template marked “Partner”.

![Rancher JFrog Catalog Item](https://rancher.com/img/blog/2018/Jfrog-Rancher-Catalog_JFrog.jpg)

  3. Click View Details.

![Rancher JFrog Catalog Details](https://rancher.com/img/blog/2018/Jfrog-Rancher-Catalog_Install.jpg)

Scroll down to set the Configuration Options. Set the name, enable persistent storage, and set the persistent volume size to a value large enough to accommodate your expected needs.

For the Container Images, use the Default Image. In the Services and Load Balancing settings, choose the NGINX server, then assign the artifactory-ha-tls certificate and the art-nginx-conf ConfigMap that you created in the prior steps.

![Rancher JFrog Catalog Details](https://rancher.com/img/blog/2018/Jfrog-Rancher-Settings.jpg)

Set the Database Settings to enable and configure PostgreSQL.

![Rancher JFrog Database Settings](https://rancher.com/img/blog/2018/Jfrog-Rancher-Settings-Database.jpg)

![Rancher JFrog Storage Settings](https://rancher.com/img/blog/2018/Jfrog-artifactory-storage.png)

Click Launch to perform the installation.

![Rancher JFrog Install Launched](https://rancher.com/img/blog/2018/Jfrog-Rancher-Install_Launch.png)

  4. The installation will likely take several minutes to complete. When finished, the JFrog artifactory-ha app will show as Active. The URL for the Artifactory HA installation is presented as a hotlink (for example, 443/tcp, 80/tcp). Click the link to access the Artifactory HA application.

![Rancher JFrog Install Completed](https://rancher.com/img/blog/2018/Jfrog-Rancher-Install_Complete.png)

Give it a try

Rancher and Artifactory together pull many pieces that would be challenging to manage independently into a single system, bringing control and visibility to the process. Together, they help enforce uniform policies, promotion flows, and more under a set of universal managers, quelling the risk of disorder.

Rancher’s integration of Artifactory through its catalog makes it especially easy to deploy and manage a hybrid Kubernetes Registry across all of the clusters you need across your organization.

If you’re already a Rancher user, you can install Artifactory immediately through the catalog of applications.

![Rancher JFrog Activate](https://rancher.com/img/blog/2018/Jfrog-Rancher-Activate.png)

If you are new to Artifactory, you can request a set of three Artifactory Enterprise licenses for a free trial by emailing rancher-jfrog-licenses@jfrog.com.

Jainish Shah


JFrog Software Engineer


101 More Security Best Practices for Kubernetes

The CNCF recently released 9 Kubernetes Security Best Practices Everyone Must Follow, in which they outline nine basic actions that they recommend people take with their Kubernetes clusters.

Although their recommendations are a good start, the article leans heavily on GKE. For those of you who are committed to using Google’s services, GKE is a good solution. However, others want to run in Amazon, Azure, DigitalOcean, on their own infrastructure, or anywhere else they can think of, and having solutions that point to GKE doesn’t help them.

For these people, Rancher is a great open source solution.

Rancher Labs takes security seriously. Darren Shepherd, one of the company’s founders, discovered the bug that resulted in CVE-2018-1002105 in December 2018. Security isn’t an afterthought or something you remember to do after you deploy an insecure cluster. You don’t, for example, build a house, move all of your belongings into it, and then put locks on the door.

In this article, I’ll respond to each of the points raised by the CNCF and walk you through how Rancher and RKE satisfy these security recommendations by default.

Upgrade to the Latest Version

This is sound advice that doesn’t only apply to Kubernetes. Unpatched software is the most common entry point for attackers when they breach systems. When a CVE is released and proof of concept code is made publicly available, tool suites such as Metasploit quickly include the exploits in their standard kit. Anyone with the skill to copy and paste commands from the Internet can find themselves in control of your systems.

When using Rancher Kubernetes Engine (RKE), either standalone or installed by Rancher, you can choose the version of Kubernetes to install. Rancher Labs uses native upstream Kubernetes, which enables the company to respond quickly to security alerts with patched releases of the software. Because RKE runs the Kubernetes components within Docker containers, operations teams can perform zero-downtime upgrades of critical infrastructure.

I recommend that you follow Rancher Labs on Twitter to receive announcements about new releases. I also strongly recommend that you test new versions in a staging environment before upgrading, but in the event that an upgrade goes awry, Rancher makes it just as easy to roll back to a previous version.

Enable Role-Based Access Control (RBAC)

RKE installs with RBAC enabled by default. If you’re only using RKE, or any other standalone Kubernetes deployment, you’re responsible for configuring the accounts, roles, and bindings to secure your cluster.

If you’re using Rancher, it not only installs secure clusters, but it proxies all communication to those clusters through the Rancher server. Rancher plugs into a number of backend authentication providers, such as Active Directory, LDAP, SAML, GitHub, and more. When connected in this way, Rancher enables you to extend your existing corporate authentication out to all of the Kubernetes clusters under Rancher’s umbrella, no matter where they’re running.

Rancher Authentication Backends

Rancher enables roles at the global, cluster, and project level, and it makes it possible for administrators to define roles in a single place and apply them to all clusters.

This combination of RBAC-by-default and strong controls for authentication and authorization means that from the moment you deploy a cluster with Rancher or RKE, that cluster is secure.

Rancher Roles Screen

Use Namespaces to Establish Security Boundaries

Because of the special way that Kubernetes treats the default namespace, I don’t recommend that you use it. Instead, create a namespace for each of your applications, defining them as logical groups.
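With kubectl, that per-application setup is a couple of commands (the namespace and file names here are illustrative):

```shell
# One namespace per application, used as a logical group
kubectl create namespace team-frontend
kubectl create namespace team-backend

# Deploy into a namespace explicitly rather than relying on "default"
kubectl -n team-frontend apply -f frontend-deployment.yaml
```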

Rancher defines an additional layer of abstraction called a Project. A Project is a collection of namespaces onto which roles can be mapped. Users with access to one Project cannot see or interact with any workload running in another Project to which they do not have access. This effectively creates single-cluster multi-tenancy.

Rancher Projects Screen

Using Projects makes it easier for administrators to grant access to multiple namespaces within a single cluster. It minimizes duplicated configuration and reduces human error.

Separate Sensitive Workloads

This is a good suggestion, in that it presumes the question, “what happens if a workload is compromised?” Acting in advance to reduce the blast radius of a breach makes it harder for an attacker to escalate privileges, but it doesn’t make it impossible. If anything, this might buy you additional time.

Kubernetes allows you to set taints and tolerations, which control where a Pod might be deployed.
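As a brief sketch of how taints and tolerations fit together (the node name and key/value are illustrative):

```shell
# Taint a node: only Pods with a matching toleration may schedule here
kubectl taint nodes node-1 workload=sensitive:NoSchedule

# A Pod opts in by declaring the toleration in its spec:
#   tolerations:
#     - key: "workload"
#       operator: "Equal"
#       value: "sensitive"
#       effect: "NoSchedule"
```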

Rancher also lets you control scheduling of workloads through Kubernetes labels. In addition to taints and tolerations, when deploying a workload you can set the labels that a host must have, should have, or can have for a Pod to land there. You can also schedule workloads to a specific node if your environment is that static.

Rancher Node Scheduling

This suggestion states that sensitive metadata “can sometimes be stolen or misused,” but it fails to outline the conditions of when or how. The article references a disclosure from Shopify, presented at KubeCon NA on December 13, 2018. Although this piece of the article points out a GKE feature for “metadata concealment,” it’s worth noting that the service which leaked the credentials in the first place was the Google Cloud metadata API.

There is nothing that shows the same vulnerability exists with any other cloud provider.

The only place this vulnerability might exist would be in a hosted Kubernetes service such as GKE. If you deploy RKE onto bare metal or cloud compute instances, either directly or via Rancher, you’ll end up with a cluster that cannot have credentials leaked via the cloud provider’s metadata API.

If you’re using GKE, I recommend that you activate this feature to prevent any credentials from leaking via the metadata service.

I would also argue that cloud providers should never embed credentials into metadata accessible via an API. Even if this exists for convenience, it’s an unnecessary risk with unimaginable consequences.

Create and Define Cluster Network Policies

RKE clusters, deployed directly or by Rancher, use Canal by default, although you can also choose Calico or Flannel. Both Canal and Calico include support for NetworkPolicies. Rancher-deployed clusters, when using Canal as a network provider, also support ProjectNetworkPolicies. When activated, workloads can speak to other workloads within their Project, and the System project, which includes cluster-wide components such as ingress controllers, can communicate with all projects.
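At the Kubernetes level, a NetworkPolicy is a namespaced object. A minimal sketch that restricts ingress to a namespace so that only pods in the same namespace can reach its workloads (the namespace name is illustrative):

```shell
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: my-app        # illustrative namespace
spec:
  podSelector: {}          # applies to every pod in the namespace
  ingress:
    - from:
        - podSelector: {}  # only pods from this same namespace
EOF
```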

Earlier versions of Rancher enabled ProjectNetworkPolicies by default, but this created confusion for some users who weren’t aware of the extra security. To provide the best experience across the entire user base, this feature is now off by default but can be easily activated at launch time or later if you change your mind.

Canal and Project Network Isolation

Run a Cluster-wide Pod Security Policy

A Pod Security Policy (PSP) controls what capabilities and configuration Pods must have in order to run within your cluster. For example, you can block privileged mode, host networking, or containers running as root. When installing a cluster via Rancher or RKE, you choose if you want a restricted PSP enabled by default. If you choose to enable it, your cluster will immediately enforce strong limitations on the workload permissions.
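To give a feel for what a restricted policy contains, here is an abbreviated sketch of a PSP in the spirit of those defaults (not the exact policy Rancher ships):

```shell
kubectl apply -f - <<EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-sketch
spec:
  privileged: false          # no privileged containers
  hostNetwork: false         # no host networking
  runAsUser:
    rule: MustRunAsNonRoot   # containers may not run as root
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:                   # only these volume types are permitted
    - configMap
    - secret
    - emptyDir
    - persistentVolumeClaim
EOF
```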

Rancher PSP Configuration

The restricted and unrestricted PSPs are the same within RKE and Rancher, so what they activate at install is identical. Rancher allows an unlimited number of additional PSP templates, all handled at the global level. Administrators define PSPs and then apply them to every cluster that Rancher manages. This, like the RBAC configuration discussed earlier, keeps security configuration in a single place and dramatically simplifies the configuration and application of the policies.

When something is easy to do, more people will do it.

Harden Node Security

This isn’t a Kubernetes-specific suggestion, but it’s a good general policy. Anything that interacts with traffic that you don’t control, such as user traffic hitting an application running within Kubernetes, should be running on nodes with a small attack surface. Disable and uninstall unneeded services. Restrict root access via SSH and require a password for sudo. Use passphrases on SSH keys, or use 2FA, U2F keys, or a service like Krypton to bind keys to devices that your users have. These are examples of basic, standard configurations for secure systems.

Rancher requires nothing on the host beyond a supported version of Docker. RKE requires nothing but SSH access, and it will install the latest version of Docker supported by Kubernetes before continuing to install Kubernetes itself.

If you want to reduce the attack surface even more, take a look at RancherOS, a lightweight Linux operating system that runs all processes as Docker containers. The System Docker runs only the smallest number of processes necessary to provide access and run an instance of Docker in userspace for the actual workloads. Both lightweight and secure, RancherOS is what an operating system should be: secure by default.

Turn on Audit Logging

The Rancher Server runs inside of an RKE cluster, so in addition to the Kubernetes audit logging, it’s important to activate audit logging for API calls to the server itself. This log will show all activities that users execute to any cluster, including what happened, who did it, when they did it, and what cluster they did it to.

It’s also important to ship these logs off of the servers in question. Rancher connects to Splunk, Elasticsearch, Fluentd, Kafka, or any syslog endpoint, and from these you can generate dashboards and alerts for suspicious activity.

Information on enabling audit logging for the Rancher Server is available in our documentation.

For information on enabling audit logging for RKE clusters, please see the next section.

Ongoing Security

It takes more than nine changes to truly secure a Kubernetes cluster. Rancher has a hardening guide and a self assessment guide that cover more than 100 controls from the CIS Benchmark for Securing Kubernetes.

If you’re serious about security, Rancher, RKE, and RancherOS will help you stay that way.

ChangeLog:

  • 2019-01-24: added clarification around ProjectNetworkPolicies and additional images
  • 2019-01-23: added main image at top of article

Adrian Goins


Senior Solutions Architect

Adrian has been online since 1986, when he first got his hands on a 300 baud modem for his C64. He fell in love with computers and started writing software in 1988, moving into Unix and Linux and launching a career building Internet infrastructure in 1996. Fluent in languages spoken by humans and computers alike, Adrian is a champion for Rancher and Kubernetes. He is passionate about automation and efficiency, and he loves to teach anyone who wants to learn about technology. When not pushing Kubernetes to its limits, you’ll find him flying drones or working on his farm in the Chilean central valley.


Introduction to Kubernetes Namespaces

Introduction

Kubernetes clusters can manage large numbers of unrelated workloads concurrently, and organizations often choose to deploy projects created by separate teams to shared clusters. Even with relatively light use, the number of deployed objects can quickly become unmanageable, slowing down operational responsiveness and increasing the chance of dangerous mistakes.

Kubernetes uses a concept called namespaces to help address the complexity of organizing objects within a cluster. Namespaces allow you to group objects together so you can filter and control them as a unit. Whether applying customized access control policies or separating all of the components for a test environment, namespaces are a powerful and flexible concept for handling objects as a group.

In this article, we’ll discuss how namespaces work, introduce a few common use cases, and cover how to use namespaces to manage your Kubernetes objects. Towards the end, we’ll also take a look at a Rancher feature called projects that builds on and extends the namespaces concept.

What are Namespaces and Why Are They Important?

Namespaces are the organizational mechanism that Kubernetes provides to categorize, filter by, and manage arbitrary groups of objects within a cluster. Each workload object added to a Kubernetes cluster must be placed within exactly one namespace.

Namespaces impart a scope for object names within a cluster. While names must be unique within a namespace, the same name can be used in different namespaces. This can have some important practical benefits for certain scenarios. For example, if you use namespaces to segment application life cycle environments — like development, staging, and production — you can maintain copies of the same objects, with the same names, in each environment.

Namespaces also allow you to easily apply policies to specific slices of your cluster. You can control resource usage by defining ResourceQuota objects, which set limits on consumption on a per-namespace basis. Similarly, when using a CNI (container network interface) that supports network policies on your cluster, like Calico or Canal (Calico for policy with flannel for networking), you can apply a NetworkPolicy to the namespace with rules that dictate how pods can communicate with one another. Different namespaces can be given different policies.
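
As a concrete sketch of the first policy type, a minimal ResourceQuota manifest might look like the following. The namespace name and the specific limits here are illustrative, not taken from the article:

```
# resource-quota.yml (illustrative values)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota        # hypothetical name
  namespace: dev-team-a   # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
```

Once applied, pods in that namespace whose combined requests or limits would exceed these values are rejected at admission time.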

One of the greatest benefits of using namespaces is being able to take advantage of Kubernetes RBAC (role-based access control). RBAC allows you to develop roles, which group a list of permissions or abilities, under a single name. ClusterRole objects exist to define cluster-wide usage patterns, while the Role object type is applied to a specific namespace, giving greater control and granularity. Once a Role is created, a RoleBinding can grant the defined capabilities to a specific user or group of users within the context of a single namespace. In this way, namespaces let cluster operators map the same policies to organized sets of resources.
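
To make the Role and RoleBinding relationship concrete, here is a hedged sketch of a namespace-scoped read-only role. The namespace, role name, and user are all hypothetical placeholders:

```
# pod-reader-rbac.yml (illustrative names)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev-team-a   # hypothetical namespace
  name: pod-reader        # hypothetical role name
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev-team-a
subjects:
- kind: User
  name: jane              # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

The Role grants the permissions, and the RoleBinding attaches them to a user within that single namespace only.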

Common Namespace Usage Patterns

Namespaces are an incredibly flexible feature that doesn’t impose a specific structure or organizational pattern. That being said, there are some common patterns that many teams find useful.

Mapping Namespaces to Teams or Projects

One convention to use when setting up namespaces is to create one for each discrete project or team. This melds well with many of the namespace characteristics we mentioned earlier.

By giving a team a dedicated namespace, you can allow self-management and autonomy by delegating certain responsibilities with RBAC policies. Adding and removing members from the namespace’s RoleBinding objects is a simple way to control access to the team’s resources. It is also often useful to set resource quotas for teams and projects. This way, you can ensure equitable access to resources based on the organization’s business requirements and priorities.

Using Namespaces to Partition Life Cycle Environments

Namespaces are well suited for carving out development, staging, and production environments within a cluster. While it is recommended to deploy production workloads to an entirely separate cluster to ensure maximum isolation, for smaller teams and projects, namespaces can be a workable solution.

As with the previous use case, network policies, RBAC policies, and quotas are big factors in why this can be successful. The ability to isolate the network to control communication to your components is a fundamental requirement when managing environments. Likewise, namespace-scoped RBAC policies allow operators to set strict permissions for production environments. Quotas help you guarantee access to important resources for your most sensitive environments.

The ability to reuse object names is also helpful here. Objects can be rolled up to new environments as they are tested and released while retaining their original name. This helps avoid confusion around which objects are analogous across environments and reduces cognitive overhead.

Using Namespaces to Isolate Different Consumers

Another use case that namespaces can help with is segmenting workloads by their intended consumers. For instance, if your cluster provides infrastructure for multiple customers, segmenting by namespace allows you to manage each independently while keeping track of usage for billing purposes.

Once again, namespace features allow you to control network and access policies and define quotas for your consumers. In cases where the offering is fairly generic, namespaces allow you to develop and deploy a different instance of the same templated environment for each of your users. This consistency can make management and troubleshooting significantly easier.

Understanding the Preconfigured Kubernetes Namespaces

Before we take a look at how to create your own namespaces, let’s discuss what Kubernetes sets up automatically. By default, three namespaces are available on new clusters:

  • default: Adding an object to a cluster without providing a namespace will place it within the default namespace. This namespace acts as the main target for new user-added resources until alternative namespaces are established. It cannot be deleted.
  • kube-public: The kube-public namespace is intended to be globally readable to all users with or without authentication. This is useful for exposing any cluster information necessary to bootstrap components. It is primarily managed by Kubernetes itself.
  • kube-system: The kube-system namespace is used for Kubernetes components managed by Kubernetes. As a general rule, avoid adding normal workloads to this namespace. It is intended to be managed directly by the system and as such, it has fairly permissive policies.

While these namespaces effectively segregate user workloads from the system-managed workloads, they do not impose any additional structure to help categorize and manage applications. Thankfully, creating and using additional namespaces is very straightforward.

Working with Namespaces

Managing namespaces and the resources they contain is fairly straightforward with kubectl. In this section we will demonstrate some of the most common namespace operations so you can start effectively segmenting your resources.

Viewing Existing Namespaces

To display all namespaces available on a cluster, use the kubectl get namespaces command:

kubectl get namespaces

NAME STATUS AGE
default Active 41d
kube-public Active 41d
kube-system Active 41d

The command will show all available namespaces, whether they are currently active, and the resource’s age.

To get more information about a specific namespace, use the kubectl describe command:

kubectl describe namespace default

Name: default
Labels: field.cattle.io/projectId=p-cmn9g
Annotations: cattle.io/status={"Conditions":[{"Type":"ResourceQuotaInit","Status":"True","Message":"","LastUpdateTime":"2018-12-17T23:17:48Z"},{"Type":"InitialRolesPopulated","Status":"True","Message":"","LastUpda…
field.cattle.io/projectId=c-7tf7d:p-cmn9g
lifecycle.cattle.io/create.namespace-auth=true
Status: Active

No resource quota.

No resource limits.

This command can be used to display the labels and annotations associated with the namespace, as well as any quotas or resource limits that have been applied.

Creating a Namespace

To create a new namespace from the command line, use the kubectl create namespace command. Include the name of the new namespace as the argument for the command:

kubectl create namespace demo-namespace

namespace "demo-namespace" created

You can also create namespaces by applying a manifest from a file. For instance, here is a file that defines the same namespace that we created above:

# demo-namespace.yml
apiVersion: v1
kind: Namespace
metadata:
  name: demo-namespace

Assuming the spec above is saved to a file called demo-namespace.yml, you can apply it by typing:

kubectl apply -f demo-namespace.yml

Regardless of how we created the namespace, if we check our available namespaces again, the new namespace should be listed (we use ns, a shorthand for namespaces, the second time around):

kubectl get ns

NAME STATUS AGE
default Active 41d
demo-namespace Active 2m
kube-public Active 41d
kube-system Active 41d

Our namespace is available and ready to use.

Filtering and Performing Actions by Namespace

If we deploy a workload object to the cluster without specifying a namespace, it will be added to the default namespace:

kubectl create deployment --image nginx demo-nginx

deployment.extensions "demo-nginx" created

We can verify the deployment was created in the default namespace with kubectl describe:

kubectl describe deployment demo-nginx | grep Namespace

Namespace: default

If we try to create a deployment with the same name again, we will get an error because of the namespace collision:

kubectl create deployment --image nginx demo-nginx

Error from server (AlreadyExists): deployments.extensions "demo-nginx" already exists

To apply an action to a different namespace, we must include the --namespace= option in the command. Let’s create a deployment with the same name in the demo-namespace namespace:

kubectl create deployment --image nginx demo-nginx --namespace=demo-namespace

deployment.extensions "demo-nginx" created

This newest deployment was successful even though we’re still using the same deployment name. The namespace provided a different scope for the resource name, avoiding the naming collision we experienced earlier.

To see details about the new deployment, we need to specify the namespace with the --namespace= option again:

kubectl describe deployment demo-nginx --namespace=demo-namespace | grep Namespace

Namespace: demo-namespace

This confirms that we have created another deployment called demo-nginx within our demo-namespace namespace.

Selecting Namespace by Setting the Context

If you want to avoid providing the same namespace for each of your commands, you can change the default namespace that commands will apply to by configuring your kubectl context. This will modify the namespace that actions will apply to when that context is active.

To list your context configuration details, type:

kubectl config get-contexts

CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* Default Default Default

The above indicates that we have a single context called Default that is being used. No namespace is specified by the context, so the default namespace applies.

To change the namespace used by that context to our demo-namespace, we can type:

kubectl config set-context $(kubectl config current-context) --namespace=demo-namespace

Context "Default" modified.

We can verify that the demo-namespace is currently selected by viewing the context configuration again:

kubectl config get-contexts

CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* Default Default Default demo-namespace

Validate that our kubectl describe command now uses demo-namespace by default by asking for our demo-nginx deployment without specifying a namespace:

kubectl describe deployment demo-nginx | grep Namespace

Namespace: demo-namespace

Deleting a Namespace and Cleaning Up

If you no longer require a namespace, you can delete it.

Deleting a namespace is very powerful because it not only removes the namespace, but it also cleans up any resources deployed within it. This can be very convenient, but also incredibly dangerous if you are not careful.

It is always a good idea to list the resources associated with a namespace before deleting to verify the objects that will be removed:

kubectl get all --namespace=demo-namespace

NAME READY STATUS RESTARTS AGE
pod/demo-nginx-676fc7d85d-gkdz2 1/1 Running 0 56m

NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deployment.apps/demo-nginx 1 1 1 1 56m

NAME DESIRED CURRENT READY AGE
replicaset.apps/demo-nginx-676fc7d85d 1 1 1 56m

Once we are comfortable with the scope of the action, we can delete the demo-namespace namespace and all of the resources within it by typing:

kubectl delete namespace demo-namespace

The namespace and its resources will be removed from the cluster:

kubectl get namespaces

NAME STATUS AGE
default Active 41d
kube-public Active 41d
kube-system Active 41d

If you previously changed the selected namespace in your kubectl context, you can clear the namespace selection by typing:

kubectl config set-context $(kubectl config current-context) --namespace=

Context "Default" modified.

While cleaning up demo resources, remember to remove the original demo-nginx deployment we initially provisioned to the default namespace:

kubectl delete deployment demo-nginx

Your cluster should now be in the state you began with.

Extending Namespaces with Rancher Projects

If you are using Rancher to manage your Kubernetes clusters, you have access to the extended functionality provided by the projects feature. Rancher projects are an additional organizational layer used to bundle multiple namespaces together.

Rancher projects overlay a control structure on top of namespaces that allow you to group namespaces into logical units and apply policy to them. Projects mirror namespaces in most ways, but act as a container for namespaces instead of for individual workload resources. Each namespace in Rancher exists in exactly one project and namespaces inherit all of the policies applied to the project.

By default, Rancher clusters define two projects:

  • Default: This project contains the default namespace.
  • System: This project contains all of the other preconfigured namespaces, including kube-public, kube-system, and any namespaces provisioned by the system.

You can see the projects available within your cluster by visiting the Projects/Namespaces tab after selecting your cluster:

Fig. 1: Rancher projects view


From here, you can add projects by clicking on the Add Project button. When creating a project, you can configure the project members and their access rights and can configure security policies and resource quotas.

You can add a namespace to an existing project by clicking the project’s Add Namespace button. To move a namespace to a different project, select the namespace and then click the Move button. Moving a namespace to a new project immediately modifies the permissions and policies applied to the namespace.

Rather than introducing new organizational models, Rancher projects simply apply the same abstractions to namespaces that namespaces apply to workload objects. They fill in some usability gaps if you appreciate namespaces functionality but need an additional layer of control.

Conclusion

In this article, we introduced the concept of Kubernetes namespaces and how they can help organize cluster resources. We discussed how namespaces segment and scope resource names within a cluster and how policies applied at the namespace level can influence user permissions and resource allotment.

Afterwards, we covered some common patterns that teams employ to segment their clusters into logical pieces and we described Kubernetes’ preconfigured namespaces and their purpose. Then we took a look at how to create and work with namespaces within a cluster. We ended by taking a look at Rancher projects and how they extend the namespaces concept by grouping namespaces themselves.

Namespaces are an incredibly straightforward concept that help teams organize cluster resources and compartmentalize complexity. Taking a few minutes to get familiar with their benefits and characteristics can help you configure your clusters effectively and avoid trouble down the road.

Justin Ellingwood


Rancher Content Manager


What Is Etcd and How Do You Set Up an Etcd Cluster?

Introduction

Etcd is an open-source distributed key-value store created by the CoreOS team and now managed by the Cloud Native Computing Foundation. It is pronounced “et-cee-dee”, a reference to distributing the Unix “/etc” directory, where most global configuration files live, across multiple machines. It serves as the backbone of many distributed systems, providing a reliable way to store data across a cluster of servers. It works on a variety of operating systems, including Linux, BSD, and OS X.

Etcd has the following properties:

  • Fully Replicated: The entire store is available on every node in the cluster
  • Highly Available: Etcd is designed to avoid single points of failure in case of hardware or network issues
  • Consistent: Every read returns the most recent write across multiple hosts
  • Simple: Includes a well-defined, user-facing API (gRPC)
  • Secure: Implements automatic TLS with optional client certificate authentication
  • Fast: Benchmarked at 10,000 writes per second
  • Reliable: The store is properly distributed using the Raft algorithm

How Does Etcd Work?

To understand how Etcd works, it is important to define three key concepts: leaders, elections, and terms. In a Raft-based system, the cluster holds an election to choose a leader for a given term.

Leaders handle all client requests which need cluster consensus. Requests not requiring consensus, like reads, can be processed by any cluster member. Leaders are responsible for accepting new changes, replicating the information to follower nodes, and then committing the changes once the followers verify receipt. Each cluster can only have one leader at any given time.

If a leader dies, or is no longer responsive, the rest of the nodes will begin a new election after a predetermined timeout to select a new leader. Each node maintains a randomized election timer that represents the amount of time the node will wait before calling for a new election and selecting itself as a candidate.

If the node does not hear from the leader before a timeout occurs, the node begins a new election by starting a new term, marking itself as a candidate, and asking for votes from the other nodes. Each node votes for the first candidate that requests its vote. If a candidate receives a vote from the majority of the nodes in the cluster, it becomes the new leader. Since the election timeout differs on each node, the first candidate often becomes the new leader. However, if multiple candidates exist and receive the same number of votes, the existing election term will end without a leader and a new term will begin with new randomized election timers.
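
The election mechanics described above can be sketched in a few lines of Python. This is a toy model for illustration only, not Etcd’s actual implementation: it assumes the fastest timer always collects every vote, which glosses over the split-vote case the paragraph describes:

```python
import random

def elect_leader(node_ids, seed=None):
    """Toy model of a Raft-style election. Each node draws a randomized
    election timeout; the node whose timer expires first becomes a
    candidate, and in this simplified sketch every other node grants it
    its vote, so the candidate wins if it reaches a majority."""
    rng = random.Random(seed)
    # Randomized timeouts (here 150-300 ms, a common range) stagger the
    # candidates, which is what usually prevents split votes in practice.
    timeouts = {node: rng.uniform(150, 300) for node in node_ids}
    candidate = min(timeouts, key=timeouts.get)
    votes = len(node_ids)  # self-vote plus every other node, in this sketch
    majority = len(node_ids) // 2 + 1
    return candidate if votes >= majority else None

print(elect_leader(["etcd-0", "etcd-1", "etcd-2"], seed=7))
```

In a real cluster, competing candidates can split the vote, in which case the term ends without a leader and new randomized timers decide the next round.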

As mentioned above, any changes must be directed to the leader node. Rather than accepting and committing the change immediately, Etcd uses the Raft algorithm to ensure that the majority of nodes all agree on the change. The leader sends the proposed new value to each node in the cluster. The nodes then send a message confirming receipt of the new value. If the majority of nodes confirm receipt, the leader commits the new value and messages each node that the value is committed to the log. This means that each change requires a quorum from the cluster nodes in order to be committed.
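
The commit rule above reduces to simple majority arithmetic, which can be sketched as:

```python
def can_commit(acks: int, cluster_size: int) -> bool:
    """A proposed value is committed only once a majority (quorum) of
    cluster members have acknowledged it. The leader's own write counts
    as one of the acknowledgements."""
    quorum = cluster_size // 2 + 1
    return acks >= quorum

# In a 3-node cluster, the leader plus one follower is enough:
print(can_commit(2, 3))   # True
print(can_commit(1, 3))   # False: no quorum, so the change is not committed
```

This is why losing a minority of nodes leaves the cluster writable, while losing a majority blocks all new commits.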

Etcd in Kubernetes

Since its adoption as part of Kubernetes in 2014, the Etcd community has grown exponentially. There are many contributing members, including CoreOS, Google, Red Hat, IBM, Cisco, Huawei, and more. Etcd is used successfully in production environments by large cloud providers such as AWS, Google Cloud Platform, and Azure.

Etcd’s job within Kubernetes is to safely store critical data for distributed systems. It’s best known as Kubernetes’ primary datastore used to store its configuration data, state, and metadata. Since Kubernetes usually runs on a cluster of several machines, it is a distributed system that requires a distributed datastore like Etcd.

Etcd makes it easy to store data across a cluster and watch for changes, allowing any node in the Kubernetes cluster to read and write data. Kubernetes uses Etcd’s watch functionality to monitor changes to either the actual or the desired state of its system. If the two states differ, Kubernetes makes changes to reconcile them. Every read by the kubectl command is retrieved from data stored in Etcd, any change made (kubectl apply) will create or update entries in Etcd, and every crash will trigger value changes in Etcd.
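
The watch-and-reconcile pattern can be illustrated with a small Python sketch. This is a simplification for intuition only; real controllers talk to the Kubernetes API server, which in turn watches Etcd:

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Compare the desired state (what was declared and stored) with the
    actual state, and return the actions needed to converge the two."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions

desired = {"web": {"replicas": 3}}
actual = {"web": {"replicas": 2}, "stale-job": {"replicas": 1}}
print(reconcile(desired, actual))  # [('update', 'web'), ('delete', 'stale-job')]
```

Kubernetes controllers run loops of exactly this shape continuously, which is why a deleted pod (as we will see later with the Etcd StatefulSet) is automatically recreated.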

Deployment and Hardware Recommendations

For testing or development purposes, Etcd can run on a laptop or a light cloud setup. However, when running Etcd clusters in production, we should take into consideration the guidelines offered by Etcd’s official documentation. The page offers a good starting point for a robust production deployment. Things to keep in mind:

  • Since Etcd writes data to disk, SSD is highly recommended
  • Always use an odd number of cluster members as quorum is needed to agree on updates to the cluster state
  • For performance reasons, clusters should usually not have more than seven nodes
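
The odd-number recommendation follows directly from quorum arithmetic: adding one node to an odd-sized cluster raises the quorum without raising fault tolerance. A quick sketch:

```python
def failures_tolerated(cluster_size: int) -> int:
    """Number of members a cluster can lose while still reaching the
    majority quorum needed to agree on updates."""
    quorum = cluster_size // 2 + 1
    return cluster_size - quorum

for size in (1, 2, 3, 4, 5, 6, 7):
    print(f"{size} members -> quorum {size // 2 + 1}, "
          f"tolerates {failures_tolerated(size)} failure(s)")
```

Note that a 4-node cluster tolerates only one failure, the same as a 3-node cluster, while requiring more hardware and more replication traffic.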

Let’s go over the steps required to deploy an Etcd cluster in Kubernetes. Afterward, we will demonstrate some basic CLI commands or API calls. We will use a combination of Kubernetes’ concepts like StatefulSets and PersistentVolumes for our deployment.

Prerequisites

To follow along with this demo, you will need the following:

  • a Google Cloud Platform account: The free tier should be more than enough. You should be able to use most other cloud providers with little modification.
  • A server to run Rancher

Starting a Rancher Instance

To begin, start a Rancher instance on your control server. There is a very intuitive getting started guide for this purpose here.

Using Rancher to Deploy a GKE Cluster

Use Rancher to set up and configure a Kubernetes cluster in your GCP account using this guide.

Install the Google Cloud SDK and the kubectl command on the same server hosting our Rancher instance. Install the SDK by following the link provided above, and install kubectl through the Rancher UI.

Make sure that the gcloud command has access to your GCP account by authenticating with gcloud init and gcloud auth login.

As soon as the cluster is deployed, check basic kubectl functionality by typing:

kubectl get nodes

NAME STATUS ROLES AGE VERSION
gke-c-ggchf-default-pool-df0bc935-31mv Ready <none> 48s v1.11.6-gke.2
gke-c-ggchf-default-pool-df0bc935-ddl5 Ready <none> 48s v1.11.6-gke.2
gke-c-ggchf-default-pool-df0bc935-qqhx Ready <none> 48s v1.11.6-gke.2

Before deploying the Etcd cluster (through kubectl or by importing YAML files in Rancher’s UI), we need to configure a few items. In GCE, the default persistent disk is pd-standard. We will configure pd-ssd for our Etcd deployment. This is not mandatory, but as per Etcd’s recommendations, SSD is a very good option. Please check this page to learn about other cloud providers’ storage classes.

Let’s check the available storage class that GCE offers. As expected, we see the default one, called standard:

kubectl get storageclass

NAME PROVISIONER AGE
standard (default) kubernetes.io/gce-pd 2m

Let’s apply this YAML file, updating the value of zone to match your preferences, so we can benefit from SSD storage:

# storage-class.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  zone: europe-west4-c # Change this value

kubectl apply -f storage-class.yaml

storageclass.storage.k8s.io/ssd created

If we check again, we can see that, along with the default standard class, ssd is now available:

kubectl get storageclass

NAME PROVISIONER AGE
ssd kubernetes.io/gce-pd 7s
standard (default) kubernetes.io/gce-pd 4m

We can now proceed with deploying the Etcd cluster. We will create a StatefulSet with three replicas, each of which has a dedicated volume with the ssd storageClass. We will also need to deploy two services, one for internal cluster communication and the other to access the cluster externally via the API.

When forming the cluster, we need to pass a few parameters to the Etcd binary. The listen-client-urls and listen-peer-urls options specify the local addresses the Etcd server uses to accept incoming connections. Specifying 0.0.0.0 as the IP address means that Etcd will listen for connections on all available interfaces. The advertise-client-urls and initial-advertise-peer-urls parameters specify the addresses clients or other Etcd members should use to contact the Etcd server.

The following YAML file defines our two services and the Etcd StatefulSet:

# etcd-sts.yaml
apiVersion: v1
kind: Service
metadata:
  name: etcd-client
spec:
  type: LoadBalancer
  ports:
  - name: etcd-client
    port: 2379
    protocol: TCP
    targetPort: 2379
  selector:
    app: etcd
---
apiVersion: v1
kind: Service
metadata:
  name: etcd
spec:
  clusterIP: None
  ports:
  - port: 2379
    name: client
  - port: 2380
    name: peer
  selector:
    app: etcd
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: etcd
  labels:
    app: etcd
spec:
  serviceName: etcd
  replicas: 3
  template:
    metadata:
      name: etcd
      labels:
        app: etcd
    spec:
      containers:
      - name: etcd
        image: quay.io/coreos/etcd:latest
        ports:
        - containerPort: 2379
          name: client
        - containerPort: 2380
          name: peer
        volumeMounts:
        - name: data
          mountPath: /var/run/etcd
        command:
        - /bin/sh
        - -c
        - |
          PEERS="etcd-0=http://etcd-0.etcd:2380,etcd-1=http://etcd-1.etcd:2380,etcd-2=http://etcd-2.etcd:2380"
          exec etcd --name ${HOSTNAME} \
            --listen-peer-urls http://0.0.0.0:2380 \
            --listen-client-urls http://0.0.0.0:2379 \
            --advertise-client-urls http://${HOSTNAME}.etcd:2379 \
            --initial-advertise-peer-urls http://${HOSTNAME}.etcd:2380 \
            --initial-cluster-token etcd-cluster-1 \
            --initial-cluster ${PEERS} \
            --initial-cluster-state new \
            --data-dir /var/run/etcd/default.etcd
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: ssd
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

Apply the YAML file by typing:

kubectl apply -f etcd-sts.yaml

service/etcd-client created
service/etcd created
statefulset.apps/etcd created

After applying the YAML file, we can see the resources it defines within the different tabs Rancher offers:

Fig. 1: Etcd StatefulSet as seen in the Rancher Workloads tab


Fig. 2: Etcd Service as seen in the Rancher Service Discovery tab


Fig. 3: Etcd volume as seen in the Rancher Volumes tab


Interacting with Etcd

There are two primary ways to interact with Etcd: using the etcdctl command or directly through the RESTful API. We will cover both of these briefly, but you can find more in-depth information and additional examples by visiting the full documentation here and here.

etcdctl is a command-line interface for interacting with an Etcd server. It can be used to perform a variety of actions such as setting, updating, or removing keys, verifying the cluster health, adding or removing Etcd nodes, and generating database snapshots. By default, etcdctl talks to the Etcd server with the v2 API for backward compatibility. If you want etcdctl to speak to Etcd using the v3 API, you must set the version to “3” via the ETCDCTL_API environment variable.

As for the API, every request sent to an Etcd server is a gRPC remote procedure call. The gRPC gateway is a RESTful proxy that translates HTTP/JSON requests into gRPC messages.

Let’s find the external IP we need to use for API calls:

kubectl get svc

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
etcd ClusterIP None <none> 2379/TCP,2380/TCP 1m
etcd-client LoadBalancer 10.15.247.17 35.204.136.231 2379:30525/TCP 1m
kubernetes ClusterIP 10.15.240.1 <none> 443/TCP 3m

We should also find the names of our three Pods so that we can use the etcdctl command:

kubectl get pods

NAME READY STATUS RESTARTS AGE
etcd-0 1/1 Running 0 6m
etcd-1 1/1 Running 0 6m
etcd-2 1/1 Running 0 6m

Let’s check the Etcd version. For this, we can use the API or CLI (both v2 and v3). The output will be slightly different depending on your chosen method.

Use this command to contact the API directly:

curl http://35.204.136.231:2379/version

{"etcdserver":"3.3.8","etcdcluster":"3.3.0"}

To check for the version with v2 of the etcdctl client, type:

kubectl exec -it etcd-0 -- etcdctl --version

etcdctl version: 3.3.8
API version: 2

To use the etcdctl with v3 of the API, type:

kubectl exec -it etcd-0 -- /bin/sh -c 'ETCDCTL_API=3 etcdctl version'

etcdctl version: 3.3.8
API version: 3.3

Next, let’s list the cluster members, just as we did above.

We can query the API with:

curl 35.204.136.231:2379/v2/members

{"members":[{"id":"2e80f96756a54ca9","name":"etcd-0","peerURLs":["http://etcd-0.etcd:2380"],"clientURLs":["http://etcd-0.etcd:2379"]},{"id":"7fd61f3f79d97779","name":"etcd-1","peerURLs":["http://etcd-1.etcd:2380"],"clientURLs":["http://etcd-1.etcd:2379"]},{"id":"b429c86e3cd4e077","name":"etcd-2","peerURLs":["http://etcd-2.etcd:2380"],"clientURLs":["http://etcd-2.etcd:2379"]}]}

With etcdctl using v2 of the API:

kubectl exec -it etcd-0 -- etcdctl member list

2e80f96756a54ca9: name=etcd-0 peerURLs=http://etcd-0.etcd:2380 clientURLs=http://etcd-0.etcd:2379 isLeader=true
7fd61f3f79d97779: name=etcd-1 peerURLs=http://etcd-1.etcd:2380 clientURLs=http://etcd-1.etcd:2379 isLeader=false
b429c86e3cd4e077: name=etcd-2 peerURLs=http://etcd-2.etcd:2380 clientURLs=http://etcd-2.etcd:2379 isLeader=false

With etcdctl using v3 of the API:

kubectl exec -it etcd-0 -- /bin/sh -c 'ETCDCTL_API=3 etcdctl member list --write-out=table'

+------------------+---------+--------+-------------------------+-------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+--------+-------------------------+-------------------------+
| 2e80f96756a54ca9 | started | etcd-0 | http://etcd-0.etcd:2380 | http://etcd-0.etcd:2379 |
| 7fd61f3f79d97779 | started | etcd-1 | http://etcd-1.etcd:2380 | http://etcd-1.etcd:2379 |
| b429c86e3cd4e077 | started | etcd-2 | http://etcd-2.etcd:2380 | http://etcd-2.etcd:2379 |
+------------------+---------+--------+-------------------------+-------------------------+

Setting and Retrieving Values in Etcd

The last example we will cover is creating a key and checking its value on all three Pods in the Etcd cluster. Then we will kill the leader, etcd-0 in our scenario, and see how a new leader is elected. Finally, once the cluster has recovered, we will verify the value of our previously created key on all members. We will see that there is no data loss: the cluster simply carries on with a different leader.

We can verify that the cluster is initially healthy by typing:

kubectl exec -it etcd-0 -- etcdctl cluster-health

member 2e80f96756a54ca9 is healthy: got healthy result from http://etcd-0.etcd:2379
member 7fd61f3f79d97779 is healthy: got healthy result from http://etcd-1.etcd:2379
member b429c86e3cd4e077 is healthy: got healthy result from http://etcd-2.etcd:2379
cluster is healthy

Next, verify the current leader by typing the following. The last field indicates that etcd-0 is the leader in our cluster:

kubectl exec -it etcd-0 -- etcdctl member list

2e80f96756a54ca9: name=etcd-0 peerURLs=http://etcd-0.etcd:2380 clientURLs=http://etcd-0.etcd:2379 isLeader=true
7fd61f3f79d97779: name=etcd-1 peerURLs=http://etcd-1.etcd:2380 clientURLs=http://etcd-1.etcd:2379 isLeader=false
b429c86e3cd4e077: name=etcd-2 peerURLs=http://etcd-2.etcd:2380 clientURLs=http://etcd-2.etcd:2379 isLeader=false

Using the API, we will create a key called message and assign it a value. Remember to substitute the IP address you retrieved for your cluster in the command below:

curl http://35.204.136.231:2379/v2/keys/message -XPUT -d value="Hello world"

{"action":"set","node":{"key":"/message","value":"Hello world","modifiedIndex":9,"createdIndex":9}}

The key will have the same value regardless of the member we query. This helps us validate that the value has been replicated to the other nodes and committed to the log:

kubectl exec -it etcd-0 -- etcdctl get message
kubectl exec -it etcd-1 -- etcdctl get message
kubectl exec -it etcd-2 -- etcdctl get message

Hello world
Hello world
Hello world

Demonstrating High Availability and Recovery

Next, we can kill the Etcd cluster leader. This will let us see how a new leader is elected and how the cluster recovers from its degraded state. Delete the pod associated with the Etcd leader you discovered above:

kubectl delete pod etcd-0

Now, let’s check the cluster health:

kubectl exec -it etcd-2 -- etcdctl cluster-health

failed to check the health of member 2e80f96756a54ca9 on http://etcd-0.etcd:2379: Get http://etcd-0.etcd:2379/health: dial tcp: lookup etcd-0.etcd on 10.15.240.10:53: no such host
member 2e80f96756a54ca9 is unreachable: [http://etcd-0.etcd:2379] are all unreachable
member 7fd61f3f79d97779 is healthy: got healthy result from http://etcd-1.etcd:2379
member b429c86e3cd4e077 is healthy: got healthy result from http://etcd-2.etcd:2379
cluster is degraded
command terminated with exit code 5

The above message indicates that the cluster is in a degraded state due to the loss of its leader node.

Once Kubernetes responds to the deleted pod by spinning up a new instance, the Etcd cluster should recover:

kubectl exec -it etcd-2 -- etcdctl cluster-health

member 2e80f96756a54ca9 is healthy: got healthy result from http://etcd-0.etcd:2379
member 7fd61f3f79d97779 is healthy: got healthy result from http://etcd-1.etcd:2379
member b429c86e3cd4e077 is healthy: got healthy result from http://etcd-2.etcd:2379
cluster is healthy

We can see that a new leader has been elected by typing:

kubectl exec -it etcd-2 -- etcdctl member list

2e80f96756a54ca9: name=etcd-0 peerURLs=http://etcd-0.etcd:2380 clientURLs=http://etcd-0.etcd:2379 isLeader=false
7fd61f3f79d97779: name=etcd-1 peerURLs=http://etcd-1.etcd:2380 clientURLs=http://etcd-1.etcd:2379 isLeader=true
b429c86e3cd4e077: name=etcd-2 peerURLs=http://etcd-2.etcd:2380 clientURLs=http://etcd-2.etcd:2379 isLeader=false

In our case, the etcd-1 node was elected as leader.

If we check the value of the message key again, we can verify that there was no data loss:

kubectl exec -it etcd-0 -- etcdctl get message
kubectl exec -it etcd-1 -- etcdctl get message
kubectl exec -it etcd-2 -- etcdctl get message

Hello world
Hello world
Hello world

Conclusion

Etcd is a powerful, highly available, and reliable distributed key-value store designed for specific use cases. Common examples are storing database connection details, cache settings, feature flags, and more. It was designed to be sequentially consistent, so that every event is stored in the same order throughout the cluster.

We saw how to get an etcd cluster up and running in Kubernetes with the help of Rancher, and we were able to try out a few basic etcd commands. To learn more about the project, how keys can be organized, how to set TTLs for keys, or how to back up all the data, the official etcd repository is a great starting point.

Calin Rus


Introducing Multi-Cluster Applications in Rancher 2.2 Preview 2

Introduction

I’m excited to announce the release of Rancher 2.2 Preview 2, which contains a number of powerful features for day two operations on Kubernetes clusters.

Please visit our release page or the release notes to learn more about all of the features we shipped today.

In this article I introduce one of the features: multi-cluster applications. Read on to learn how this will dramatically reduce your workload and increase the reliability of multi-cluster operations.

Overview

If you’ve been using Kubernetes and have two or more clusters, you are familiar with at least one of the following use cases:

  • Applications have higher fault tolerance when deployed across multiple availability zones (AZs).
  • In edge computing scenarios with hundreds of clusters, the same application needs to run on multiple clusters.

In the high reliability use case, operators often mitigate the risks associated with running in a single AZ by pulling nodes from multiple AZs into a single cluster. The problem with this approach is that even though it resists the failure of an AZ, it will not withstand a failure of the cluster itself. The likelihood of a cluster failure is higher than that of an AZ failure, and if the cluster fails, it might affect the applications running within it.

An alternate approach is to run a separate cluster in each AZ and run a copy of the application on each cluster. This process treats each Kubernetes cluster as its own availability zone, but manually maintaining applications on each cluster is both time consuming and prone to error.

The edge computing use case suffers from the same issue as the multi-AZ cluster: maintenance of the application is either manual, time-consuming, and prone to error, or else the operations team has created sophisticated scripts to handle deployments and upgrades. This additional process overhead moves the point of failure to a different location in the workflow. These scripts require support and maintenance, and they introduce a dependency on the person or people with the knowledge to not only codify the process, but also to understand and manually execute the process if the scripts fail.

Beginning with Rancher 2.2 Preview 2, available from the 2.2.0-alpha6 release tag and up, Rancher will simultaneously deploy and upgrade copies of the same application on any number of Kubernetes clusters.

This feature extends our powerful Application Catalog, built on top of the rock-solid Helm package manager for Kubernetes. Before today the Application Catalog features only applied to individual clusters. We’ve added an additional section at the Global level where those with the correct privileges can deploy apps to any cluster managed by Rancher.

For a full demonstration of this feature and other features released with Rancher 2.2 Preview 2, join us for the upcoming online meetup, where we’ll give a live demo of the features and answer any questions you have.

Read on for a quick introduction to how multi-cluster applications work in Rancher.

Feature Quick Start

  • When you log in to Rancher, you will see a list of all the clusters it manages. You will also see a new global menu item – Multi-Cluster Apps.

Rancher Cluster View

  • After clicking Multi-Cluster Apps, you will see two buttons, Manage Catalogs and Launch. Manage Catalogs takes you to the Catalogs configuration screen where you can enable the main Helm repositories or add additional third-party Helm repositories.
  • Click the Launch button to launch a new application.

Multi-Cluster App Launch Screen

  • Rancher now shows a list of all the applications you can deploy. We will select Grafana.

Multi-Cluster App Catalog

  • The next screen asks for configuration details. Choose the settings and the corresponding values using either the form or direct YAML input. The settings you choose here will be common across all the clusters to which Rancher will deploy this application.

Grafana App Launch Screen

  • Under Configuration Options select the target clusters within the Target Projects dropdown. This list not only shows you the available clusters, but it also asks that you choose a specific Project within the destination cluster.

Choosing a Target Cluster

  • Choose an upgrade strategy. For our demo, we will choose “Rolling Update” and provide a batch size of 1 and an interval of 20 seconds. When we do an upgrade in the future, this setting assures that Rancher will update the clusters one at a time, with a delay of 20 seconds between each.

Choosing an Upgrade Strategy
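The effect of that strategy can be sketched as simple batching (a hypothetical helper, not Rancher's actual implementation):

```python
def upgrade_batches(clusters, batch_size):
    """Split target clusters into the sequential batches a rolling update walks through."""
    return [clusters[i:i + batch_size] for i in range(0, len(clusters), batch_size)]

# With a batch size of 1, each cluster is upgraded on its own,
# with the configured interval (e.g. 20 seconds) between batches.
batches = upgrade_batches(["cluster-a", "cluster-b", "cluster-c"], batch_size=1)
print(batches)  # [['cluster-a'], ['cluster-b'], ['cluster-c']]
```

A larger batch size trades safety for speed: more clusters upgrade in parallel, but a bad release affects more of them before you can react.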

  • If you need to make adjustments to account for the differences between each cluster, you can do so in the Answer Overrides section.

Setting Unique Answers for Clusters

  • When you’re ready, click Launch at the bottom of the page. You will then see a page that shows all of the deployed multi-cluster apps for your installation. Each will show its current status and a list of the target clusters and projects.

Multi-Cluster Apps In Production

  • When an upgrade is available for the application, Rancher will show Upgrade Available as the application status.
  • To initiate an upgrade, click on the hamburger menu (the menu with the three dots) in the Grafana box and choose Upgrade.

Upgrading a Multi-Cluster App

  • Verify that the “Rolling Update” option is selected.
  • Change some settings and click on the Upgrade button at the bottom of the page.

If you navigate to the Workloads tab of the target clusters, you’ll see one of them change its status to Updating. This cluster will update, then Rancher will pause for 20 seconds (or for the interval you chose) before continuing to the next cluster and updating its copy of the application.

Conclusion

Multi-cluster applications will reduce the workload of operations teams and make it possible to deploy and upgrade applications quickly and reliably across all clusters.

To test these features in your lab or development environment, install the latest alpha release. If you have any feedback, please open an issue on Github, or join us in the forums or on Slack.

Ankur Agarwal

Head of Product Management

Ankur joined Rancher from Qubeship.io, a CA Accelerator that he founded. He has led Product Management at VMware, Mercury Interactive and Oracle. In his spare time, he volunteers at his daughter’s elementary school, helping kids code.


Kubernetes vs Docker Swarm: Comparison of Two Container Orchestration Tools

With the rise of containerization technology and increased attention from enterprises and technologists in general, more and more containerized applications are being deployed to the cloud. Moreover, research conducted by 451 Research predicts that the application container market will grow dramatically through 2020, further expanding the number of containerized applications in the cloud.

As the scale and complexity of production-critical containers rise, container orchestration tools come into the picture and become an indispensable part of enterprises' container management toolbox. Kubernetes and Docker Swarm are two well-known leaders in the container orchestration market, and each has become an essential part of many enterprises' microservices infrastructure.

Overview of Kubernetes

Kubernetes is an open-source, community-driven Container Orchestration Engine (COE) inspired by a Google project called Borg. Kubernetes is used to orchestrate fleets of containers representing instances of applications that are decoupled from the machines they run on. As the number of containers in a cluster increases to hundreds or thousands of instances, with application components deployed as separate containers, Kubernetes comes to the rescue by providing a framework for deployment, management, auto-scaling, high availability, and related tasks.

Kubernetes allows you to handle various container orchestration tasks, such as scaling containers up or down, automating failover, and distributing workloads among containers hosted on different machines.

Kubernetes follows the traditional client-server type of architecture where the Master node has the global view of the cluster and is responsible for the decision making. Users can interact with the Master node through the REST API, the web UI, and the command line interface (CLI). The Master node interacts with the Worker nodes that host the containerized applications.

Some of the common terminology used within the Kubernetes ecosystem:

  • Container: Containers are the units of packaging used to bundle application binaries together with their dependencies, configuration, framework, and libraries.
  • Pod: Pods are the deployment units in the Kubernetes ecosystem; each Pod groups one or more containers together on the same node. Containers in a Pod can work together and share resources to achieve a common goal.
  • Node: A node is the representation of a single machine in the cluster running Kubernetes applications. A node can be a physical, bare-metal machine or a virtual machine.
  • Cluster: Several nodes connect to each other to form a cluster, pooling resources that are shared by the applications deployed onto the cluster.
  • Persistent Volume: Since containers can join and leave the computing environment dynamically, local data storage can be volatile. Persistent volumes provide storage that outlives individual containers.

Overview of Docker Swarm

Docker Swarm is an alternative, Docker-native Container Orchestration Engine that coordinates container placement and management among multiple Docker Engine hosts. Docker Swarm allows you to communicate directly with the swarm instead of communicating with each Docker Engine individually. Docker Swarm architecture comprises two types of nodes, called Managers and Workers.

Below is the common terminology used in the Docker Swarm ecosystem:

  • Node: A node is a machine that runs an instance of Docker Engine.
  • Swarm: A cluster of Docker Engine instances.
  • Manager Node: Manager nodes distribute and schedule incoming tasks onto the Worker nodes and maintain the cluster state. Manager nodes can also optionally run services for Worker nodes.
  • Worker Node: Worker nodes are instances of Docker Engine responsible for running applications in containers.
  • Service: A service is the definition of a workload to run, based on a container image of a microservice, such as a web or database server.
  • Task: A task is a unit of work, carrying a container and the commands to run inside it, that is scheduled onto a Worker node.

Comparison of Kubernetes vs Docker Swarm Features

Both Kubernetes and Docker Swarm COEs have advantages and disadvantages, and the best fit will largely depend on your requirements. Below we compare a few features they share.

Cluster Setup and Configuration
  • Kubernetes: Challenging to install and set up a cluster manually. Several components, such as networking, storage, ports, and IP ranges for Pods, require proper configuration and fine-tuning. Each of these pieces requires planning, effort, and careful attention to instructions.
  • Docker Swarm: Simple to install and set up a cluster, with fewer complexities. A single set of tools is used to set up and configure the cluster.
  • Notes: Setting up and configuring a cluster with Kubernetes is more challenging and complicated, as it requires more steps that must be carefully followed. Setting up a cluster with Docker Swarm is quite simple, requiring only two commands once Docker Engine is installed.

Administration
  • Kubernetes: Provides a CLI, REST API, and dashboard to control and monitor a variety of services.
  • Docker Swarm: Provides a CLI to interact with the services.
  • Notes: Kubernetes has a large set of commands to manage a cluster, leading to a steep learning curve. However, these commands provide great flexibility, and you also have access to the dashboard GUI to manage your clusters. Docker Swarm is bound to Docker API commands and has a relatively small learning curve to start managing a cluster.

Auto-scaling
  • Kubernetes: Supports auto-scaling policies by monitoring incoming server traffic and automatically scaling up or down based on resource utilization.
  • Docker Swarm: Supports scaling up or down with commands.
  • Notes: Manually scaling containers up or down is impractical at scale, so Kubernetes is clearly the winner here.

Load Balancing
  • Kubernetes: Load balancing must be configured manually unless Pods are exposed as services.
  • Docker Swarm: Uses ingress load balancing and also assigns ports to services automatically.
  • Notes: Manual configuration of load balancing in Kubernetes is an extra step, but not a complicated one. Automatic load balancing in Docker Swarm is very flexible.

Storage
  • Kubernetes: Allows sharing storage volumes between containers within the same Pod.
  • Docker Swarm: Allows sharing data volumes with any other container on other nodes.
  • Notes: Kubernetes deletes the storage volume when the Pod is killed; Docker Swarm deletes the storage volume when the container is killed.

Market Share
  • Kubernetes: According to Google Trends, as of February 2019, worldwide web and YouTube searches for Kubernetes were at about 79% and 75% of their peak values, respectively, over the past 12 months.
  • Docker Swarm: Over the same period, worldwide web and YouTube searches for Docker Swarm were at about 5% of peak values.
  • Notes: As the Google Trends report shows, Kubernetes dominates this category, leading the far less searched-for Docker Swarm by a wide margin.

Conclusion

In summary, both Kubernetes and Docker Swarm have advantages and disadvantages.

If you require a quick setup and have simple configuration requirements, Docker Swarm may be a good option due to its simplicity and shallow learning curve.

If your application is complex and runs hundreds or thousands of containers in production, Kubernetes, with its auto-scaling capabilities and high availability policies, is almost certainly the right choice. However, its steep learning curve and longer setup and configuration time can be a bad fit for some users. With additional tooling, like Rancher, some of these administration and maintenance pain points can be mitigated, making the platform more accessible.

Faruk Caglar, PhD


Cloud Computing Researcher and Solution Architect

Faruk Caglar received his PhD from the Electrical Engineering and Computer Science Department at Vanderbilt University. He is a researcher in the fields of Cloud Computing, Big Data, Internet of Things (IoT) as well as Machine Learning and solution architect for cloud-based applications. He has published several scientific papers and has been serving as reviewer at peer-reviewed journals and conferences. He also has been providing professional consultancy in his research field.


Deploying Redis Cluster on Top of Kubernetes


Introduction

Redis (which stands for REmote DIctionary Server) is an open source, in-memory datastore, often used as a database, cache or message broker. It can store and manipulate high-level data types like lists, maps, sets, and sorted sets. Because Redis accepts keys in a wide range of formats, operations can be executed on the server, which reduces the client’s workload. It holds its database entirely in memory, only using the disk for persistence. Redis is a popular data storage solution and is used by tech giants like GitHub, Pinterest, Snapchat, Twitter, StackOverflow, Flickr, and others.

Why Use Redis?

  • It is incredibly fast. It is written in ANSI C and runs on POSIX systems such as Linux, Mac OS X, and Solaris.
  • Redis is often ranked the most popular key/value database and the most popular NoSQL database used with containers.
  • Its caching solution reduces the number of calls to a cloud database backend.
  • It can be accessed by applications through its client API library.
  • Redis is supported by all of the popular programming languages.
  • It is open source and stable.

Redis Use in the Real World

  • Some Facebook online games have a very high number of score updates. Executing these operations is trivial when using a Redis sorted set, even if there are millions of users and millions of new scores per minute.
  • Twitter stores the timeline for all users within a Redis cluster.
  • Pinterest stores the user follower graphs in a Redis cluster where data is sharded across hundreds of instances.
  • Github uses Redis as a queue.

What is Redis Cluster?

Redis Cluster is a set of Redis instances designed to scale a database by partitioning it, which also makes it more resilient. Each member in the cluster, whether a primary or a secondary replica, manages a subset of the hash slots. If a master becomes unreachable, its slave is promoted to master. In a minimal Redis Cluster made up of three master nodes, each with a single slave node (to allow minimal failover), each master is assigned a range of the hash slots between 0 and 16,383. For example, node A holds slots 0 to 5000, node B slots 5001 to 10000, and node C slots 10001 to 16383. Communication inside the cluster happens over an internal bus, using a gossip protocol to propagate information about the cluster or to discover new nodes.
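Key placement is deterministic: the slot for a key is CRC16(key) modulo 16384, and hash tags in braces pin related keys to the same slot. A small illustrative Python version of the mapping (Redis implements this in C):

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant), the checksum Redis Cluster uses for keys."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else crc << 1
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # non-empty tag: hash only the tag contents
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Keys sharing a hash tag land in the same slot, and thus on the same node,
# which makes multi-key operations on them possible.
assert key_slot("{user1000}.following") == key_slot("{user1000}.followers")
print(key_slot("foo"))  # some slot in the 0-16383 range
```

Because every client can compute the slot locally, requests can be routed to the owning node without a central coordinator.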


Deploying Redis Cluster in Kubernetes

Deploying Redis Cluster within Kubernetes has its challenges, as each Redis instance relies on a configuration file that keeps track of other cluster instances and their roles. For this we need a combination of Kubernetes StatefulSets and PersistentVolumes.

Prerequisites

To perform this demo, you need the following:

  • Rancher
  • A Google Cloud Platform or other cloud provider account. The examples below use GKE, but any other cloud provider will also work.

Starting a Rancher Instance

If you do not have an instance of Rancher, launch one with the instructions in the quickstart.

Use Rancher to Deploy a GKE Cluster

Use Rancher to set up and configure your Kubernetes cluster, following the documentation.

When the cluster is ready, we can check its status via kubectl.

$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-c-8dcng-default-pool-08c0a15c-2gpw Ready <none> 1h v1.11.2-gke.18
gke-c-8dcng-default-pool-08c0a15c-4q79 Ready <none> 1h v1.11.2-gke.18
gke-c-8dcng-default-pool-08c0a15c-g9zv Ready <none> 1h v1.11.2-gke.18

Deploy Redis

Continue to deploy Redis Cluster, either by using kubectl to apply the YAML files or by importing them into the Rancher UI. All of the YAML files that we need are listed below.

$ kubectl apply -f redis-sts.yaml
configmap/redis-cluster created
statefulset.apps/redis-cluster created

$ kubectl apply -f redis-svc.yaml
service/redis-cluster created


redis-sts.yaml


apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster
data:
  update-node.sh: |
    #!/bin/sh
    REDIS_NODES="/data/nodes.conf"
    sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${REDIS_NODES}
    exec "$@"
  redis.conf: |+
    cluster-enabled yes
    cluster-require-full-coverage no
    cluster-node-timeout 15000
    cluster-config-file /data/nodes.conf
    cluster-migration-barrier 1
    appendonly yes
    protected-mode no
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
      - name: redis
        image: redis:5.0.1-alpine
        ports:
        - containerPort: 6379
          name: client
        - containerPort: 16379
          name: gossip
        command: ["/conf/update-node.sh", "redis-server", "/conf/redis.conf"]
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        volumeMounts:
        - name: conf
          mountPath: /conf
          readOnly: false
        - name: data
          mountPath: /data
          readOnly: false
      volumes:
      - name: conf
        configMap:
          name: redis-cluster
          defaultMode: 0755
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

redis-svc.yaml


apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
spec:
  type: ClusterIP
  ports:
  - port: 6379
    targetPort: 6379
    name: client
  - port: 16379
    targetPort: 16379
    name: gossip
  selector:
    app: redis-cluster

Verify the Deployment

Check that the Redis nodes are up and running:

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
redis-cluster-0 1/1 Running 0 7m
redis-cluster-1 1/1 Running 0 7m
redis-cluster-2 1/1 Running 0 6m
redis-cluster-3 1/1 Running 0 6m
redis-cluster-4 1/1 Running 0 6m
redis-cluster-5 1/1 Running 0 5m

Below are the 6 volumes that we created:

$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-ae61ad5c-f0a5-11e8-a6e0-42010aa40039 1Gi RWO Delete Bound default/data-redis-cluster-0 standard 7m
pvc-b74b6ef1-f0a5-11e8-a6e0-42010aa40039 1Gi RWO Delete Bound default/data-redis-cluster-1 standard 7m
pvc-c4f9b982-f0a5-11e8-a6e0-42010aa40039 1Gi RWO Delete Bound default/data-redis-cluster-2 standard 6m
pvc-cd7af12d-f0a5-11e8-a6e0-42010aa40039 1Gi RWO Delete Bound default/data-redis-cluster-3 standard 6m
pvc-d5bd0ad3-f0a5-11e8-a6e0-42010aa40039 1Gi RWO Delete Bound default/data-redis-cluster-4 standard 6m
pvc-e3206080-f0a5-11e8-a6e0-42010aa40039 1Gi RWO Delete Bound default/data-redis-cluster-5 standard 5m

We can inspect any of the Pods to see its attached volume:

$ kubectl describe pods redis-cluster-0 | grep pvc
Normal SuccessfulAttachVolume 29m attachdetach-controller AttachVolume.Attach succeeded for volume “pvc-ae61ad5c-f0a5-11e8-a6e0-42010aa40039”

The same data is visible within the Rancher UI.


Deploy Redis Cluster

The next step is to form a Redis Cluster. To do this, we run the following command and type yes to accept the configuration. The first three nodes become masters, and the last three become slaves.

$ kubectl exec -it redis-cluster-0 -- redis-cli --cluster create --cluster-replicas 1 $(kubectl get pods -l app=redis-cluster -o jsonpath='{range.items[*]}{.status.podIP}:6379 {end}')


>>> Performing hash slots allocation on 6 nodes…
Master[0] -> Slots 0 – 5460
Master[1] -> Slots 5461 – 10922
Master[2] -> Slots 10923 – 16383
Adding replica 10.60.1.13:6379 to 10.60.2.12:6379
Adding replica 10.60.2.14:6379 to 10.60.1.12:6379
Adding replica 10.60.1.14:6379 to 10.60.2.13:6379
M: 2847de6f6e7c8aaa8b0d2f204cf3ff6e8562a75b 10.60.2.12:6379
slots:[0-5460] (5461 slots) master
M: 3f119dcdd4a33aab0107409524a633e0d22bac1a 10.60.1.12:6379
slots:[5461-10922] (5462 slots) master
M: 754823247cf28af9a2a82f61a8caaa63702275a0 10.60.2.13:6379
slots:[10923-16383] (5461 slots) master
S: 47efe749c97073822cbef9a212a7971a0df8aecd 10.60.1.13:6379
replicates 2847de6f6e7c8aaa8b0d2f204cf3ff6e8562a75b
S: e40ae789995dc6b0dbb5bb18bd243722451d2e95 10.60.2.14:6379
replicates 3f119dcdd4a33aab0107409524a633e0d22bac1a
S: 8d627e43d8a7a2142f9f16c2d66b1010fb472079 10.60.1.14:6379
replicates 754823247cf28af9a2a82f61a8caaa63702275a0
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
….
>>> Performing Cluster Check (using node 10.60.2.12:6379)
M: 2847de6f6e7c8aaa8b0d2f204cf3ff6e8562a75b 10.60.2.12:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 47efe749c97073822cbef9a212a7971a0df8aecd 10.60.1.13:6379
slots: (0 slots) slave
replicates 2847de6f6e7c8aaa8b0d2f204cf3ff6e8562a75b
M: 754823247cf28af9a2a82f61a8caaa63702275a0 10.60.2.13:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
M: 3f119dcdd4a33aab0107409524a633e0d22bac1a 10.60.1.12:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: e40ae789995dc6b0dbb5bb18bd243722451d2e95 10.60.2.14:6379
slots: (0 slots) slave
replicates 3f119dcdd4a33aab0107409524a633e0d22bac1a
S: 8d627e43d8a7a2142f9f16c2d66b1010fb472079 10.60.1.14:6379
slots: (0 slots) slave
replicates 754823247cf28af9a2a82f61a8caaa63702275a0
[OK] All nodes agree about slots configuration.
>>> Check for open slots…
>>> Check slots coverage…
[OK] All 16384 slots covered.

Verify Cluster Deployment

Check the cluster details and the role for each member.

$ kubectl exec -it redis-cluster-0 -- redis-cli cluster info


cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:41
cluster_stats_messages_pong_sent:41
cluster_stats_messages_sent:82
cluster_stats_messages_ping_received:36
cluster_stats_messages_pong_received:41
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:82

$ for x in $(seq 0 5); do echo "redis-cluster-$x"; kubectl exec redis-cluster-$x -- redis-cli role; echo; done


redis-cluster-0
1) "master"
2) (integer) 56
3) 1) 1) "10.60.1.13"
2) "6379"
3) "56"

redis-cluster-1
1) "master"
2) (integer) 70
3) 1) 1) "10.60.2.14"
2) "6379"
3) "70"

redis-cluster-2
1) "master"
2) (integer) 70
3) 1) 1) "10.60.1.14"
2) "6379"
3) "70"

redis-cluster-3
1) "slave"
2) "10.60.2.12"
3) (integer) 6379
4) "connected"
5) (integer) 84

redis-cluster-4
1) "slave"
2) "10.60.1.12"
3) (integer) 6379
4) "connected"
5) (integer) 98

redis-cluster-5
1) "slave"
2) "10.60.2.13"
3) (integer) 6379
4) "connected"
5) (integer) 98

Testing the Redis Cluster

We want to use the cluster and then simulate a failure of a node. For the former task, we’ll deploy a simple Python app, and for the latter, we’ll delete a node and observe the cluster behavior.

Deploy the Hit Counter App

We’ll deploy a simple app into our cluster and put a load balancer in front of it. The purpose of this app is to increment a counter and store the value in the Redis cluster before returning the counter value as an HTTP response.
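The app's source isn't shown here, but the core of such a hit counter boils down to a single atomic Redis INCR per request. A sketch with a stand-in client (hypothetical; the real app would use a Redis client library against the redis-cluster Service):

```python
class FakeRedis:
    """Stand-in for a Redis client so the sketch runs without a server."""
    def __init__(self):
        self.store = {}

    def incr(self, key):
        # Mirrors Redis INCR: create the key at 0 if missing, add 1, return the result.
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]

def handle_request(client, key="hits"):
    """What the app does per HTTP request: atomically increment and return the counter."""
    count = client.incr(key)  # with a real client, one round trip to the cluster
    return f"Hit counter: {count}"

client = FakeRedis()
handle_request(client)
print(handle_request(client))  # Hit counter: 2
```

Because INCR is atomic on the Redis side, this pattern stays correct even if the Deployment is scaled to multiple app replicas hitting the same key.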

Deploy this using kubectl or the Rancher UI.

$ kubectl apply -f app-deployment-service.yaml
service/hit-counter-lb created
deployment.apps/hit-counter-app created


app-deployment-service.yaml


apiVersion: v1
kind: Service
metadata:
  name: hit-counter-lb
spec:
  type: LoadBalancer
  ports:
  - port: 80
    protocol: TCP
    targetPort: 5000
  selector:
    app: myapp
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hit-counter-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: calinrus/api-redis-ha:1.0
        ports:
        - containerPort: 5000

Rancher shows us the resources that we created: a Pod containing the Python app, and the Service of type LoadBalancer. The details of the Service will show us its public IP address.


At this point, we can start hitting the IP with a browser to generate some values for the hit counter.


Simulate a Node Failure

We can simulate the failure of a cluster member by deleting the Pod, either via kubectl or from within the Rancher UI. When we delete redis-cluster-0, which was originally a master, we see that Kubernetes promotes redis-cluster-3 to master, and when redis-cluster-0 returns, it does so as a slave.


Before

$ kubectl describe pods redis-cluster-0 | grep IP
IP: 10.28.0.5
POD_IP: (v1:status.podIP)

$ kubectl describe pods redis-cluster-3 | grep IP
IP: 10.28.0.6
POD_IP: (v1:status.podIP)

$ kubectl exec -it redis-cluster-0 -- redis-cli role
1) "master"
2) (integer) 1859
3) 1) 1) "10.28.0.6"
2) "6379"
3) "1859"

$ kubectl exec -it redis-cluster-3 -- redis-cli role
1) "slave"
2) "10.28.0.5"
3) (integer) 6379
4) "connected"
5) (integer) 1859

After

$ kubectl exec -it redis-cluster-0 -- redis-cli role
1) "slave"
2) "10.28.0.6"
3) (integer) 6379
4) "connected"
5) (integer) 2111

$ kubectl exec -it redis-cluster-3 -- redis-cli role
1) "master"
2) (integer) 2111
3) 1) 1) "10.28.2.12"
2) "6379"
3) "2111"

We see that the IP for redis-cluster-0 has changed, so how did the cluster heal?

When we created the cluster, we created a ConfigMap that in turn created a script at /conf/update-node.sh that the container calls when starting. This script updates the Redis configuration with the new IP address of the local node. With the new IP in the config, the cluster can heal after a new Pod starts with a different IP address.
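The rewrite that update-node.sh performs can be illustrated in Python, using a hypothetical nodes.conf line (only the node's own entry, the one marked myself, is touched):

```python
import re

def update_myself_ip(line: str, pod_ip: str) -> str:
    """Replace the first IPv4 address on the node's own ("myself") line,
    as update-node.sh does with sed at container start."""
    if "myself" not in line:
        return line
    return re.sub(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", pod_ip, line, count=1)

# Hypothetical nodes.conf entry: node id, old address, flags, slot range.
old = "2847de6f 10.60.2.12:6379@16379 myself,master - 0 0 1 connected 0-5460"
print(update_myself_ip(old, "10.28.0.5"))
# 2847de6f 10.28.0.5:6379@16379 myself,master - 0 0 1 connected 0-5460
```

The node IDs stay stable across restarts, so once the local address is corrected, the gossip protocol propagates the new IP to the rest of the cluster.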

During this process, if we continue to load the page, the counter continues to increment, and following the cluster convergence, we see that no data has been lost.


Conclusion

Redis is a powerful tool for data storage and caching. Redis Cluster extends the functionality by offering sharding and correlated performance benefits, linear scaling, and higher availability because of how Redis stores data. The data is automatically split among multiple nodes, which allows operations to continue, even when a subset of the nodes are experiencing failures or are unable to communicate with the rest of the cluster.

For more information on Redis Cluster, please visit the tutorial or specification documentation.

For more information on Rancher, please visit our main website or our documentation.

Calin Rus


Automate Operations on your Cluster with OperatorHub.io


Author:
Diane Mueller, Director of Community Development, Cloud Platforms, Red Hat

One of the important challenges facing developers and Kubernetes administrators has been a lack of ability to quickly find common services that are operationally ready for Kubernetes. Typically, the presence of an Operator for a specific service – a pattern that was introduced in 2016 and has gained momentum – is a good signal for the operational readiness of the service on Kubernetes. However, there has to date not existed a registry of Operators to simplify the discovery of such services.

To help address this challenge, today Red Hat is launching OperatorHub.io in collaboration with AWS, Google Cloud and Microsoft. OperatorHub.io enables developers and Kubernetes administrators to find and install curated Operator-backed services with a base level of documentation, active maintainership by communities or vendors, basic testing, and packaging for optimized life-cycle management on Kubernetes.

The Operators currently in OperatorHub.io are just the start. We invite the Kubernetes community to join us in building a vibrant community for Operators by developing, packaging, and publishing Operators on OperatorHub.io.

What does OperatorHub.io provide?

OperatorHub.io is designed to address the needs of both Kubernetes developers and users. For the former, it provides a common registry where they can publish their Operators along with descriptions and relevant details such as version, image, and code repository, and have them packaged for installation. They can also update already-published Operators to new versions as they are released.

Users gain the ability to discover and download Operators from a central location whose content has been screened against the previously mentioned criteria and scanned for known vulnerabilities. In addition, developers can guide users of their Operators with prescriptive examples of the CustomResources that they introduce to interact with the application.

What is an Operator?

Operators were first introduced in 2016 by CoreOS and have been used by Red Hat and the Kubernetes community as a way to package, deploy and manage a Kubernetes-native application. A Kubernetes-native application is an application that is both deployed on Kubernetes and managed using the Kubernetes APIs and well-known tooling, like kubectl.

An Operator is implemented as a custom controller that watches for certain Kubernetes resources to appear, be modified, or be deleted. These are typically CustomResourceDefinitions that the Operator “owns.” In the spec properties of these objects the user declares the desired state of the application or the operation. The Operator’s reconciliation loop will pick these up and perform the required actions to achieve the desired state. For example, the intent to create a highly available etcd cluster could be expressed by creating a new resource of type EtcdCluster:

apiVersion: "etcd.database.coreos.com/v1beta2"
kind: "EtcdCluster"
metadata:
  name: "my-etcd-cluster"
spec:
  size: 3
  version: "3.3.12"

As a result, the EtcdOperator would be responsible for creating a 3-node etcd cluster running version 3.3.12. Similarly, an object of type EtcdBackup could be defined to express the intent to create a consistent backup of the etcd database to an S3 bucket.
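As a rough sketch of what such an EtcdBackup resource might look like under the same v1beta2 API group, the following manifest expresses the intent to back up a cluster to S3. The cluster endpoint, bucket path, and the name of the Secret holding AWS credentials are illustrative placeholders:

```yaml
apiVersion: "etcd.database.coreos.com/v1beta2"
kind: "EtcdBackup"
metadata:
  name: "my-etcd-cluster-backup"
spec:
  etcdEndpoints:
    - "https://my-etcd-cluster-client:2379"   # client endpoint of the cluster to back up
  storageType: "S3"
  s3:
    path: "my-bucket/my-etcd-cluster.backup"  # bucket/key for the backup object
    awsSecret: "aws-credentials"              # Secret with AWS credentials
```

The Operator watches for this resource and performs the backup, reporting the outcome in the object’s status.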

How do I create and run an Operator?

One way to get started is with the Operator Framework, an open source toolkit that provides an SDK, lifecycle management, metering and monitoring capabilities. It enables developers to build, test, and package Operators. Operators can be implemented in several programming and automation languages, including Go, Helm, and Ansible, all three of which are supported directly by the SDK.

If you are interested in creating your own Operator, we recommend checking out the Operator Framework to get started.

Operators vary along a capability spectrum, ranging from basic installation to application-specific operational logic that automates advanced scenarios like backup, restore, or tuning. Beyond basic installation, advanced Operators are designed to handle upgrades more seamlessly and react to failures automatically. Currently, Operators on OperatorHub.io span the maturity spectrum, but we anticipate their continuing maturation over time.

While Operators on OperatorHub.io don’t need to be implemented using the SDK, they are packaged for deployment through the Operator Lifecycle Manager (OLM). The format mainly consists of a YAML manifest referred to as [ClusterServiceVersion](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Documentation/design/building-your-csv.md) which provides information about the CustomResourceDefinitions the Operator owns or requires, which RBAC definition it needs, where the image is stored, etc. This file is usually accompanied by additional YAML files which define the Operators’ own CRDs. This information is processed by OLM at the time a user requests to install an Operator to provide dependency resolution and automation.
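To make the packaging format concrete, here is a heavily abbreviated sketch of a ClusterServiceVersion manifest. The field names follow the CSV format linked above, but every name, version, and value below is a placeholder rather than a real Operator:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: my-operator.v0.1.0          # placeholder name and version
spec:
  displayName: My Operator
  version: 0.1.0
  maturity: alpha
  customresourcedefinitions:
    owned:                          # CRDs this Operator owns
      - name: myapps.example.com
        kind: MyApp
        version: v1alpha1
        description: A managed MyApp instance
  install:
    strategy: deployment
    spec:
      permissions: []               # RBAC rules the Operator needs
      deployments:
        - name: my-operator
          spec: {}                  # standard Deployment spec; the Operator image is referenced here
```

OLM reads this manifest, along with the accompanying CRD files, to resolve dependencies and install the Operator.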

What does listing of an Operator on OperatorHub.io mean?

To be listed, Operators must successfully demonstrate cluster lifecycle features, be packaged as a CSV to be maintained through OLM, and have acceptable documentation for their intended users.

Some examples of Operators that are currently listed on OperatorHub.io include: Amazon Web Services Operator, Couchbase Autonomous Operator, CrunchyData’s PostgreSQL, etcd Operator, Jaeger Operator for Kubernetes, Kubernetes Federation Operator, MongoDB Enterprise Operator, Percona MySQL Operator, PlanetScale’s Vitess Operator, Prometheus Operator, and Redis Operator.

Want to add your Operator to OperatorHub.io? Follow these steps

If you have an existing Operator, follow the contribution guide using a fork of the community-operators repository. Each contribution contains the CSV, all of the CustomResourceDefinitions, access control rules and references to the container image needed to install and run your Operator, plus other info like a description of its features and supported Kubernetes versions. A complete example, including multiple versions of the Operator, can be found with the EtcdOperator.

After testing out your Operator on your own cluster, submit a PR to the community repository with all of the YAML files following this directory structure. Subsequent versions of the Operator can be published in the same way. At first this will be reviewed manually, but automation is on the way. After it’s merged by the maintainers, it will show up on OperatorHub.io along with its documentation and a convenient installation method.
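For orientation, a contribution in the community-operators repository looked roughly like the following at the time of writing; the operator and file names here are illustrative, so check the repository’s contribution guide for the authoritative layout:

```
upstream-community-operators/
└── my-operator/
    ├── my-operator.package.yaml                        # channels and the current CSV
    ├── my-operator.v0.1.0.clusterserviceversion.yaml
    ├── my-operator.v0.2.0.clusterserviceversion.yaml   # a subsequent version
    └── myapps.example.com.crd.yaml                     # CustomResourceDefinition(s)
```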

Want to learn more?

Source