A Guide to Kubernetes Admission Controllers

Kubernetes has greatly improved the speed and manageability of backend clusters in production today. Kubernetes has emerged as the de facto standard in container orchestrators thanks to its flexibility, scalability, and ease of use. Kubernetes also provides a range of features that secure production workloads. A more recent introduction in security features is a set of plugins called “admission controllers.” Admission controllers must be enabled to use some of the more advanced security features of Kubernetes, such as pod security policies that enforce a security configuration baseline across an entire namespace. The following must-know tips and tricks will help you leverage admission controllers to make the most of these security capabilities in Kubernetes.

What are Kubernetes admission controllers?

In a nutshell, Kubernetes admission controllers are plugins that govern and enforce how the cluster is used. They can be thought of as a gatekeeper that intercepts (authenticated) API requests and may change the request object or deny the request altogether. The admission control process has two phases: the mutating phase is executed first, followed by the validating phase. Consequently, admission controllers can act as mutating or validating controllers or as a combination of both. For example, the LimitRanger admission controller can augment pods with default resource requests and limits (mutating phase), as well as verify that pods with explicitly set resource requirements do not exceed the per-namespace limits specified in the LimitRange object (validating phase).

Admission Controller Phases

It is worth noting that some aspects of Kubernetes’ operation that many users would consider built-in are in fact governed by admission controllers. For example, when a namespace is deleted and subsequently enters the Terminating state, the NamespaceLifecycle admission controller is what prevents any new objects from being created in this namespace.

Among the more than 30 admission controllers shipped with Kubernetes, two take a special role because of their nearly limitless flexibility – ValidatingAdmissionWebhooks and MutatingAdmissionWebhooks, both of which are in beta status as of Kubernetes 1.13. We will examine these two admission controllers closely, as they do not implement any policy decision logic themselves. Instead, the respective action is obtained from a REST endpoint (a webhook) of a service running inside the cluster. This approach decouples the admission controller logic from the Kubernetes API server, thus allowing users to implement custom logic to be executed whenever resources are created, updated, or deleted in a Kubernetes cluster.

The difference between the two kinds of admission controller webhooks is pretty much self-explanatory: mutating admission webhooks may mutate the objects, while validating admission webhooks may not. However, even a mutating admission webhook can reject requests and thus act in a validating fashion. Validating admission webhooks have two main advantages over mutating ones: first, for security reasons it might be desirable to disable the MutatingAdmissionWebhook admission controller (or apply stricter RBAC restrictions as to who may create MutatingWebhookConfiguration objects) because of its potentially confusing or even dangerous side effects. Second, as shown in the previous diagram, validating admission controllers (and thus webhooks) are run after any mutating ones. As a result, whatever request object a validating webhook sees is the final version that would be persisted to etcd.

The set of enabled admission controllers is configured by passing a flag to the Kubernetes API server. Note that the old --admission-control flag was deprecated in 1.10 and replaced with --enable-admission-plugins.

--enable-admission-plugins=ValidatingAdmissionWebhook,MutatingAdmissionWebhook

Kubernetes recommends the following admission controllers to be enabled by default.

--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,Priority,ResourceQuota,PodSecurityPolicy

The complete list of admission controllers with their descriptions can be found in the official Kubernetes reference. This discussion will focus only on the webhook-based admission controllers.

Why do I need admission controllers?

  • Security: Admission controllers can increase security by mandating a reasonable security baseline across an entire namespace or cluster. The built-in PodSecurityPolicy admission controller is perhaps the most prominent example; it can be used for disallowing containers from running as root or making sure the container’s root filesystem is always mounted read-only, for example. Further use cases that can be realized by custom, webhook-based admission controllers include:
    • Allow pulling images only from specific registries known to the enterprise, while denying unknown image registries.
    • Reject deployments that do not meet security standards. For example, containers using the privileged flag can circumvent a lot of security checks. This risk could be mitigated by a webhook-based admission controller that either rejects such deployments (validating) or overrides the privileged flag, setting it to false.
  • Governance: Admission controllers allow you to enforce the adherence to certain practices such as having good labels, annotations, resource limits, or other settings. Some of the common scenarios include:
    • Enforce label validation on different objects to ensure proper labels are being used for various objects, such as every object being assigned to a team or project, or every deployment specifying an app label.
    • Automatically add annotations to objects, such as attributing the correct cost center for a “dev” deployment resource.
  • Configuration management: Admission controllers allow you to validate the configuration of the objects running in the cluster and prevent any obvious misconfigurations from hitting your cluster. Admission controllers can be useful in detecting and fixing common misconfigurations, such as by:
    • automatically adding resource limits or validating resource limits,
    • ensuring reasonable labels are added to pods, or
    • ensuring image references used in production deployments are not using the latest tag or tags with a -dev suffix (see the sketch after this list).
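
As a concrete illustration of the last point, the following Go sketch shows what the core of such a validating check might look like. It is a hypothetical, deliberately simplified example (the function name, error messages, and tag-matching rules are assumptions, not code from any shipped admission controller); a real check would also need to handle image digests and registry hostnames that contain ports.

package main

import (
	"fmt"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// validateImageTags rejects pods whose containers use an unpinned image
// (no tag or ":latest") or a tag ending in "-dev". A validating webhook
// would deny the admission request whenever this returns an error.
func validateImageTags(pod *corev1.Pod) error {
	containers := append(pod.Spec.InitContainers, pod.Spec.Containers...)
	for _, c := range containers {
		switch {
		case !strings.Contains(c.Image, ":"), strings.HasSuffix(c.Image, ":latest"):
			return fmt.Errorf("container %q must use a pinned, non-latest image tag: %q", c.Name, c.Image)
		case strings.HasSuffix(c.Image, "-dev"):
			return fmt.Errorf("container %q must not use a -dev image tag: %q", c.Name, c.Image)
		}
	}
	return nil
}

func main() {
	pod := &corev1.Pod{Spec: corev1.PodSpec{Containers: []corev1.Container{
		{Name: "app", Image: "registry.example.com/app:latest"},
	}}}
	fmt.Println(validateImageTags(pod)) // rejects the ":latest" tag
}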

In this way, admission controllers and policy management help make sure that applications stay in compliance within an ever-changing landscape of controls.

Example: Writing and Deploying an Admission Controller Webhook

To illustrate how admission controller webhooks can be leveraged to establish custom security policies, let’s consider an example that addresses one of the shortcomings of Kubernetes: a lot of its defaults are optimized for ease of use and reducing friction, sometimes at the expense of security. One of these settings is that containers are by default allowed to run as root (and, without further configuration and no USER directive in the Dockerfile, will also do so). Even though containers are isolated from the underlying host to a certain extent, running containers as root does increase the risk profile of your deployment, and should be avoided as one of many security best practices. The recently exposed runC vulnerability (CVE-2019-5736), for example, could be exploited only if the container ran as root.

You can use a custom mutating admission controller webhook to apply more secure defaults: unless explicitly requested, our webhook will ensure that pods run as a non-root user (we assign the user ID 1234 if no explicit assignment has been made). Note that this setup does not prevent you from deploying any workloads in your cluster, including those that legitimately require running as root. It only requires you to explicitly enable this riskier mode of operation in the deployment configuration, while defaulting to non-root mode for all other workloads.

The full code along with deployment instructions can be found in our accompanying GitHub repository. Here, we will highlight a few of the more subtle aspects about how webhooks work.

Mutating Webhook Configuration

A mutating admission controller webhook is defined by creating a MutatingWebhookConfiguration object in Kubernetes. In our example, we use the following configuration:

apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
  name: demo-webhook
webhooks:
  - name: webhook-server.webhook-demo.svc
    clientConfig:
      service:
        name: webhook-server
        namespace: webhook-demo
        path: "/mutate"
      caBundle: ${CA_PEM_B64}
    rules:
      - operations: [ "CREATE" ]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]

This configuration defines a webhook webhook-server.webhook-demo.svc, and instructs the Kubernetes API server to consult the service webhook-server in namespace webhook-demo whenever a pod is created, by making an HTTPS POST request to the /mutate URL. For this configuration to work, several prerequisites have to be met.

Webhook REST API

The Kubernetes API server makes an HTTPS POST request to the given service and URL path, with a JSON-encoded AdmissionReview (with the Request field set) in the request body. The response should in turn be a JSON-encoded AdmissionReview, this time with the Response field set.
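
To make this exchange concrete, the following Go sketch shows a handler that performs it by hand. It is illustrative only (the handler name and the always-allow response are assumptions rather than code from the demo repository), and it omits patch generation and proper TLS setup:

package main

import (
	"encoding/json"
	"net/http"

	admissionv1beta1 "k8s.io/api/admission/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// handleMutate decodes the AdmissionReview sent by the API server and answers
// with an AdmissionReview whose Response field is populated. No patch is
// produced here, so the request is simply allowed unchanged.
func handleMutate(w http.ResponseWriter, r *http.Request) {
	var review admissionv1beta1.AdmissionReview
	if err := json.NewDecoder(r.Body).Decode(&review); err != nil || review.Request == nil {
		http.Error(w, "could not decode AdmissionReview", http.StatusBadRequest)
		return
	}

	review.Response = &admissionv1beta1.AdmissionResponse{
		UID:     review.Request.UID, // the response must echo the request UID
		Allowed: true,
		Result:  &metav1.Status{Message: "no changes required"},
	}

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(review)
}

func main() {
	http.HandleFunc("/mutate", handleMutate)
	// A real webhook must serve HTTPS (ListenAndServeTLS), as discussed below.
	http.ListenAndServe(":8443", nil)
}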

Our demo repository contains a function that takes care of the serialization/deserialization boilerplate code and allows you to focus on implementing the logic operating on Kubernetes API objects. In our example, the function implementing the admission controller logic is called applySecurityDefaults, and an HTTPS server serving this function under the /mutate URL can be set up as follows:

mux := http.NewServeMux()
mux.Handle("/mutate", admitFuncHandler(applySecurityDefaults))
server := &http.Server{
  Addr:    ":8443",
  Handler: mux,
}
log.Fatal(server.ListenAndServeTLS(certPath, keyPath))

Note that for the server to run without elevated privileges, we have the HTTP server listen on port 8443. Kubernetes does not allow specifying a port in the webhook configuration; it always assumes the HTTPS port 443. However, since a service object is required anyway, we can easily map port 443 of the service to port 8443 on the container:

apiVersion: v1
kind: Service
metadata:
  name: webhook-server
  namespace: webhook-demo
spec:
  selector:
    app: webhook-server  # specified by the deployment/pod
  ports:
    - port: 443
      targetPort: webhook-api  # name of port 8443 of the container

Object Modification Logic

In a mutating admission controller webhook, mutations are performed via JSON patches. While the JSON patch standard includes a lot of intricacies that go well beyond the scope of this discussion, the Go data structure in our example as well as its usage should give the user a good initial overview of how JSON patches work:

type patchOperation struct {
  Op    string      `json:"op"`
  Path  string      `json:"path"`
  Value interface{} `json:"value,omitempty"`
}

For setting the field .spec.securityContext.runAsNonRoot of a pod to true, we construct the following patchOperation object:

patches = append(patches, patchOperation{
  Op:    "add",
  Path:  "/spec/securityContext/runAsNonRoot",
  Value: true,
})
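
Putting these pieces together, the webhook’s decision logic might look roughly like the following self-contained sketch. It mirrors the behavior described in this article (default to a non-root user with ID 1234, honor an explicit override, and reject contradictory settings), but the function layout and names are illustrative rather than a verbatim copy of the repository code:

package main

import (
	"errors"
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// patchOperation mirrors the struct shown earlier in the article.
type patchOperation struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// applySecurityDefaults sketches the mutating logic: inspect the pod's
// securityContext and return either the JSON patch operations to apply or an
// error that causes the request to be rejected.
func applySecurityDefaults(pod *corev1.Pod) ([]patchOperation, error) {
	sc := pod.Spec.SecurityContext
	hasNonRoot := sc != nil && sc.RunAsNonRoot != nil
	hasUser := sc != nil && sc.RunAsUser != nil

	switch {
	case !hasNonRoot && !hasUser:
		// No explicit choice was made: default to non-root with user ID 1234.
		return []patchOperation{
			{Op: "add", Path: "/spec/securityContext/runAsNonRoot", Value: true},
			{Op: "add", Path: "/spec/securityContext/runAsUser", Value: 1234},
		}, nil
	case hasNonRoot && *sc.RunAsNonRoot && hasUser && *sc.RunAsUser == 0:
		// Contradictory configuration: reject the request.
		return nil, errors.New("runAsNonRoot specified, but runAsUser set to 0 (the root user)")
	case !hasNonRoot && hasUser:
		// A user ID was chosen explicitly; mark the pod non-root unless it is 0.
		return []patchOperation{
			{Op: "add", Path: "/spec/securityContext/runAsNonRoot", Value: *sc.RunAsUser != 0},
		}, nil
	default:
		// runAsNonRoot was set explicitly and does not conflict: leave the pod as is.
		return nil, nil
	}
}

func main() {
	patches, err := applySecurityDefaults(&corev1.Pod{})
	fmt.Println(patches, err)
}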

TLS Certificates

Since a webhook must be served via HTTPS, we need proper certificates for the server. These certificates can be self-signed (rather: signed by a self-signed CA), but we need to instruct Kubernetes about the respective CA certificate so that it can verify the webhook server when talking to it. In addition, the common name (CN) of the certificate must match the server name used by the Kubernetes API server, which for internal services is <service-name>.<namespace>.svc, i.e., webhook-server.webhook-demo.svc in our case. Since the generation of self-signed TLS certificates is well documented across the Internet, we simply refer to the respective shell script in our example.

The webhook configuration shown previously contains a placeholder ${CA_PEM_B64}. Before we can create this configuration, we need to replace this portion with the Base64-encoded PEM certificate of the CA. The openssl base64 -A command can be used for this purpose.

Testing the Webhook

After deploying the webhook server and configuring it, which can be done by invoking the ./deploy.sh script from the repository, it is time to test and verify that the webhook indeed does its job. The repository contains three examples:

  • A pod that does not specify a security context (pod-with-defaults). We expect this pod to be run as non-root with user id 1234.
  • A pod that does specify a security context, explicitly allowing it to run as root (pod-with-override).
  • A pod with a conflicting configuration, specifying it must run as non-root but with a user id of 0 (pod-with-conflict). To showcase the rejection of object creation requests, we have augmented our admission controller logic to reject such obvious misconfigurations.

Create one of these pods by running kubectl create -f examples/<name>.yaml. In the first two examples, you can verify the user id under which the pod ran by inspecting the logs, for example:

$ kubectl create -f examples/pod-with-defaults.yaml
$ kubectl logs pod-with-defaults
I am running as user 1234

In the third example, the object creation should be rejected with an appropriate error message:

$ kubectl create -f examples/pod-with-conflict.yaml
Error from server (InternalError): error when creating "examples/pod-with-conflict.yaml": Internal error occurred: admission webhook "webhook-server.webhook-demo.svc" denied the request: runAsNonRoot specified, but runAsUser set to 0 (the root user)

Feel free to test this with your own workloads as well. Of course, you can also experiment a little bit further by changing the logic of the webhook and seeing how the changes affect object creation. More information on how to experiment with such changes can be found in the repository’s readme.

Summary

Kubernetes admission controllers offer significant advantages for security. Digging into the two webhook-based admission controllers, with the accompanying example code, will help you get started with these powerful capabilities.

Source

Tarmak 0.6 released

We are excited to announce the release of Tarmak 0.6! If you're unfamiliar with
it, Tarmak is a CLI toolkit to provision and manage Kubernetes clusters on AWS
with security-first principles. This new release brings a host of great new
features and improvements, which I'll describe below.

  • Worker node AMI images
  • Pre-Built Default AMI Image
  • Calico Kubernetes Backend
  • New CLI commands – cluster logs and environment destroy
  • Using Kubernetes Addon-manager
  • Using an in-package SSH solution with a secure approach to public key
    advertising


Worker Node AMI Images and Default Image

In this release we have a new image type that can be assigned to your worker
instance pools – centos-puppet-agent-k8s-worker. This image type causes
Tarmak to pre-install all the node components when building the AMI image,
rather than installing them at boot time. This means that time from boot to
node status Ready is greatly reduced, giving more resources to your triggered
scaling groups faster.

We have also created a public AMI image. If no privately built images are
available for your cluster, Tarmak will use Jetstack's published image
instead. This change is great for new users as they can get a new cluster up and
running faster, without having to wait for long build times.

Calico Kubernetes Backend

We’ve added new options for how you deploy Calico into your clusters. Instead of
using Etcd, the default Calico backend, we now give the option to use Kubernetes
with a toggle in the Tarmak configuration. Deploying a huge cluster? With this
option you can also choose to deploy
Typha which will help with the load of
Calico on the Kubernetes backend. This is also simply enabled and configured
through the Tarmak configuration; you can read how to do so here.

New CLI Commands

In the unfortunate event you’re having issues with your cluster and seeking some
support, it is always a pain to copy and paste logs from your components running
on multiple machines. This is very time consuming, and it always seems like the
logs you missed are the ones most needed! To help with this, we've created a new
command, cluster logs, that will fetch all systemd logs from your targeted
instance pools (vault, workers, control-plane, etc.), bundle them up into a
reader-friendly file structure, and compress them into a tarball. This is then
ready to be shipped off to someone else over the net. This is really beneficial
in making the support feedback loop more efficient and a great quality of life
improvement.

Another CLI addition is the environment destroy command. As it sounds, this is
the big brother of cluster destroy and will destroy all clusters in the
environment, including the hub. This is a command that's helped us a lot
internally, and is another nice quality of life improvement. Do be careful,
though, and make sure you really want to run it!

Kubernetes Addon-manager

We are now using the Kubernetes
Addon-manager
which is a controller-like service that runs on the master nodes of Tarmak. The
service is constantly watching for resources in Kubernetes with a label and
comparing them with local manifests inside a directory. If resources are changed
or removed from the local manifest set, the Addon-manager will then update them
in Kubernetes to keep it in sync.

This has been working really well for Tarmak deployments and has been handling
updates and migrations well. For example, when you upgrade your cluster to 1.10
or higher, we now install CoreDNS in place of Kube-dns, and Addon-manager takes
care of the replacement. Addon-manager also helps to seamlessly reconfigure
Calico if its deployment has been changed in the Tarmak configuration described
earlier.

SSH Overhaul and Instance Public Key Advertising

With this release, we’ve also made some huge changes to how we are creating and
managing our SSH connections. This is one of the core components of Tarmak as it
enables connections to components such as wing – a small binary sitting on all
nodes to report its state and implement configuration updates – or creating
tunnels that allow initialisation and communication with vault as well as
accessing the Kubernetes API server when not using a public load balancer
endpoint. Previously, we had been using the OpenSSH client on your machine to
create and manage these connections; however, this has now been replaced with a
custom SSH client that uses the standard Go SSH library. What does this mean for
users? Connections should now be much more reliable and we can now use these
connections more efficiently. It has also enabled us to develop more
sophisticated features such as the log aggregation command mentioned earlier and
mitigate problems caused by inconsistencies between OpenSSH versions installed
on different machines.

With this change we have also updated the way we verify the public keys of the
instances we SSH to, along with how we manage the local SSH hosts file. Now,
when an instance boots, wing will gather the public keys, sign its AWS identity
document with them, and send them all to an Amazon Lambda function. Once the
function has verified these keys, it will tag the instance with them. Once an
instance has been tagged, the keys will not be changed. Locally, Tarmak can use
these tags to populate the local hosts file and verify SSH connections to the
instance. This change bolsters security for connecting to the cluster.


Other features include improving the reliability of bootstrapping vault
instances, updates to components and some bug fixes. You can read more in the CHANGELOG or on the GitHub release page.

Give the release a go; we look forward to hearing your feedback!

Source

KubeEdge, a Kubernetes Native Edge Computing Framework

KubeEdge becomes the first Kubernetes Native Edge Computing Platform with both Edge and Cloud components open sourced!

Open source edge computing is going through its most dynamic phase of development in the industry. So many open source platforms, so many consolidations and so many initiatives for standardization! This shows the strong drive to build better platforms to bring cloud computing to the edges to meet ever increasing demand. KubeEdge, which was announced last year, now brings great news for cloud native computing! It provides a complete edge computing solution based on Kubernetes with separate cloud and edge core modules. Currently, both the cloud and edge modules are open sourced.

Unlike some of the lightweight Kubernetes platforms available today, KubeEdge is made for building edge computing solutions that extend the cloud. The control plane resides in the cloud, and it is scalable and extendable. At the same time, the edge can work in offline mode. It is also lightweight and containerized, and can support heterogeneous hardware at the edge. By optimizing edge resource utilization, KubeEdge is positioned to save significant setup and operation costs for edge solutions. This makes it the most compelling Kubernetes-based edge computing platform available today!

Kube(rnetes)Edge! – Opening up a new Kubernetes-based ecosystem for Edge Computing

The key goal for KubeEdge is to extend the Kubernetes ecosystem from the cloud to the edge. From the time it was announced to the public at KubeCon in Shanghai in November 2018, the architecture direction for KubeEdge has been aligned with Kubernetes, as its name suggests!

It started with its v0.1 providing the basic edge computing features. Now, with its latest release v0.2, it brings the cloud components to connect and complete the loop. With consistent and scalable Kubernetes-based interfaces, KubeEdge enables the orchestration and management of edge clusters similar to how Kubernetes manages in the cloud. This opens up seamless possibilities of bringing cloud computing capabilities to the edge, quickly and efficiently.

Based on its roadmap and architecture, KubeEdge tries to support edge nodes, applications, devices, and even cluster management in a way that is consistent with the Kubernetes interface. This will help the edge cloud act exactly like a cloud cluster, and can save a lot of time and cost on edge cloud development and deployment based on KubeEdge.

KubeEdge provides a containerized edge computing platform, which is inherently scalable. As it’s modular and optimized, it is lightweight (a 66MB footprint and ~30MB running memory) and can be deployed on low-resource devices. Similarly, edge nodes can have different hardware architectures and configurations. For device connectivity, it supports multiple protocols and uses standard MQTT-based communication. This helps in efficiently scaling edge clusters with new nodes and devices.

You heard it right!

KubeEdge Cloud Core modules are open sourced!

By open sourcing both the edge and cloud modules, KubeEdge brings a complete cloud vendor agnostic lightweight heterogeneous edge computing platform. It is now ready to support building a complete Kubernetes ecosystem for edge computing, exploiting most of the existing cloud native projects or software modules. This can enable a mini-cloud at the edge to support demanding use cases like data analytics, video analytics, machine learning and more.

KubeEdge Architecture: Building Kubernetes Native Edge Computing!

The core architecture tenet for KubeEdge is to build interfaces that are consistent with Kubernetes, be it on the cloud side or edge side.

Edged: Manages containerized Applications at the Edge.

EdgeHub: Communication interface module at the Edge. It is a web socket client responsible for interacting with Cloud Service for edge computing.

CloudHub: Communication interface module at the Cloud. A web socket server responsible for watching changes on the cloud side, caching and sending messages to EdgeHub.

EdgeController: Manages the Edge nodes. It is an extended Kubernetes controller which manages edge nodes and pods metadata so that the data can be targeted to a specific edge node.

EventBus: Handles the internal edge communications using MQTT. It is an MQTT client to interact with MQTT servers (mosquitto), offering publish and subscribe capabilities to other components.

DeviceTwin: It is a software mirror for devices that handles the device metadata. This module helps in handling device status and syncing it to the cloud. It also provides query interfaces for applications, as it interfaces with a lightweight database (SQLite).

MetaManager: It manages the metadata at the edge node. This is the message processor between edged and edgehub. It is also responsible for storing/retrieving metadata to/from a lightweight database (SQLite).
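
Because the EventBus speaks plain MQTT, edge applications and device adapters can integrate using any ordinary MQTT client. The short Go sketch below is purely illustrative (the broker address, topic name, and payload are assumptions, not an official KubeEdge topic layout); it publishes a device status update to the local mosquitto broker that EventBus listens to:

package main

import (
	"log"

	mqtt "github.com/eclipse/paho.mqtt.golang"
)

func main() {
	// Connect to the local mosquitto broker that the EventBus module also uses.
	opts := mqtt.NewClientOptions().
		AddBroker("tcp://127.0.0.1:1883").
		SetClientID("demo-temperature-sensor")
	client := mqtt.NewClient(opts)
	if token := client.Connect(); token.Wait() && token.Error() != nil {
		log.Fatalf("connect: %v", token.Error())
	}
	defer client.Disconnect(250)

	// Publish a device status update. The topic and payload are illustrative;
	// a real deployment would use the topic layout that the EventBus and
	// DeviceTwin modules expect.
	payload := `{"temperature": 21.5, "unit": "C"}`
	if token := client.Publish("devices/demo-temperature-sensor/status", 0, false, payload); token.Wait() && token.Error() != nil {
		log.Fatalf("publish: %v", token.Error())
	}
	log.Println("published device status update")
}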

Even if you want to add more control plane modules as the architecture is refined and improved (for example, for enhanced security), it is simple to do so, as KubeEdge uses consistent registration and modular communication within these modules.

  • KubeEdge provides a scalable, lightweight Kubernetes Native Edge Computing Platform which can work in offline mode.
  • It helps simplify edge application development and deployment.
  • It is cloud vendor agnostic and can run the cloud core modules on any compute node.

Release 0.1 to 0.2 – game changer!

KubeEdge v0.1 was released at the end of December 2018 with very basic edge features to manage edge applications, along with Kubernetes API primitives for node, pod, config, etc. About two months later, KubeEdge v0.2 was released on March 5th, 2019. This release provides the cloud core modules and enables an end-to-end open source edge computing solution. The cloud core modules can be deployed to any compute node from any cloud vendor or on-prem.

Now, the complete edge solution can be installed and tested very easily, even on a laptop.

Run Anywhere – Simple and Light

As described, the KubeEdge edge and cloud core components can be deployed easily and can run user applications. The edge core has a footprint of 66MB and needs just 30MB of memory to run. Similarly, the cloud core can run on any cloud node. (Users can try it out by running it on a laptop as well.)

The installation is simple and can be done in a few steps:

  1. Set up the prerequisites: Docker, Kubernetes, MQTT, and openssl
  2. Clone and Build KubeEdge Cloud and Edge
  3. Run Cloud
  4. Run Edge

The detailed steps for each are available at KubeEdge/kubeedge

Future: Taking off with competent features and community collaboration

KubeEdge has been developed by members from the community who are active contributors to Kubernetes/CNCF and doing research in edge computing. The KubeEdge team is also actively collaborating with the Kubernetes IoT/Edge Working Group. Within a few months of the KubeEdge announcement, it has attracted members from different organizations, including JingDong, Zhejiang University, SEL Lab, Eclipse, China Mobile, ARM, and Intel, to collaborate in building the platform and ecosystem.

KubeEdge has a clear roadmap for its upcoming major releases in 2019. v1.0 targets a complete edge cluster and device management solution with standard edge-to-edge communication, while v2.0 targets advanced features like service mesh, function service, data analytics, etc. at the edge. Also, for all these features, the KubeEdge architecture will attempt to utilize existing CNCF projects/software.

The KubeEdge community needs varied organizations, with their requirements, use cases, and support, to build it. Please join us in making a Kubernetes native edge computing platform that can extend the cloud native computing paradigm to the edge cloud.

How to Get Involved?

We welcome more collaboration to build the Kubernetes native edge computing ecosystem. Please join us!

Source

Kubernetes Setup Using Ansible and Vagrant

Objective

This blog post describes the steps required to set up a multi-node Kubernetes cluster for development purposes. This setup provides a production-like cluster that can be set up on your local machine.

Why do we require multi node cluster setup?

Multi-node Kubernetes clusters offer a production-like environment, which has various advantages. Even though Minikube provides an excellent platform for getting started, it doesn’t provide the opportunity to work with multi-node clusters, which can help solve problems or bugs related to application design and architecture. For instance, Ops can reproduce an issue in a multi-node cluster environment, and Testers can deploy multiple versions of an application for executing test cases and verifying changes. These benefits enable teams to resolve issues faster, which makes them more agile.

Why use Vagrant and Ansible?

Vagrant is a tool that will allow us to create a virtual environment easily and it eliminates pitfalls that cause the works-on-my-machine phenomenon. It can be used with multiple providers such as Oracle VirtualBox, VMware, Docker, and so on. It allows us to create a disposable environment by making use of configuration files.

Ansible is an infrastructure automation engine that automates software configuration management. It is agentless and allows us to use SSH keys for connecting to remote machines. Ansible playbooks are written in YAML and offer inventory management in simple text files.

Prerequisites

  • Vagrant should be installed on your machine. Installation binaries can be found here.
  • Oracle VirtualBox can be used as a Vagrant provider or make use of similar providers as described in Vagrant’s official documentation.
  • Ansible should be installed on your machine. Refer to the Ansible installation guide for platform specific installation.

Setup overview

We will be setting up a Kubernetes cluster that will consist of one master and two worker nodes. All the nodes will run Ubuntu Xenial 64-bit OS and Ansible playbooks will be used for provisioning.

Step 1: Creating a Vagrantfile

Use the text editor of your choice and create a file named Vagrantfile, inserting the code below. The value of N denotes the number of nodes present in the cluster; it can be modified accordingly. In the below example, we are setting the value of N as 2.

IMAGE_NAME = "bento/ubuntu-16.04"
N = 2

Vagrant.configure("2") do |config|
    config.ssh.insert_key = false

    config.vm.provider "virtualbox" do |v|
        v.memory = 1024
        v.cpus = 2
    end
      
    config.vm.define "k8s-master" do |master|
        master.vm.box = IMAGE_NAME
        master.vm.network "private_network", ip: "192.168.50.10"
        master.vm.hostname = "k8s-master"
        master.vm.provision "ansible" do |ansible|
            ansible.playbook = "kubernetes-setup/master-playbook.yml"
        end
    end

    (1..N).each do |i|
        config.vm.define "node-#{i}" do |node|
            node.vm.box = IMAGE_NAME
            node.vm.network "private_network", ip: "192.168.50.#{i + 10}"
            node.vm.hostname = "node-#{i}"
            node.vm.provision "ansible" do |ansible|
                ansible.playbook = "kubernetes-setup/node-playbook.yml"
            end
        end
    end
end

Step 2: Create an Ansible playbook for Kubernetes master.

Create a directory named kubernetes-setup in the same directory as the Vagrantfile. Create two files named master-playbook.yml and node-playbook.yml in the directory kubernetes-setup.

In the file master-playbook.yml, add the code below.

Step 2.1: Install Docker and its dependent components.

We will be installing the following packages, and then adding a user named “vagrant” to the “docker” group:

  • docker-ce
  • docker-ce-cli
  • containerd.io

---
- hosts: all
  become: true
  tasks:
  - name: Install packages that allow apt to be used over HTTPS
    apt:
      name: "{{ packages }}"
      state: present
      update_cache: yes
    vars:
      packages:
      - apt-transport-https
      - ca-certificates
      - curl
      - gnupg-agent
      - software-properties-common

  - name: Add an apt signing key for Docker
    apt_key:
      url: https://download.docker.com/linux/ubuntu/gpg
      state: present

  - name: Add apt repository for stable version
    apt_repository:
      repo: deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable
      state: present

  - name: Install docker and its dependencies
    apt: 
      name: "{{ packages }}"
      state: present
      update_cache: yes
    vars:
      packages:
      - docker-ce 
      - docker-ce-cli 
      - containerd.io
    notify:
      - docker status

  - name: Add vagrant user to docker group
    user:
      name: vagrant
      group: docker

Step 2.2: Kubelet will not start if the system has swap enabled, so we are disabling swap using the below code.

  - name: Remove swapfile from /etc/fstab
    mount:
      name: "{{ item }}"
      fstype: swap
      state: absent
    with_items:
      - swap
      - none

  - name: Disable swap
    command: swapoff -a
    when: ansible_swaptotal_mb > 0

Step 2.3: Installing kubelet, kubeadm and kubectl using the below code.

  - name: Add an apt signing key for Kubernetes
    apt_key:
      url: https://packages.cloud.google.com/apt/doc/apt-key.gpg
      state: present

  - name: Adding apt repository for Kubernetes
    apt_repository:
      repo: deb https://apt.kubernetes.io/ kubernetes-xenial main
      state: present
      filename: kubernetes.list

  - name: Install Kubernetes binaries
    apt: 
      name: "{{ packages }}"
      state: present
      update_cache: yes
    vars:
      packages:
        - kubelet 
        - kubeadm 
        - kubectl

Step 2.4: Initialize the Kubernetes cluster with kubeadm using the below code (applicable only on the master node).

  - name: Initialize the Kubernetes cluster using kubeadm
    command: kubeadm init --apiserver-advertise-address="192.168.50.10" --apiserver-cert-extra-sans="192.168.50.10"  --node-name k8s-master --pod-network-cidr=192.168.0.0/16

Step 2.5: Set up the kube config file for the vagrant user to access the Kubernetes cluster using the below code.

  - name: Setup kubeconfig for vagrant user
    command: "{{ item }}"
    with_items:
     - mkdir -p /home/vagrant/.kube
     - cp -i /etc/kubernetes/admin.conf /home/vagrant/.kube/config
     - chown vagrant:vagrant /home/vagrant/.kube/config

Step 2.6: Set up the container networking provider and the network policy engine using the below code.

  - name: Install calico pod network
    become: false
    command: kubectl create -f https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/calico.yaml

Step 2.7: Generate the kube join command for joining nodes to the Kubernetes cluster and store the command in the file named join-command.

  - name: Generate join command
    command: kubeadm token create --print-join-command
    register: join_command

  - name: Copy join command to local file
    local_action: copy content="{{ join_command.stdout_lines[0] }}" dest="./join-command"

Step 2.8: Set up a handler for checking the Docker daemon using the below code.

  handlers:
    - name: docker status
      service: name=docker state=started

Step 3: Create the Ansible playbook for Kubernetes node.

Create a file named node-playbook.yml in the directory kubernetes-setup.

Add the code below into node-playbook.yml

Step 3.1: Start adding the code from Steps 2.1 till 2.3.

Step 3.2: Join the nodes to the Kubernetes cluster using the below code.

  - name: Copy the join command to server location
    copy: src=join-command dest=/tmp/join-command.sh mode=0777

  - name: Join the node to cluster
    command: sh /tmp/join-command.sh

Step 3.3: Add the code from step 2.8 to finish this playbook.

Step 4: Upon completing the Vagrantfile and playbooks, follow the below steps.

$ cd /path/to/Vagrantfile
$ vagrant up

Upon completion of all the above steps, the Kubernetes cluster should be up and running. We can log in to the master or worker nodes using Vagrant as follows:

$ ## Accessing master
$ vagrant ssh k8s-master
vagrant@k8s-master:~$ kubectl get nodes
NAME         STATUS   ROLES    AGE     VERSION
k8s-master   Ready    master   18m     v1.13.3
node-1       Ready    <none>   12m     v1.13.3
node-2       Ready    <none>   6m22s   v1.13.3

$ ## Accessing nodes
$ vagrant ssh node-1
$ vagrant ssh node-2

Source

What Is Jenkins X and How It Differs from Jenkins and CloudBees Core

An End to the Confusion: Jenkins or Jenkins X

Jenkins has served as a continuous integration (CI) tool since long before the emergence of Kubernetes and distributed systems running on cloud native platforms. Working with Jenkins as a stand-alone open source tool — as developers and operations folks will be the first to say — can also be extremely difficult.

Now, more recently, the shift to cloud native and Kubernetes poses even more Jenkins management-specific challenges for organizations. As a result, Jenkins X has emerged as a way to both improve and automate continuous delivery pipelines to Kubernetes and cloud native environments. For many, however, there is a concern about compatibility between Jenkins and Jenkins X pipelines. And the role of each is also a worry expressed in forum posts and comments on Reddit and other outlets.

Indeed, understanding the role of each, the changing landscape of Jenkins, as well as the role CloudBees — the commercial distribution of Jenkins — plays, as enterprises continue to work with pipelines for on-premise, cloud native or a combination of both types of deployments, has also served as a source of confusion.

The short answer is that Jenkins X is indeed geared for Kubernetes deployments — but that does not mean it necessarily needs to replace existing Jenkins configurations, at least in the immediate term. Nor does that mean Jenkins X is unable to manage Jenkins pipelines for on-premise or other non-cloud production pipelines. Meanwhile, CloudBees is applicable to both. Herein lies the source of difficulty in understanding the roles each can play.

“It’s quite a leap for the people to understand — basically, we see lots of confusion. Plus a lot of things are changing and we see [even more] confusion,” James Strachan, senior architect at CloudBees, the project lead on Jenkins X, said. “And inside CloudBees and with our customers there’s a lot of misunderstanding out there. I hope we can make this a little bit clearer. I think it’s supposed to be shown, for example, you can use Jenkins X to orchestrate Jenkins.”

In many ways, confusion exists “almost every time you rename well-known software projects,” Torsten Volk, an analyst for Enterprise Management Associates (EMA), said. “Remember the outcry when Docker renamed the community version of its software to Moby? Here the issue was that CloudBees first came up with the Jenkins Enterprise and CloudBees Suite product names, where Jenkins Enterprise already started supporting containers,” Volk said. “Then wrapping all this mess into the CloudBees Core offering was always going to be hard, but had to be done.”

The idea now is to blend the classic Jenkins world and the Jenkins X world into one experience “so that it really doesn’t matter” if an organization, for example, were to combine Jenkins servers for serverless and automated pipelines and, for example, on-premise deployments with a single user interface (UI) and command line interface (CLI),  Strachan said. “Both of these use cases are solved since some teams are using both, right?” Strachan said. “Some teams are building new microservices but they still have that classic VM.”

Strachan did not come out and describe Jenkins X as “Jenkins 2.0.” However, the hope is that Jenkins X, following its relatively recent release, will eventually cover the bases of all Jenkins pipelines, both classic and for Kubernetes. “I think, we’ve not really gotten the message out to the community quite well enough: Jenkins X is really how everyone will use Jenkins at some point,” Strachan said.

The Use Cases

On the surface at least, arguments to adopt Jenkins X for production pipelines for Kubernetes makes obvious sense. As the Jenkins X authors define it:

“Jenkins X is a CI/CD [continuous delivery] solution for modern cloud applications on Kubernetes … Jenkins X provides pipeline automation, built-in GitOps and preview environments to help teams collaborate and accelerate their software delivery at any scale.”

For those starting a new greenfield project “and you have no CI, you just go straight to Jenkins X and that’s easy — you just automate all that stuff and you’re done,” Strachan said.

But outside of the usage sphere of cloud native-only pipelines, things can get murky — at least from the outset. It is still not readily apparent in the minds of many, for example, that Jenkins X is also designed, as mentioned before, to serve those organizations with on-premise production pipeline-only needs — whether they eventually decide to deploy to cloud native platforms or not.

Most of the organizations Strachan said he has spoken with already have Jenkins set up for pipelines and want to extend them into Jenkins X. “You could just go and say “hey, let’s just try Jenkins X to automate all of our pipelines,” Strachan said. “And it might be better for you… But really long term, we want to get rid of those Jenkins files you wrote by hand because we think our pipelines can do better.”

During the months that have followed the final production release of Jenkins X, the tools are seen as doing a good job of merging GitOps concepts and leveraging Kubernetes as the orchestration and target platform, Ravi Lachhman, a technical evangelist for AppDynamics, said. “By leveraging Kubernetes as the platform to run Jenkins X on, the CI/CD solution can be much more robust in resource placement e.g. worker nodes and availability.”

At the end of the day, “no matter what the open source platform is, each organization is different when considering build vs buy,” Lachhman said. “CloudBees provides a lot of expertise in their domain/stack and offering capabilities in scalability, security and support. For teams that are versed in GitOps, Jenkins X is a modern prescription for running a CI/CD stack on Kubernetes,”  Lachhman said. “The Jenkins family continues to evolve as the backbone of many DevOps pipelines and continues to evolve with the push into cloud native architecture.”

“CloudBees is a commercial open source software company, which writes the vast majority of Jenkins code. These sorts of structures are by no means uncommon in commercial open source,” James Governor, an analyst for and co-founder of Redmonk, said. “Jenkins is a more well-known brand than CloudBees, but again that is very common in these situations, where open source distribution gets far ahead of commercial customer acquisition.”

Strategy Revealed

CloudBees has drawn $112 million in capital since 2010 and therefore needs to reach terminal velocity sometime soon, Volk said. “They managed to sign up hundreds of paying customers for their various Jenkins support offerings, but at the same time they need to pay great talent like James Strachan who is one of the main contributors across the Jenkins and Jenkins X projects,” Volk said. “And, of course they need to support lots and lots of Jenkins integrations with every piece of infrastructure and middleware under the sun.”

The Jenkins X project was also necessary to avoid falling behind competitors such as XebiaLabs, Atlassian and Electric Cloud in terms of Kubernetes support, Volk said.

“That’s why they decided to roll Jenkins, Jenkins X, Codeship, and their DevOps analytics product into one DevOps management platform that spans the whole enterprise,” Volk said. “What they now need to do is evangelize the heck out of CloudBees Core and explaining their involvement in Jenkins and Jenkins X — this is difficult.”

Source

Announcing Submariner, Multi-Cluster Network Connectivity for Kubernetes

Today we are proud to announce Submariner, a new open-source project enabling network connectivity between Kubernetes clusters. We launched the project to provide network connectivity for microservices deployed in multiple Kubernetes clusters that need to communicate with each other. This new solution overcomes barriers to connectivity between Kubernetes clusters and allows for a host of new multi-cluster implementations, such as database replication within Kubernetes across geographic regions and deploying service mesh across clusters.

Organizations are looking to Kubernetes as the standard computing platform across all public and private cloud infrastructure. Submariner allows these organizations to seamlessly connect, scale, and migrate workloads across Kubernetes clusters deployed on any cloud.

Network Connectivity Across Clusters with Submariner

Historically, Kubernetes deployments implement network virtualization, enabling containers running on multiple nodes within the same cluster to communicate with each other. However, containers running in different Kubernetes clusters must communicate with each other through ingress controllers or node ports. Submariner now creates the necessary tunnels and routes needed to enable containers in different Kubernetes clusters to connect directly. Key features of Submariner include:

  • Compatibility and connectivity with existing clusters: Users can deploy Submariner into existing Kubernetes clusters, with the addition of Layer-3 network connectivity between pods in different clusters.
  • Secure paths: Encrypted network connectivity is implemented using IPSec tunnels.
  • Various connectivity mechanisms: While IPsec is the default connectivity mechanism out of the box, Rancher will enable different interconnectivity plugins in the near future.
  • Centralized broker: Users can register and maintain a set of healthy gateway nodes.
  • Flexible service discovery: Submariner provides service discovery across multiple Kubernetes clusters.
  • CNI compatibility: Works with popular CNI drivers such as Flannel and Calico.

Developers who are interested in downloading, installing and playing with this new networking solution should visit https://submariner.io or follow the project on https://github.com/rancher/submariner. Enterprises who need assistance in deploying and managing Submariner can contact info@rancher.com.

Source

Considerations When Designing Distributed Systems

Today’s applications are marvels of distributed systems development. Each function or service that makes up an application may be executing on a different system, based upon a different system architecture, housed in a different geographical location, and written in a different computer language. Components of today’s applications might be hosted on a powerful system carried in the owner’s pocket and communicating with application components or services that are replicated in data centers all over the world.

What’s amazing about this is that individuals using these applications typically are not aware of the complex environment that responds to their request for the local time, the local weather, or directions to their hotel.

Let’s pull back the curtain and look at the industrial sorcery that makes this all possible and contemplate the thoughts and guidelines developers should keep in mind when working with this complexity.

The Evolution of System Design

Figure 1: Evolution of system design over time

Source: Interaction Design Foundation, The Social Design of Technical Systems: Building technologies for communities

Application development has come a long way from the time that programmers wrote out applications, hand compiled them into the language of the machine they were using, and then entered individual machine instructions and data directly into the computer’s memory using toggle switches.

As processors became more and more powerful, system memory and online storage capacity increased, and computer networking capability dramatically increased, approaches to development also changed. Data can now be transmitted from one side of the planet to the other faster than it used to be possible for early machines to move data from system memory into the processor itself!

Let’s look at a few highlights of this amazing transformation.

Monolithic Design

Early computer programs were based upon a monolithic design, with all of the application components architected to execute on a single machine. This meant that functions such as the user interface (if users were actually able to interact with the program), application rules processing, data management, storage management, and network management (if the computer was connected to a computer network) were all contained within the program.

While simpler to write, these programs became increasingly complex, difficult to document, and hard to update or change. At this time, the machines themselves represented the biggest cost to the enterprise, and so applications were designed to make the best possible use of the machines.

Client/Server Architecture

As processors became more powerful, system and online storage capacity increased, and data communications became faster and more cost-efficient, application design evolved to keep pace. Application logic was refactored or decomposed into components, allowing each to execute on a different machine, with the ever-improving networking inserted between the components. This allowed some functions to migrate to the lowest-cost computing environment available at the time. The evolution flowed through the following stages:

Terminals and Terminal Emulation

Early distributed computing relied on special-purpose user access devices called terminals. Applications had to understand the communications protocols they used and issue commands directly to the devices. When inexpensive personal computing (PC) devices emerged, the terminals were replaced by PCs running a terminal emulation program.

At this point, all of the components of the application were still hosted on a single mainframe or minicomputer.

Light Client

As PCs became more powerful and supported larger internal and online storage, and as network performance increased, enterprises segmented or factored their applications so that the user interface was extracted and executed on a local PC. The rest of the application continued to execute on a system in the data center.

Often these PCs were less costly than the terminals that they replaced. They also offered additional benefits. These PCs were multi-functional devices. They could run office productivity applications that weren’t available on the terminals they replaced. This combination drove enterprises to move to client/server application architectures when they updated or refreshed their applications.

Midrange Client

PC evolution continued at a rapid pace. Once more powerful systems with larger storage capacities were available, enterprises took advantage of them by moving even more processing away from the expensive systems in the data center out to the inexpensive systems on users’ desks. At this point, the user interface and some of the computing tasks were migrated to the local PC.

This allowed the mainframes and minicomputers (now called servers) to have a longer useful life, thus lowering the overall cost of computing for the enterprise.

Heavy client

As PCs became more and more powerful, more application functions were migrated from the backend servers. At this point, everything but data and storage management functions had been migrated.

Enter the Internet and the World Wide Web

The public internet and the World Wide Web emerged at this time. Client/server computing continued to be used. In an attempt to lower overall costs, some enterprises began to re-architect their distributed applications so they could use standard internet protocols to communicate, and substituted a web browser for the custom user interface function. Later, some of the application functions were rewritten in JavaScript so that they could execute locally on the client’s computer.

Server Improvements

Industry innovation wasn’t focused solely on the user side of the communications link. A great deal of improvement was made to the servers as well. Enterprises began to harness together the power of many smaller, less expensive industry standard servers to support some or all of their mainframe-based functions. This allowed them to reduce the number of expensive mainframe systems they deployed.

Soon, remote PCs were communicating with a number of servers, each supporting their own component of the application. Special-purpose database and file servers were adopted into the environment. Later, other application functions were migrated into application servers.

Networking was another area of intense industry focus. Enterprises began using special-purpose networking servers that provided firewalls and other security functions, file caching to accelerate data access for their applications, email, web serving, web application hosting, and distributed name services that kept track of and controlled user credentials for data and application access. The list of networking services that has been encapsulated in an appliance server grows all the time.

Object-Oriented Development

The rapid change in PC and server capabilities, combined with the dramatic price reduction for processing power, memory, and networking, had a significant impact on application development. No longer were hardware and software the biggest IT costs. The largest costs were communications, IT services (the staff), power, and cooling.

Software development, maintenance, and IT operations took on a new importance and the development process was changed to reflect the new reality that systems were cheap and people, communications, and power were increasingly expensive.

Figure 2: Worldwide IT spending forecast

Source: Gartner Worldwide IT Spending Forecast, Q1 2018

Enterprises looked to improved data and application architectures as a way to make the best use of their staff. Object-oriented applications and development approaches were the result. Many programming languages such as the following supported this approach:

  • C++
  • C#
  • COBOL
  • Java
  • PHP
  • Python
  • Ruby

Application developers were forced to adapt by becoming more systematic when defining and documenting data structures. This approach also made maintaining and enhancing applications easier.

Open-Source Software

Opensource.com offers the following definition for open-source software: “Open source software is software with source code that anyone can inspect, modify, and enhance.” It goes on to say that, “some software has source code that only the person, team, or organization who created it — and maintains exclusive control over it — can modify. People call this kind of software ‘proprietary’ or ‘closed source’ software.”

Only the original authors of proprietary software can legally copy, inspect, and alter that software. And in order to use proprietary software, computer users must agree (often by accepting a license displayed the first time they run this software) that they will not do anything with the software that the software’s authors have not expressly permitted. Microsoft Office and Adobe Photoshop are examples of proprietary software.

Although open-source software has been around since the very early days of computing, it came to the forefront in the 1990s when complete open-source operating systems, virtualization technology, development tools, database engines, and other important functions became available. Open-source technology is often a critical component of web-based and distributed computing. Among others, the open-source offerings in the following categories are popular today:

  • Development tools
  • Application support
  • Databases (flat file, SQL, No-SQL, and in-memory)
  • Distributed file systems
  • Message passing/queueing
  • Operating systems
  • Clustering

Distributed Computing

The combination of powerful systems, fast networks, and the availability of sophisticated software has driven major application development away from monolithic towards more highly distributed approaches. Enterprises have learned, however, that sometimes it is better to start over than to try to refactor or decompose an older application.

When enterprises undertake the effort to create distributed applications, they often discover a few pleasant side effects. A properly designed application, that has been decomposed into separate functions or services, can be developed by separate teams in parallel.

Rapid application development and deployment, also known as DevOps, emerged as a way to take advantage of the new environment.

Service-Oriented Architectures

As the industry evolved beyond client/server computing models to an even more distributed approach, the phrase “service-oriented architecture” emerged. This approach was built on distributed systems concepts, standards in message queuing and delivery, and XML messaging as a standard approach to sharing data and data definitions.

Individual application functions are repackaged as network-oriented services: a service receives a message requesting that it perform a specific task, performs that task, and then sends the response back to the function that requested it.

This approach offers another benefit: a given service can be hosted in multiple places around the network, improving both overall performance and reliability.

Workload management tools were developed that receive requests for a service, review the available capacity, forward the request to the service with the most available capacity, and then send the response back to the requester. If a specific service doesn’t respond in a timely fashion, the workload manager simply forwards the request to another instance of the service. It would also mark the service that didn’t respond as failed and wouldn’t send additional requests to it until it received a message indicating that it was still alive and healthy.
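
As a rough illustration, here is a minimal Python sketch of the dispatch-and-failover behaviour described above. The service endpoints, the registry structure, and the call_service stand-in are all hypothetical; a real workload manager would make actual network calls, track capacity, and run health checks to bring failed instances back into rotation.

```python
import random
from typing import Optional

# Hypothetical registry of service instances; a real system would track capacity too.
services = [
    {"endpoint": "http://svc-a.internal/orders", "healthy": True},
    {"endpoint": "http://svc-b.internal/orders", "healthy": True},
]

def call_service(endpoint: str, request: dict) -> Optional[dict]:
    """Stand-in for a real network call; returns None to simulate a timeout."""
    return {"status": "ok", "handled_by": endpoint} if random.random() > 0.2 else None

def dispatch(request: dict) -> dict:
    """Forward the request to a healthy instance, failing over when one doesn't respond."""
    for service in services:
        if not service["healthy"]:
            continue
        response = call_service(service["endpoint"], request)
        if response is not None:
            return response
        # No timely response: mark the instance as failed and try the next one.
        service["healthy"] = False
        print(f"marking {service['endpoint']} as failed")
    raise RuntimeError("no healthy service instance available")

print(dispatch({"order_id": 42}))
```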

What Are the Considerations for Distributed Systems?

Now that we’ve walked through over 50 years of computing history, let’s consider some rules of thumb for developers of distributed systems. There’s a lot to think about, because a distributed solution is likely to have components or services executing in many places and on different types of systems, and messages must be passed back and forth to perform work. Care and consideration are absolute requirements for success in creating these solutions. Expertise must also be available for each type of host system, development tool, and messaging system in use.

Nailing Down What Needs to Be Done

One of the first things to consider is what needs to be accomplished! While this sounds simple, it’s incredibly important.

It’s amazing how many developers start building things before they know, in detail, what is needed. Often, this means that they build unnecessary functions and waste their time. To quote Yogi Berra, “If you don’t know where you are going, you’ll end up someplace else.”

A good place to start is knowing what needs to be done, what tools and services are already available, and what people using the final solution should see.

Interactive Versus Batch

Since fast responses and low latency are often requirements, it would be wise to consider what should be done while the user is waiting and what can be put into a batch process that executes on an event-driven or time-driven schedule.

After the initial segmentation of functions has been considered, it is wise to plan when background batch processes need to execute, what data these functions manipulate, how to make sure these functions are reliable and available when needed, and how to prevent the loss of data.
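
A minimal sketch of this segmentation, assuming a simple in-process queue, might look like the following: the interactive path answers the user immediately while a time-driven batch worker drains the deferred work on a schedule. The function names and the 60-second interval are illustrative only.

```python
import queue
import threading
import time

# Work deferred by the interactive path, drained later by a time-driven batch process.
work_queue: "queue.Queue[dict]" = queue.Queue()

def handle_user_request(order: dict) -> str:
    work_queue.put(order)        # defer the slow part
    return "Order received"      # fast response while the user is waiting

def batch_worker(interval_seconds: int) -> None:
    """Time-driven batch process: wakes on a fixed schedule and drains the queue."""
    while True:
        time.sleep(interval_seconds)
        while not work_queue.empty():
            order = work_queue.get()
            # ... generate invoices, update reports, archive data, etc. ...
            work_queue.task_done()

# Daemon thread: runs in the background for the life of the process.
threading.Thread(target=batch_worker, args=(60,), daemon=True).start()
print(handle_user_request({"order_id": 7}))
```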

Where Should Functions Be Hosted?

Only after the “what” has been planned in fine detail should the “where” and “how” be considered. Developers have their favorite tools and approaches and often will invoke them even if they might not be the best choice. As Bernard Baruch was reported to say, “If all you have is a hammer, everything looks like a nail.”

It is also important to be aware of corporate standards for enterprise development. It isn’t wise to select a tool simply because it is popular at the moment. That tool just might do the job, but remember that everything that is built must be maintained. If you build something that only you can understand or maintain, you may just have tied yourself to that function for the rest of your career. I have personally created functions that worked properly and were small and reliable. I received telephone calls regarding these for ten years after I left that company because later developers could not understand how the functions were implemented. The documentation I wrote had been lost long earlier.

Each function or service should be considered separately in a distributed solution. Should the function be executed in an enterprise data center, in the data center of a cloud services provider, or perhaps in both? Consider that there are regulatory requirements in some industries that direct the selection of where and how data must be maintained and stored.

Other considerations include:

  • What type of system should host that function? Is one system architecture better for that function? Should the system be based upon ARM, x86, SPARC, Precision, Power, or even be a mainframe?
  • Does a specific operating system provide a better computing environment for this function? Would Linux, Windows, UNIX, System i, or even System z be a better platform?
  • Is a specific development language better for that function? Is a specific type of data management tool? Is a flat file, SQL database, NoSQL database, or a non-structured storage mechanism better?
  • Should the function be hosted in a virtual machine or a container to facilitate function mobility, automation, and orchestration?

Virtual machines executing Windows or Linux were frequently the choice in the early 2000s. While they offered significant isolation for functions and made it easy to restart or move them when necessary, their processing, memory, and storage requirements were rather high. Containers, another approach to processing virtualization, are the emerging choice today because they offer similar levels of isolation and the ability to restart and migrate functions while consuming far less processing power, memory, or storage.

Performance

Performance is another critical consideration. While defining the functions or services that make up a solution, developers should be aware of which ones have significant processing, memory, or storage requirements. It might be wise to look at these functions closely to learn whether they can be further subdivided or decomposed.

Further segmentation would allow an increase in parallelization, which could offer performance improvements. The trade-off, of course, is that this approach also increases complexity and, potentially, makes functions harder to manage and to secure.
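
For example, once a heavy step has been decomposed into independent pieces, it can be spread across processor cores. The sketch below uses Python's standard concurrent.futures module; the score_record function and the record data are hypothetical placeholders for whatever CPU-heavy work the decomposed function performs.

```python
from concurrent.futures import ProcessPoolExecutor

def score_record(record: dict) -> float:
    """Hypothetical CPU-heavy step split out of a larger, monolithic function."""
    return sum(ord(c) for c in record["payload"]) / 1000.0

if __name__ == "__main__":
    records = [{"payload": f"record-{i}"} for i in range(10_000)]

    # Once the work is decomposed into independent pieces, it can run in parallel.
    with ProcessPoolExecutor() as pool:
        scores = list(pool.map(score_record, records, chunksize=500))

    print(len(scores), max(scores))
```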

Reliability

In high-stakes enterprise environments, solution reliability is essential. The developer must consider when it is acceptable to force people to re-enter data or re-run a function, and when a function can be unavailable.

Database developers ran into this issue in the 1960s and developed the concept of an atomic function. That is, the function must complete or the partial updates must be rolled back leaving the data in the state it was in before the function began. This same mindset must be applied to distributed systems to ensure that data integrity is maintained even in the event of service failures and transaction disruptions.

Functions must be designed to totally complete or roll back intermediate updates. In critical message passing systems, messages must be stored until an acknowledgement that a message has been received comes in. If such a message isn’t received, the original message must be resent and a failure must be reported to the management system.
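
The sketch below illustrates the "complete or roll back" rule using Python's built-in sqlite3 module, whose connection context manager commits a transaction on success and rolls it back when an exception is raised. The accounts table and the transfer function are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100.0), ("B", 50.0)])
conn.commit()

def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: float) -> None:
    """Either both updates complete, or neither does."""
    # The connection context manager commits on success and rolls back on exception,
    # leaving the data in the state it was in before the function began.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        (balance,) = conn.execute("SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds: rolling back partial update")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))

transfer(conn, "A", "B", 30.0)
print(conn.execute("SELECT * FROM accounts ORDER BY id").fetchall())  # [('A', 70.0), ('B', 80.0)]
```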

Manageability

Although not as much fun to consider as the core application functionality, manageability is a key factor in the ongoing success of the application. All distributed functions must be fully instrumented so that administrators can both understand the current state of each function and change function parameters if needed. Distributed systems, after all, are constructed of many more moving parts than the monolithic systems they replace, so developers must be constantly aware of making this distributed computing environment easy to use and maintain.
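
As a small illustration of such instrumentation, the following sketch uses Python's standard http.server to expose a function's internal state on a /healthz endpoint that an administrator or monitoring system could poll. The state fields and the port are hypothetical.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative internal state a distributed function might expose to administrators.
state = {"started_at": time.time(), "requests_handled": 0, "queue_depth": 0}

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            body = json.dumps(
                {"status": "ok", "uptime_s": round(time.time() - state["started_at"], 1), **state}
            ).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # A monitoring system or administrator can now poll http://<host>:8080/healthz
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```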

Security

Distributed system security is an order of magnitude more difficult than security in a monolithic environment. Each function must be made secure separately, and the communication links between and among the functions must also be secured. As the network grows in size and complexity, developers must consider how to control access to functions, how to make sure that only authorized users can access them, and how to isolate services from one another.

Security is a critical element that must be built into every function, not added on later. Unauthorized access to functions and data must be prevented and reported.
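
One small building block for securing calls between functions is signing each request so the receiving service can verify who sent it and that it was not altered in transit. The sketch below shows this idea with Python's standard hmac module; the shared secret and payload are placeholders, and a production system would also need key management, TLS on the links, and proper authorization checks.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-shared-secret"  # hypothetical; manage keys properly

def sign(message: bytes) -> str:
    """The calling function attaches this signature to its request."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()

def verify(message: bytes, signature: str) -> bool:
    """The receiving function rejects anything that fails this check."""
    # compare_digest is constant-time, which avoids leaking information via timing.
    return hmac.compare_digest(sign(message), signature)

payload = b'{"action": "read", "resource": "orders"}'
token = sign(payload)
print(verify(payload, token))                                        # True
print(verify(b'{"action": "delete", "resource": "orders"}', token))  # False: tampered request
```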

Privacy

Privacy is the subject of an increasing number of regulations around the world. Examples like the European Union’s GDPR and the U.S. HIPAA regulations are important considerations for any developer of customer-facing systems.

Mastering Complexity

Developers must take the time to consider how all of the pieces of a complex computing environment fit together. It is hard to maintain the discipline that a service should encapsulate a single function or, perhaps, a small number of tightly interrelated functions. If a given function is implemented in multiple places, maintaining and updating that function can be hard. What would happen when one instance of a function doesn’t get updated? Finding that error can be very challenging.

This means it is wise for developers of complex applications to maintain a visual model that shows where each function lives so it can be updated if regulations or business requirements change.

Often this means that developers must take the time to document what they did, when changes were made, as well as what the changes were meant to accomplish so that other developers aren’t forced to decipher mounds of text to learn where a function is or how it works.

To be successful as an architect of distributed systems, a developer must be able to master complexity.

Approaches Developers Must Master

Developers must master decomposing and refactoring application architectures, thinking in terms of teams, and growing their skill in approaches to rapid application development and deployment (DevOps). After all, they must be able to think systematically about which functions are independent of one another and which functions rely on the output of other functions to work. Functions that rely upon one another may be best implemented as a single service. Implementing them as independent functions might create unnecessary complexity, result in poor application performance, and impose an unnecessary burden on the network.

Virtualization Technology Covers Many Bases

Virtualization is a far bigger category than just virtual machine software or containers. Both of these functions are considered processing virtualization technology. There are at least seven different types of virtualization technology in use in modern applications today. Virtualization technology is available to enhance how users access applications, where and how applications execute, where and how processing happens, how networking functions, where and how data is stored, how security is implemented, and how management functions are accomplished. The following model of virtualization technology might be helpful to developers when they are trying to get their arms around the concept of virtualization:

Figure 3: Architecture of virtualized systems

Source: 7 Layer Virtualization Model, VirtualizationReview.com

Think of Software-Defined Solutions

It is also important for developers to think in terms of “software defined” solutions. That is, to segment the control from the actual processing so that functions can be automated and orchestrated.

Developers shouldn’t feel like they are on their own when wading into this complex world. Suppliers and open-source communities offer a number of powerful tools. Various forms of virtualization technology can be a developer’s best friend.

Virtualization Technology Can Be Your Best Friend

  • Containers make it possible to easily develop functions that can execute without interfering with one another and can be migrated from system to system based upon workload demands.
  • Orchestration technology makes it possible to control many functions to ensure they are performing well and are reliable. It can also restart or move them in a failure scenario.
  • Incremental development is supported: functions can be developed in parallel and deployed as they are ready. They can also be updated with new features without requiring changes elsewhere.
  • Highly distributed systems are supported: functions can be deployed locally in the enterprise data center or remotely in the data center of a cloud services provider.

Think In Terms of Services

This means that developers must think in terms of services and how services can communicate with one another.

Well-Defined APIs

Well-defined APIs mean that multiple teams can work simultaneously and still know that everything will fit together as planned. This typically means a bit more work up front, but it is well worth it in the end. Why? Because overall development can be faster. It also makes documentation easier.
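
As an illustration, a well-defined contract can be as simple as agreed request and response types with a fixed function signature. The sketch below uses Python dataclasses; the invoice-related names are hypothetical, standing in for whatever two teams agree to build against.

```python
from dataclasses import dataclass

# Hypothetical shared contract: once agreed, the ordering team and the billing team
# can build against it in parallel and know the pieces will fit together.

@dataclass(frozen=True)
class CreateInvoiceRequest:
    order_id: str
    amount_cents: int
    currency: str = "USD"

@dataclass(frozen=True)
class CreateInvoiceResponse:
    invoice_id: str
    status: str  # "created" or "rejected"

def create_invoice(request: CreateInvoiceRequest) -> CreateInvoiceResponse:
    """The implementation can change freely as long as this signature holds."""
    if request.amount_cents <= 0:
        return CreateInvoiceResponse(invoice_id="", status="rejected")
    return CreateInvoiceResponse(invoice_id=f"inv-{request.order_id}", status="created")

print(create_invoice(CreateInvoiceRequest(order_id="42", amount_cents=1999)))
```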

Support Rapid Application Development

This approach is also perfect for rapid application development and rapid prototyping, also known as DevOps. Properly executed, DevOps also produces rapid time to deployment.

Think In Terms of Standards

Rather than relying on a single vendor, the developer of distributed systems would be wise to think in terms of multi-vendor, international standards. This approach avoids vendor lock-in and makes finding expertise much easier.

Summary

It’s interesting to note how guidelines for rapid application development and deployment of distributed systems start with “take your time.” It is wise to plan out where you are going and what you are going to do; otherwise, you are likely to end up somewhere else, having burned through your development budget with little to show for it.

Source

Microservices vs. Monolithic Architectures

Enterprises are increasingly pressured by competitors and their own customers to get applications working and online quicker while also minimizing development costs. These divergent goals have forced enterprise IT organizations to evolve rapidly. After undergoing one forced evolution after another since the 1960s, many are prepared to take the step away from monolithic application architectures and embrace the microservices approach.

Figure 1: Architecture differences between traditional monolithic applications and microservices

Higher Expectations and More Empowered Customers

Customers that are used to having worldwide access to products and services now expect enterprises to quickly respond to whatever other suppliers are doing.

CIO magazine, in reporting upon Ovum’s research, pointed out:

“Customers now have the upper hand in the customer journey. With more ways to shop and less time to do it, they don’t just gather information and complete transactions quickly. They often want to get it done on the go, preferably on a mobile device, without having to engage in drawn-out conversations.”

IT Under Pressure

This intense worldwide competition also forces enterprises to find new ways to cut costs or to be more efficient. Developers have seen this all before; it is just the newest iteration of the perennial call to “do more with less” that enterprise IT has faced for more than a decade. They’ve learned that even when IT budgets grow, the investments often go to new IT services or better communications.

Figure 2: Forecast of 2018 worldwide IT spending growth

Source: Gartner Market Databook, 4Q17

As enterprise IT organizations face pressure to respond, they have had to revisit their development processes. The traditional two-year development cycle, previously acceptable, is no longer satisfactory. There is simply no time for that now.

A Confluence of Trends

Enterprise IT has also been forced to respond to a confluence of trends that are divergent and contradictory.

  • The introduction of inexpensive but high-performance network connectivity that allows distributed functions to communicate with one another across the network as fast as processes previously could communicate with one another inside of a single system.
  • The introduction of powerful microprocessors that offer mainframe-class performance in inexpensive and small packages. After standardizing on the X86 microprocessor architecture, enterprises are now being forced to consider other architectures to address their need for higher performance, lower cost, and both lower power consumption and heat production.
  • Internal system memory capacity continues to increase making it possible to deploy large-scale applications or application components in small systems.
  • External storage use is evolving away from the use of rotating media to solid state devices to increase capability, reduce latency, decrease overall cost, and deliver enormous capacity.
  • The evolution of open-source software and distributed computing functions make it possible for the enterprise to inexpensively add a herd of systems when new capabilities are needed rather than facing an expensive and time-consuming forklift upgrade to expand a central host system.
  • Customers demand instant and easy access to applications and data.

As enterprises address these trends, they soon discover that the approach that they had been relying on — focusing on making the best use of expensive systems and networks — needs to change. The most significant costs are now staffing, power, and cooling. This is in addition to the evolution they made nearly two decades ago when their focus shifted from monolithic mainframe computing to distributed, X86-based midrange systems.

The Next Steps in a Continuing Saga

Here’s what enterprise IT has done to respond to all of these trends.

They are choosing to move from using the traditional waterfall development approach to various forms of rapid application development. They also are moving away from compiled languages to interpreted or incrementally compiled languages such as Java, Python, or Ruby to improve developer productivity.

IDC, for example, predicts that:

“By 2021 65% of CIOs will expand agile/DevOps practices into the wider business to achieve the velocity necessary for innovation, execution, and change.”

Complex applications are increasingly designed as independent functions or “services” that can be hosted in several places on the network to improve both performance and application reliability. This approach means that it is possible to address changing business requirements as well as to add new features in one function without having to change anything else in parallel. NetworkWorld’s Andy Patrizio pointed out in his predictions for 2019 that he expects “Microservices and serverless computing take off.”

Another important change is that these services are being hosted in geographically distributed enterprise data centers, in the cloud, or both. Furthermore, functions can now reside in a customer’s pocket or in some combination of cloud-based or corporate systems.

What Does This Mean for You?

Addressing these trends means that enterprise developers and operations staff have to make some serious changes to their traditional approach including the following:

  • Developers must be willing to learn technologies that better fit today’s rapid application development methodology. An experienced “student” can learn quickly through online schools. For example, Learnpython.org offers free courses in Python, while Codecademy offers free courses in Ruby, Java, and other languages.
  • They must also be willing to learn how to decompose application logic from a monolithic, static design to a collection of independent, but cooperating, microservices. Online courses are available for this too. One example of a course designed to help developers learn to “think in microservices” comes from IBM. Other courses are available from Lynda.com.
  • Developers must adopt new tools for creating and maintaining microservices that support quick and reliable communication between them. The use of various commercial and open-source messaging and management tools can help in this process. Rancher Labs, for example, offers open-source software for delivering Kubernetes-as-a-service.
  • Operations professionals need to learn orchestration tools for containers and Kubernetes to understand how they allow teams to quickly develop and improve applications and services without losing control over data and security. Operations staff have long been the gatekeepers of enterprise data centers; after all, they may find their positions on the line if applications slow down or fail.
  • Operations staff must allow these functions to be hosted outside of the data centers they directly control. To make that point, analysts at Market Research Future recently published a report saying that, “the global cloud microservices market was valued at USD 584.4 million in 2017 and is expected to reach USD 2,146.7 million by the end of the forecast period with a CAGR of 25.0%”.
  • Application management and security issues must now be part of developers’ thinking. Once again, online courses are available to help individuals to develop expertise in this area. LinkedIn, for example, offers a course in how to become an IT Security Specialist.

It is important for both IT and operations staff to understand that the world of IT is moving rapidly and everyone must be focused on upgrading their skills and enhancing their expertise.

How Do Microservices Benefit the Enterprise?

This latest move to distributed computing offers a number of real and measurable benefits to the enterprise. Development time and cost can be sharply reduced after the IT organization incorporates this form of distributed computing. Afterwards, each service can be developed in parallel and refined as needed without requiring an entire application to be stopped or redesigned.

The development organization can focus on developer productivity and still bring new application functions or applications online quickly. The operations organization can focus on defining acceptable rules for application execution and allowing the orchestration and management tools to enforce them.

What New Challenges Do Enterprises Face?

Like any approach to IT, the adoption of a microservices architecture will include challenges as well as benefits.

Monitoring and managing many “moving parts” can be more challenging than dealing with a few monolithic applications. The adoption of an enterprise management framework can help address these challenges. Security in this type of distributed computing needs to be top of mind as well. As the number of independent functions grows on the network, each must be analyzed and protected.

Should All Monolithic Applications Migrate to Microservices?

Some monolithic applications can be difficult to change. This may be due to technological challenges or may be due to regulatory constraints. Some components in use today may have come from defunct suppliers, making changes difficult or impossible.

It can be both time consuming and costly for the organization to go through a complete audit process. Often, organizations continue investing in older applications much longer than is appropriate in the belief that they’re saving money.

It is possible to evaluate what a monolithic application does to learn whether some individual functions can be separated and run as smaller, independent services. These can be implemented either as cloud-based services or as container-based microservices.

Rather than waiting and attempting to address older technology as a whole, it may be wise to undertake a series of incremental changes to make enhancing or replacing an established system more acceptable. This is very much like the old proverb, “the best time to plant a tree was 20 years ago. The second best time is now.”

Is the Change Worth It?

Enterprises that have made the move towards the adoption of microservices-based application architectures have commented that their IT costs are often reduced. They also often point out that once their team mastered this approach, it was far easier and quicker to add new features and functions when market demands changed.

Source

A Comparison of VMware and Docker

Servers are expensive. And in single-application installations, most servers spend the majority of their time waiting. Making the most of these expensive assets led to virtualization, and making the most of virtualization has led to multiple options for virtualizing applications.

VMware and Docker offer competing methods for virtualizing applications. Both technologies work to make the most of limited hardware resources, but they do so in significantly different ways. This post will help you understand how they differ and how those differences affect which scenarios each is best suited for. In particular, we’ll take a brief look at how each works, what the differences mean for the application and the deploying team, and how those differences can have an impact on operations, security, and application performance.

This article is aimed at both IT operations and application development leaders who want to expand the options in their deployment toolkit. The information will help those leaders make more informed decisions and explain those decisions to colleagues and executives.

The Limits of Virtualization

VMware is a company with a wide variety of products, from those that virtualize a single application to those that manage entire data centers or clouds. In this article, we use “VMware” to refer to VMware vSphere, which is used to virtualize entire operating systems; many different operating systems, from various Linux distributions to Windows Server, can be virtualized on a single physical server.

VMware is a type-1 hypervisor, meaning it sits between the virtualized operating system and the server hardware; a number of different operating systems can run on a single VMware installation, with OS-specific applications running on each OS instance.

Docker is a system for orchestrating, or managing, application containers. An application container virtualizes an application and the software libraries, services, and operating system components required to run it. All of the Docker containers in a deployment will run on a single operating system because they’ll share commonly used resources from that operating system. Sharing the resources means that the application container is much smaller than the full virtualized operating system created in VMware. That smaller software image in a container can typically be created much more quickly than the VMware operating system image — on the scale of seconds rather than minutes.

The key question for the deployment team is why virtualization is being considered. If the point of the shift is at the operating system level — to provide each user or user population with its own operating environment while requiring as few physical servers as possible — then VMware is the logical choice. If the focus is on the application, with the operating system hidden or irrelevant to the user, then Docker containers become a realistic option for deployment.

The Scale of Reuse

How much of each application do you want to reuse? The methods and scales of resource sharing are different for VMware and Docker containers, as one reuses images of operating systems while the other shares functions and resources from a single image. Those differences can translate to huge storage and memory requirements for applications.

Each time VMware creates an instance of an operating system, it creates a full copy of that operating system. All of the components of the operating system, and any resources used by applications running within the instance, are used only within that particular instance; there is no sharing among running operating systems. This means that the environment within each operating system can be customized extensively, and applications can run without concern about affecting (or being affected by) applications running in other virtual operating systems.

When a Docker container is created, it is a unique instance of the application with all of the libraries and code the application depends on included. While the application code is bundled within the container image, the application relies on — and is managed by — the host system’s kernel. This reduces the resources required to run containers and allows them to start very quickly.

Docker’s speed at creating new instances of an application makes it a solution commonly used in the development environment, where quickly launching, testing, and deleting copies of an application can make for much greater efficiencies. VMware’s ability to author a single “golden copy” of a fully patched and updated operating system and then use that image to create every new instance makes it popular in enterprise production deployments.
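
For a feel of that speed, the following sketch uses the Docker SDK for Python (the docker package) to launch a throwaway container, capture its output, and remove it, assuming a Docker daemon is running locally. Typical start-up is on the order of seconds, which is the behaviour that makes containers so attractive in development workflows.

```python
import time

import docker  # Docker SDK for Python ("pip install docker"); needs a running Docker daemon

client = docker.from_env()

start = time.time()
# Launch a throwaway container, capture its output, and remove it when it exits.
output = client.containers.run("alpine:latest", ["echo", "hello from a container"], remove=True)
print(output.decode().strip())
print(f"container ran in {time.time() - start:.2f}s")
```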

In both VMware and Docker containers, a “master copy” of the original environment is created and used to deploy multiple copies. The question for the operations team is whether the resource efficiency of Docker matches the needs of the application and the user base, or whether those needs require a unique copy of the operating system to be launched and deployed for each instance.

Automation as a Principle

While the processes of creating and tearing down operating system images can be automated, automation is baked into the very heart of Docker. Orchestration, as part of the DevOps toolbox, is a major differentiator for Docker containers versus VMware.

Docker is itself the orchestration mechanism for creating new application instances on demand and then shutting them down when the requirement ends. There are API integrations that allow Docker to be controlled by a number of different automation systems. And for large computing environments that use Docker containers, additional layers of automation and management have been developed. One well-known platform is Kubernetes, which was developed to manage clusters of Docker containers that may be spread across many different servers.
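
As a glimpse of that programmatic control, the following sketch uses the official Kubernetes Python client to list every pod a cluster is currently running, assuming kubectl access via a standard ~/.kube/config. The same client can create, scale, and delete workloads, which is what higher layers of automation build upon.

```python
from kubernetes import client, config  # official Kubernetes Python client ("pip install kubernetes")

config.load_kube_config()  # reads ~/.kube/config; use load_incluster_config() inside a pod
v1 = client.CoreV1Api()

# List every pod the cluster is currently running, across all namespaces.
for pod in v1.list_pod_for_all_namespaces(watch=False).items:
    print(f"{pod.metadata.namespace}/{pod.metadata.name}  {pod.status.phase}")
```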

VMware has a wide variety of automation tools as well, but those tools are, when discussing the vSphere family of products, responsible for creating new instances of operating systems, not applications. This means that the time to create an entirely new operating system image must be considered when planning rapid-response cloud and virtual system application environments. VMware can certainly work to support those environments; it’s used in many commercial operations to do just that. But it requires additional applications or frameworks to automate and orchestrate the process, adding complexity and expense to the solution.

It’s important to note that both Docker containers and VMware can operate quite successfully without automation. When it comes to a commercial installation, though, each becomes much more powerful when the tasks of creating and deleting new operating system and application instances are controlled by software rather than human hands. From rapid response to increased user demand, to large-scale automated application testing, system automation is important. Knowing what’s required for that automation is critical when deciding between technologies.

Separation — or Not

If speed of deployment and execution or limitations on resource usage aren’t critical differentiators for your deployments, then hard separation between applications and instances might be. Just as orchestration is baked into Docker, separation is baked into VMware.

Each instance of an operating system virtualized under VMware is a complete operating system image running on hardware resources that are not shared logically with any other instance of the operating system. VMware partitions the hardware resources in ways that make each operating system instance believe that it’s the only OS running on the server.

This means that, barring a critical hypervisor vulnerability, there is no realistic way for an application running on one virtual server to reach across into another virtual server for data or resources. It also means that things can go awfully, terribly wrong in one virtual server and it can be shut down without endangering the operation of any of the other virtual servers running under VMware.

While proponents of Docker have spoken of similar separation being part of the container system’s architecture, recent vulnerability reports (such as CVE-2019-5736) indicate that Docker’s separation might not be as complete as operational IT specialists would hope.

Separation is not as high a priority for Docker containers as it is for VMware operating system instances. Application containers will share resources, and where there is sharing, there are limits on separation.

Conclusion

There are significant differences between the virtualization and deployment approaches of VMware and Docker, each with its uses. Readers should now have an understanding of the basic nature and capabilities of each platform and of the factors that could make each preferable in a given situation.

Where speed of deployment and most effective use of limited resources are the highest priorities, Docker containers show a great deal of strength. In situations like development groups or the rapid iteration of a fully functioning DevOps environment, containers can be tremendously valuable.

If security and stability are critically important in your production environment, VMware offers both. For both Docker containers and VMware, multiple products are available to extend their functionality through automation, orchestration, and other functions.

You can find more information on deploying Docker in this blog post. The article presents both best practices and hands-on details for putting the platforms in the field, as well as information on how to include each within a DevOps methodology.

Source

A Reflection on the Kubernetes Market

Running a young and growing company in the Kubernetes space means travelling at high speed in an ever-changing market. We are heading into our fourth year of business, and around this time of year I like to step back from the noise and figure out some of the larger trends I’m seeing develop.

I am not a technologist by background, so my thoughts tend to be more commercial in nature. If you’re interested, I wrote a similar post last year.

1.) Kubernetes is Dead, Long Live Kubernetes

Kubernetes continues to sweep the market. In fact, it’s gathering pace as a buzzword. I’m regularly told by tech leaders that if they didn’t use Kubernetes, they wouldn’t find engineers to work with them!

Although this fervour has been a bemusing aspect of the rise of Kubernetes, among the early adopters I recognise a sense that Kubernetes is no longer the ‘thing’. Their attitude is now one of focusing on how they unlock the value of running services on Kubernetes: not for PoCs or test services, but as a kernel for their whole technology platform.

As Kubernetes matures, it will inevitably be commoditised, but crucially it will increase in importance as we rely on it to run entire businesses.

Matt Bates speaking at Google NEXT 2018

2.) We’re Moving up the Stack

Jetstack started as a way for companies to access high quality expertise around Kubernetes and Docker.

Entering the market early has given us a wonderful opportunity to grow with the community. Our company reflects the maturity of the ecosystem, just as an open source project strengthens with users and contributions.

At last, we are able to have conversations about the more holistic benefits of cloud native. Smart teams no longer worry about Kubernetes as a decision they should or shouldn’t take; they can now think about maximising the value their users will get out of it. They are also starting to understand how Kubernetes and related technologies can help them to build a platform that enables them to innovate and compete. In a number of cases, this means entering new markets and creating new product lines.

It’s taken longer than expected, but a standardised ‘stack’ with Kubernetes as the foundation was always the hope. We’re now consistently seeing certain ‘de-facto’ technologies in conversations with customers (e.g. Prometheus, Calico), and others that are being mentioned regularly (e.g. Istio, Spinnaker).

The idea that we may be able to offer a more standardised set of products formed around Kubernetes was even something we were considering back in 2014 when naming the company – the Jet ‘Stack’.

However, the exact mix of products you will need in your company will likely always need refinement based on requirements and developments in an ever-evolving ecosystem. In 2019 we’ll introduce a formalised approach to this discovery, and a reference architecture to help navigate the complexity. Our goal is to accelerate cloud native adoption and success in the same way we do so for Kubernetes.

Kubernetes Operational Wargaming at NEXT 2018

3.) The Trough of Despondency

No matter how good Kubernetes is, confusion still abounds on how to deploy and operate it.

A look at the Cloud Native Landscape shows the difficulties involved in choosing a path, and this plays out with customers at all stages of the journey.

Questions abound:

  • Is EKS mature enough?
  • How do I secure it?
  • Do I consider a GitOps approach?
  • Is Azure a viable platform on which to run K8s?
  • Does Google have the enterprise knowledge to support me?
  • Is bare metal too complex to try?
  • Is consistent multi-cloud operation actually feasible?
  • Should I go kops or is kubeadm worth trying now?

With more vendors entering the market and marketing efforts ramping up, this confusion seems only to be increasing. In some instances, we’ve noticed elements of a backlash to Kubernetes’ complexity of software and ecosystem.

If Kubernetes continues to grow as a buzzword, so will its propensity to be sold as a ‘silver bullet’, and frustration will abound as enterprises realise just how much goes into a production-ready cluster.

Fortunately, amongst business technologists, these developments seem to be taken as just an inevitable part of the wider adoption and maturing of the technology.

Most people I see frustrated by short-term issues, such as simply finding Kubernetes talent, recognise the long-term value that Kubernetes can provide. One thing I often hear is that Kubernetes has the opportunity to match the breadth of success seen by Linux or virtual machines, and that we saw similar adoption patterns with those technologies.

The advice for now? Just keep going.

Jetstack ping pong championships, Dublin 2018

Source