Cloud Native in 2019: A Look Ahead

Cloud Native has been around for a few years now, but 2018 was the year Cloud Native crossed the chasm and went truly mainstream. From the explosion in the number of projects making up the CNCF landscape, to IBM’s $34 billion purchase of Red Hat under its Hybrid Cloud division, to the increasingly wide adoption of Kubernetes, capped off by CloudNativeCon in Seattle with eight thousand attendees, Cloud Native became the major trend in the industry over the past year.

So the question now is: what will happen in 2019?

We have seen strong indications of where the sector is heading, and how it will grow over the next year. The large surge in popularity and adoption of Cloud Native technologies such as Kubernetes has led to many new use cases. New use cases result in new challenges, which translate into new requirements and tools. If I were to pick one main theme for this year, it would be security. Many of the forthcoming tools, features, and products are specifically focused on addressing security concerns.

Here are a few of the trends we expect to see in the year ahead:

Cloud Native meets the Enterprise – We will see continued adoption of Kubernetes and Cloud Native tech by large enterprise companies. This will have a two-way effect. Cloud Native will continue to influence enterprise companies, which will need to adjust their culture and ways of working in order to properly leverage the technologies. For their part, enterprises will have a strong influence on shaping the future of Cloud Native. We predict a continued focus on security features, and the added pressure around security and reliability may well slow the pace and release cycles of some of the major projects.

The Year of Serverless (really) – We predict serverless/FaaS is finally going to live up to the hype this year. It feels like we’ve been talking about serverless for a long time already, but except for some niche use cases it hasn’t really lived up to expectations so far. Until recently, AWS Lambda has been pretty much the only proven, production-scale option. With the rise of new Kubernetes-focused tools such as Knative and promising platforms such as OpenFaaS, serverless becomes much more attractive for companies who don’t run on AWS (or are wary of vendor lock-in).

Not Well Contained – While last year it seemed that we were headed toward ‘everything in a container’, more recently we’ve seen many VM/container hybrid technologies, such as AWS Firecracker, Google’s gVisor and Kata Containers. These technologies are arising mostly for reasons around security and multi-tenancy. They offer stronger isolation while keeping performance similar to containers. We expect to see a move away from the trend of running everything as a Docker container. Thanks to standardization on the Open Container Initiative specifications and the Container Runtime Interface, these new runtimes can be abstracted away from the applications and the developers writing them. We will continue to use Docker (or other OCI-compliant) images for our applications; it is the underlying runtimes that are likely to change.

Managing state – The early use cases of Cloud Native focused heavily on stateless applications. While Kubernetes made persistent storage a first-class citizen from early on, properly migrating stateful workloads to containers and Kubernetes was fraught with risks and new challenges. The ephemeral, dynamic and distributed nature of cloud native systems makes the prospect of running stateful services, specifically databases, on Kubernetes worrying to say the least. This led many to be content migrating their stateless applications, while avoiding moving over stateful services at all costs. More recently, we have seen a large increase in the number of open source tools, products and services for managing state in the cloud native world. Combined with a better understanding by the community, 2019 looks to be the year when we become comfortable running our critical stateful services on Kubernetes.

We’re excited to see what kind of surprises this year brings as well. Will Kubernetes maintain its dominance? Could a new technology come along to take away its steam? Unlikely, though not impossible, as there has been considerable criticism of its complexity, with some more lightweight options popping up. Will unikernels finally see more adoption as we continue to focus on security and resource optimisation? Will the CNCF landscape continue exploding in size, or will we finally see a paring down of tooling and more standardization?

No matter what surprises may yet pop up, we’re confident in saying Cloud Native will continue to be the major trend in IT over the next year…and many more, as CN increasingly influences how everyday enterprises run their technology for years to come.

Source

APIServer dry-run and kubectl diff

Author: Antoine Pelisse (Google Cloud, @apelisse)

Declarative configuration management, also known as configuration-as-code, is
one of the key strengths of Kubernetes. It allows users to commit the desired state of
the cluster, to keep track of the different versions, and to improve auditing and
automation through CI/CD pipelines. The Apply working group
is working on fixing some of the gaps, and is happy to announce that Kubernetes
1.13 promoted server-side dry-run and kubectl diff to beta. These
two features are big improvements for the Kubernetes declarative model.

Challenges

A few pieces are still missing in order to have a seamless declarative
experience with Kubernetes, and we tried to address some of these:

  • While compilers and linters do a good job of detecting errors in pull-requests
    for code, good validation is missing for Kubernetes configuration files.
    The existing solution is to run kubectl apply --dry-run, but this runs a
    local dry-run that doesn’t talk to the server: it doesn’t have server
    validation and doesn’t go through validating admission controllers. As an
    example, custom resource names are only validated on the server, so a local
    dry-run won’t help (see the sketch after this list).
  • It can be difficult to know how your object is going to be applied by the
    server for multiple reasons:

    • Defaulting will set some fields to potentially unexpected values,
    • Mutating webhooks might set fields or clobber/change some values.
    • Patch and merges can have surprising effects and result in unexpected
      objects. For example, it can be hard to know how lists are going to be
      ordered once merged.
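
As a small sketch of these gaps (the manifest name is hypothetical): the existing client-side check never leaves your machine, while the server-side variant described later in this post runs the full defaulting, validation and admission chain without persisting anything.

$ kubectl apply --dry-run -f my-custom-resource.yaml        # local only: server validation and admission webhooks are skipped
$ kubectl apply --server-dry-run -f my-custom-resource.yaml # processed by the apiserver, but never persisted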

The working group has tried to address these problems.

APIServer dry-run

APIServer dry-run was implemented to address these two problems:

  • it allows individual requests to the apiserver to be marked as “dry-run”,
  • the apiserver guarantees that dry-run requests won’t be persisted to storage,
  • the request is still processed as a typical request: the fields are
    defaulted, the object is validated, it goes through the validation admission
    chain, and through the mutating admission chain, and then the final object is
    returned to the user as it normally would be, without being persisted.

While dynamic admission controllers are not supposed to have side-effects on
each request, dry-run requests are only processed if all admission controllers
explicitly announce that they don’t have any dry-run side-effects.

How to enable it

Server-side dry-run is enabled through a feature-gate. Now that the feature is
Beta in 1.13, it should be enabled by default, but still can be enabled/disabled
using kube-apiserver --feature-gates DryRun=true.

If you have dynamic admission controllers, you might have to fix them to:

  • Remove any side-effects when the dry-run parameter is specified on the webhook request,
  • Set the sideEffects
    field of the admissionregistration.k8s.io/v1beta1.Webhook object to indicate that the webhook doesn’t
    have side-effects on dry-run (or at all).

How to use it

You can trigger the feature from kubectl by using kubectl apply
--server-dry-run, which will decorate the request with the dryRun flag
and return the object as it would have been applied, or an error if it would
have failed.

Kubectl diff

APIServer dry-run is convenient because it lets you see how the object would be
processed, but it can be hard to identify exactly what changed if the object is
big. kubectl diff does exactly what you want by showing the differences between
the current “live” object and the new “dry-run” object. It makes it very
convenient to focus on only the changes that are made to the object, how the
server has merged these, and how the mutating webhooks affect the output.

How to use it

kubectl diff is meant to be as similar as possible to kubectl apply:
kubectl diff -f some-resources.yaml will show a diff for the resources in the yaml file. One can even use the diff program of their choice by using the KUBECTL_EXTERNAL_DIFF environment variable, for example:

KUBECTL_EXTERNAL_DIFF=meld kubectl diff -f some-resources.yaml

What’s next

The working group is still busy trying to improve some of these things:

  • Server-side apply is trying to improve the apply scenario, by adding owner
    semantics to fields! It’s also going to improve support for CRDs and unions!
  • Some kubectl apply features are missing from diff and could be useful, like the ability
    to filter by label, or to display pruned resources.
  • Eventually, kubectl diff will use server-side apply!

Source

Using New Relic, Splunk, AppDynamics and Netuitive for Container Monitoring

If you use containers as part of your day-to-day operations, you need to
monitor them — ideally, by using a monitoring solution that you
already have in place, rather than implementing an entirely new tool.
Containers are often deployed quickly and at a high volume, and they
frequently consume and release system resources at a rapid rate. You
need to have some way of measuring container performance, and the impact
that container deployment has on your system. In this article, we’ll
take a look at four widely used monitoring
platforms—Netuitive, New Relic, Splunk, and AppDynamics—that support containers, and compare how they measure up when it comes to monitoring containers.

First, though, a question: When you monitor containers, what kind of
metrics do you expect to see? The answer, as we’ll see below, varies
with the monitoring platform. But in general, container metrics fall
into two categories—those that measure overall container impact on the
system, and those that focus on the performance of individual
containers.

Setting up the Monitoring System

The first step in any kind of software monitoring, of course, is to install the monitoring service. For all of the platforms covered in this article, you can expect additional steps for setting up standard monitoring features. Here we cover only those directly related to container monitoring.

Setup: New Relic

With New Relic, you start by installing New Relic Servers for Linux, which includes integrated Docker monitoring. It should be installed on the Docker server, rather than the Docker container. The Servers for Linux package is available for most common Linux distributions; Docker monitoring, however, requires a 64-bit system. After you install New Relic Servers for Linux, you will need to create a docker group (if it doesn’t exist), then add the newrelic user to that group. You may need to do some basic setup after that, including (depending on the Linux distribution) setting the location of the container data files and enabling memory metrics.

Setup: Netuitive

Netuitive also requires you to install its Linux monitoring agent on the Docker server. You then need to enable Docker metrics collection in the Linux Agent config file, and optionally limit the metrics and/or containers by creating a regex-based blacklist or whitelist. As with New Relic, you may wind up setting a few additional options. Netuitive, however, offers an additional installation method. If you are unable to install the Linux Agent on the Docker server, you can install a Linux Agent Docker container, which will then do the job of monitoring the host and containers.

Setup: Splunk

Splunk takes a very different approach to container monitoring. It uses a Docker API logging driver to send container log data directly to Splunk Enterprise and Splunk Cloud via its HTTP Event Collector. You specify the splunk driver (and its associated options) on the Docker command line. Splunk’s monitoring, in other words, is integrated directly with Docker, rather than with the host system. The Splunk log driver requires an HTTP Event Collector token, and the path (plus port) to the user’s Splunk Cloud/Splunk Enterprise instance. It also takes several optional arguments.

Setup: AppDynamics

AppDynamics uses a Docker Monitoring extension to send Docker Remote API metrics to its Machine Agent. In some ways, this places it in a middle ground between New Relic and Netuitive’s agent-based monitoring and Splunk’s close interaction with Docker. AppDynamics’ extension installation, however, is much more hands-on. The instructions suggest that you can expect to come out of it with engine grease up to your elbows, and perhaps a few scraped knuckles. You must first manually bind Docker to the TCP socket or the Unix socket. After that, you need to install the Machine Agent, followed by the extension. You then need to manually edit several sections of the config file, including the custom dashboard. You must also set executable permissions, and AppDynamics asks you to review both the Docker Remote API and the Docker Monitoring extension’s socket command file. The AppDynamics instructions also include troubleshooting instructions for most of the installation steps.
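
As a rough sketch of the Splunk setup described above (the token, URL and image name are placeholders, and only the required options are shown), the logging driver is passed straight to docker run:

docker run --log-driver=splunk \
  --log-opt splunk-token=<HEC-token> \
  --log-opt splunk-url=https://your-splunk-instance:8088 \
  your-image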

What You Get

As you might expect, there are some significant differences in the metrics which each of these platforms monitors and displays.

Output: New Relic

New Relic displays Docker container metrics as part of its Application Performance Monitoring (APM) Overview page; containers are grouped by host servers when Servers for Linux is installed, and are indicated by a container symbol. The overview page includes drill-down for detailed performance features. The New Relic Servers monitor includes a Docker page, which shows the impact of Docker containers on the server’s performance. It displays server-related metrics for individual Docker images, with drill-down to image details. It does not, however, display data for individual containers.

Output: Netuitive

Netuitive’s Docker monitor collects a variety of metrics, including several related to CPU and network performance, and almost two dozen involving memory. It also collects computed CPU, memory, and throttling percentages. With Netuitive, you build a dashboard by assembling widgets, so the actual data shown (and the manner in which it is displayed) will depend on your choice of widgets and their configuration.

Output: Splunk

Splunk is designed to use data from a wide range of logs and related sources; for containers, it pulls data from individual container logs, from Docker and cloud APIs, and from application logs. Since Splunk integrates such a large amount of data at the cloud and enterprise level, it is up to the user to configure Splunk’s analysis and monitoring tools to display the required data. For containers, Splunk recommends looking at CPU and memory use, downtime/outage-related errors, and specific container and application logs to identify problem containers.

Output: AppDynamics

AppDynamics reports basic container and system statistics (individual container size, container and image count, total memory, memory limit, and swap limit), along with various ongoing network, CPU, and memory statistics. It sends these to the dashboard, where they are displayed in a series of charts.

Which Service Should You Use?

When it comes to the question of which monitoring service is right for
your container deployment, there’s no single answer. For most
container-based operations, including Rancher-managed operations on a
typical Linux distribution, either New Relic or Netuitive should do
quite well. With reasonably similar setup and monitoring features, the
tradeoff is between New Relic’s preconfigured dashboard pages and the
do-it-yourself customizability of Netuitive’s dashboard system. For
enterprise-level operations concerned with integrated monitoring of
performance at all scales, from system-level down to individual
container and application logs, Splunk is the obvious choice. Since
Splunk works directly with the Docker API, it is also likely to be the
best option for use with minimal-feature RancherOS deployments. If, on
the other hand, you simply want to monitor container performance via the
Docker API in a no-frills, basic way, the AppDynamics approach might
work best for you. So there it is: Look at what kind of container
monitoring you need, and take your pick.

Source

Get started with OpenFaaS and KinD

In this post I want to show you how I’ve started deploying OpenFaaS with the new tool from the Kubernetes community named Kubernetes in Docker or KinD. You can read my introductory blog post Be KinD to Yourself here.

The mission of OpenFaaS is to Make Serverless Functions Simple. It is open-source and built by developers, for developers in the open, with a growing and welcoming community. With OpenFaaS you can run stateless microservices and functions with a single control-plane that focuses on ease of use, on top of Kubernetes. The widely accepted OCI/Docker image format is used to package and deploy your code, which can be run on any cloud.

Over the past two years more than 160 developers have contributed to code, documentation and packaging. A large number of them have also written blog posts and held events all over the world.

Find out more about OpenFaaS on the blog or GitHub openfaas/faas

Pre-reqs

Unlike prior development environments for Kubernetes such as Docker for Mac or minikube, the only requirement for your system is Docker, which means you can install this almost anywhere you can get Docker installed.

This is also a nice experience for developers because it’s the same on MacOS, Linux and Windows.

On a Linux host or Linux VM type in $ curl -sLS https://get.docker.com | sudo sh.

Download Docker Desktop for Windows or Mac.

Create your cluster

Install kubectl

The kubectl command is the main CLI needed to operate Kubernetes.

I like to install it via the binary release here.

Get kind

You can get the latest and greatest by running the following command (if you have Go installed locally)

$ go get sigs.k8s.io/kind

Or if you don’t want to install Golang on your system you can grab a binary from the release page.

Create one or more clusters

Another neat feature of kind is the ability to create one or more named clusters. I find this useful because OpenFaaS ships plain YAML files and a helm chart and I need to test both independently on a clean and fresh cluster. Why try to remove and delete all the objects you created between tests when you can just spin up an entirely fresh cluster in about the same time?

$ kind create cluster --name openfaas

Creating cluster 'kind-openfaas' ...
 ✓ Ensuring node image (kindest/node:v1.12.2) 🖼 
 ✓ [kind-openfaas-control-plane] Creating node container 📦 
 ✓ [kind-openfaas-control-plane] Fixing mounts 🗻 
 ✓ [kind-openfaas-control-plane] Starting systemd 🖥
 ✓ [kind-openfaas-control-plane] Waiting for docker to be ready 🐋 
⠈⡱ [kind-openfaas-control-plane] Starting Kubernetes (this may take a minute) ☸ 

Now there is something you must not forget if you work with other remote clusters. Always, always switch context into your new cluster before making changes.

$ export KUBECONFIG="$(kind get kubeconfig-path --name="openfaas")"
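
A quick sanity check that kubectl is now talking to the new kind cluster:

$ kubectl config current-context
$ kubectl get nodes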

Install OpenFaaS with helm

Install helm and tiller

The easiest way to install OpenFaaS is to use the helm client and its server-side equivalent tiller.

  • Create a ServiceAccount for Tiller:
$ kubectl -n kube-system create sa tiller \
 && kubectl create clusterrolebinding tiller \
      --clusterrole cluster-admin \
      --serviceaccount=kube-system:tiller
  • Install the helm CLI
$ curl -sLSf https://raw.githubusercontent.com/helm/helm/master/scripts/get | sudo bash
  • Install the Tiller server component
helm init --skip-refresh --upgrade --service-account tiller

Note: it may take a minute or two to download tiller into your cluster.

Install the OpenFaaS CLI

$ curl -sLSf https://cli.openfaas.com | sudo sh

Or on Mac use brew install faas-cli.

Install OpenFaaS

You can install OpenFaaS with authentication on or off; it’s up to you. Since your cluster is running locally you may want it turned off. If you decide otherwise then check out the documentation.

  • Create the openfaas and openfaas-fn namespaces:
$ kubectl apply -f https://raw.githubusercontent.com/openfaas/faas-netes/master/namespaces.yml
  • Install using the helm chart:
$ helm repo add openfaas https://openfaas.github.io/faas-netes && \
    helm repo update && \
    helm upgrade openfaas --install openfaas/openfaas \
      --namespace openfaas  \
      --set basic_auth=false \
      --set functionNamespace=openfaas-fn \
      --set operator.create=true

The command above adds the OpenFaaS helm repository, then updates the local helm library and then installs OpenFaaS locally without authentication.

Note: if you see Error: could not find a ready tiller pod then wait a few moments then try again.

You can fine-tune settings such as the timeouts, how many replicas of each service run, which version you are using, and more via the Helm chart readme.

Check OpenFaaS is ready

The helm CLI should print a message such as: To verify that openfaas has started, run:

$ kubectl --namespace=openfaas get deployments -l "release=openfaas, app=openfaas"

The KinD cluster will now download all the core services that make up OpenFaaS and this could take a few minutes if you’re on WiFi, so run the command above and look out for “AVAILABLE” turning to 1 for everything listed.
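
Alternatively, you can block until the rollout finishes; a sketch, assuming the chart’s default gateway deployment name:

$ kubectl --namespace=openfaas rollout status deploy/gateway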

Access OpenFaaS

Now that you’ve set up a cluster and OpenFaaS, it’s time to access the UI and API.

First forward the port of the gateway to your local machine using kubectl.

$ kubectl port-forward svc/gateway -n openfaas 8080:8080

Note: If you already have a service running on port 8080, then change the port binding to 8888:8080 for instance. You should also run export OPENFAAS_URL=http://127.0.0.1:8888 so that the CLI knows where to point to.

You can now use the OpenFaaS CLI and UI.

Open the UI at http://127.0.0.1:8080 and deploy a function from the Function store – a good example is “CertInfo” which can check when a TLS certificate will expire.

Downloading and deploying your chosen image may take a few seconds or minutes depending on your connection.

  • Invoke the function then see its statistics and other information via the CLI:
$ faas-cli list -v
  • Deploy figlet, which can generate ASCII text messages, and try it out.
$ faas-cli store deploy figlet
$ echo Hi! | faas-cli invoke figlet

You can use the describe verb for more information and to find your URL for use with curl or other tools and services.

$ faas-cli describe figlet
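
For instance, functions are exposed by the gateway under the /function/ path, so with the port-forward from earlier still running you can also hit figlet directly with curl (this should match the URL that faas-cli describe reports):

$ curl -d "Hi!" http://127.0.0.1:8080/function/figlet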

Use the OpenFaaS CRD

You can also use the OpenFaaS Custom Resource Definition or CRD by typing in:

$ kubectl get functions -n openfaas-fn

When you create a new function for OpenFaaS you can use the CLI which calls the RESTful API of the OpenFaaS API Gateway, or generate a CRD YAML file instead.

  • Here’s an example with Node.js:

Change the --prefix to your own Docker Hub account or private Docker registry.

$ mkdir -p ~/dev/kind-blog/ && \
  cd ~/dev/kind-blog/ && \
  faas-cli template store pull node10-express && \
  faas-cli new --lang node10-express --prefix=alexellis2 openfaas-loves-crds

Our function looks like this:

$ cat openfaas-loves-crds/handler.js

"use strict"

module.exports = (event, context) => {
    let err;
    const result = {
        status: "You said: " + JSON.stringify(event.body)
    };

    context
        .status(200)
        .succeed(result);
}

Now let’s build and push the Docker image for our function

$ faas-cli up --skip-deploy -f openfaas-loves-crds.yml 

Follow that by generating a CRD file to apply via kubectl instead of deploying through the OpenFaaS CLI.

$ faas-cli generate crd  -f openfaas-loves-crds.yml 
---
apiVersion: openfaas.com/v1alpha2
kind: Function
metadata:
  name: openfaas-loves-crds
  namespace: openfaas-fn
spec:
  name: openfaas-loves-crds
  image: alexellis2/openfaas-loves-crds:latest

You can then pipe this output into a file to store in Git or pipe it directly to kubectl like this:

$ faas-cli generate crd  -f openfaas-loves-crds.yml | kubectl apply -f -
function.openfaas.com "openfaas-loves-crds" created

$ faas-cli list -v
Function                      	Image                                   	Invocations    	Replicas
openfaas-loves-crds           	alexellis2/openfaas-loves-crds:latest   	0              	1    

Wrapping up

KinD is not the only way to deploy Kubernetes locally, or the only way to deploy OpenFaaS, but it’s quick and easy and you could even create a bash script to do everything in one shot.
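
A minimal sketch of such a one-shot script, stitched together from the commands above (it assumes docker, kind, kubectl and helm are already installed, and leaves basic auth off as in this walkthrough):

#!/usr/bin/env bash
set -e

# Create a fresh cluster and point kubectl at it
kind create cluster --name openfaas
export KUBECONFIG="$(kind get kubeconfig-path --name="openfaas")"

# Install tiller with a service account and wait for it to be ready
kubectl -n kube-system create sa tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
helm init --skip-refresh --upgrade --service-account tiller
kubectl -n kube-system rollout status deploy/tiller-deploy

# Install OpenFaaS without authentication
kubectl apply -f https://raw.githubusercontent.com/openfaas/faas-netes/master/namespaces.yml
helm repo add openfaas https://openfaas.github.io/faas-netes
helm repo update
helm upgrade openfaas --install openfaas/openfaas \
  --namespace openfaas \
  --set basic_auth=false \
  --set functionNamespace=openfaas-fn \
  --set operator.create=true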

  • If you’d like to keep learning then check out the official workshop, which has been followed by hundreds of developers around the world already.
  • Join the OpenFaaS Slack if you’d like to chat more or contribute to the project.

You can also read the docs to find out how to deploy to GKE, AKS, DigitalOcean Kubernetes, minikube, Docker Swarm and more.

Source

Hidden Dependencies with Microservices | Rancher Labs

One of the great things about microservices is that they allow engineering to decouple software development from application lifecycle. Every microservice:

  • can be written in its own language, be it Go, Java, or Python
  • can be contained and isolated from others
  • can be scaled horizontally across additional nodes and instances
  • is owned by a single team, rather than being a shared responsibility among many teams
  • communicates with other microservices through an API or a message bus
  • must support a common service level agreement to be consumed by other microservices, and conversely, to consume other microservices

These are all very cool features, and most of them help to decouple various software dependencies from each other. But what about the operations point of view? While the cool aspects of microservices bulleted above are great for development teams, they pose some new challenges for DevOps teams. Namely:

Scalability: Traditionally, monolithic applications and systems scaled vertically, with low dynamism in mind. Now, we need horizontal architectures to support massive dynamism – we need infrastructure as code (IaC). If our application is not a monolith, then our infrastructure cannot be, either.

Orchestration: Containers are incredibly dynamic, but they need resources – CPU, memory, storage (SDS) and networking (SDN) when executed. Operations and DevOps teams need a new piece of software that knows which resources are available to run tasks fairly (if sharing resources with other teams), and efficiently.

System Discovery: To merge dynamism and orchestration, you need a system discovery service. With microservices and containers, one can still use a configuration management database (CMDB), but with massive dynamism in mind. A good system has to be aware of every container deployment change, able to get or receive information from every container (metadata, labels), and provide a method for making this info available.

There are many tools in the ecosystem one can choose. However, the scope of this article is not to do a deep dive into these tools, but to provide an overview of how to reduce dependencies between microservices and your tooling.

Scenario 1: Rancher + Cattle

Consider the following scenario, where a team is using Rancher. Rancher facilitates infrastructure as code, Cattle handles orchestration, and Rancher discovery (metadata, DNS, and API) manages system discovery. Assume that the DevOps team is familiar with this stack, but must begin building functionality for the application to run. Let’s look at the dependency points they’ll need to consider:

  1. The IaC tool shouldn’t affect the development or deployment of microservices. This layer is responsible for providing, booting, and enabling communication between servers (VMs or bare metal). Microservices need servers to run, but it doesn’t matter how those servers were provided. We should be able to change our IaC method without affecting microservices development or deployment paths.
  2. Microservice deployments are dependent on orchestration. The development path for microservices could be the same, but the deployment path is tightly coupled to the orchestration service, due to deployment syntax and format. There’s no easy way to avoid this dependency, but it can be minimized by using different orchestration templates for specific microservice deployments.
  3. Microservice developments could be dependent on system discovery. It depends on the development path.

Points (1) and (2) are relatively clear, but let’s take a closer look at (3). Due to the massive dynamism in microservices architectures, when one microservice is deployed, it must be able to retrieve its configuration easily. It also needs to know where its peers and collaborating microservices are, and how they communicate (which IP, and via which port, etc). The natural conclusion is that for each microservice, you also define logic coupling it with service discovery. But what happens if you decide to use another orchestrator or tool for system discovery? Consider a second system scenario:

Scenario 2: Rancher + Kubernetes + etcd

In the second scenario, the team is still using Rancher to facilitate Infrastructure as Code. However, this team has instead decided to use Kubernetes for orchestration and system discovery (using etcd). The team would have to create Kubernetes deployment files for microservices (Point 2, above), and refactor all the microservices to talk with Kubernetes instead of Rancher metadata (Point 3). The solution is to decouple services from configuration. This is easier said than done, but here is one possible way to do it:

  • Define a container hierarchy
  • Separate containers into two categories: executors and configurators
  • Create generic images based on functionality
  • Create application images from the generic executor images; similarly, create config-apps images from the generic configurator images
  • Define logic for running multiple images in a collaborative/accumulative mode

Here’s an example with specific products to clarify these steps. In the container hierarchies image above, we’ve used:

  • base: alpine. Built from Alpine Linux, with some extra packages: OpenSSL, curl, and bash. This has the smallest OS footprint with a package manager, is based on uClibc, and is ideal for containerized microservices that don’t need glibc.
  • executor: monit. Built from the base above, with Monit installed under /opt/monit. It’s written in C with static dependencies, a small 2 MB footprint, and cool features.
  • configurator: confd. Built from the base above, with confd and useful system tools under /opt/tools. It’s written in Golang with static dependencies, and provides an indirect path for system discovery, due to supporting different backends like Rancher, etcd, and Consul.

The main idea is to keep microservices development decoupled from system discovery; this way, they can run on their own, or complemented by another tool that provides dynamic configuration. If you’d like to test out another tool for system discovery (such as etcd, ZooKeeper, or Consul), then all you’ll have to do is develop a new branch for the configurator tool. You won’t need to develop another version of the microservice. By avoiding reconfiguring the microservice itself, you’ll be able to reuse more code, collaborate more easily, and have a more dynamic system. You’ll also get more control, quality, and speed during both development and deployment. To learn more about the hierarchy, images, and builds I’ve used here, you can access this repo on GitHub. Within are additional examples using Kafka and Zookeeper packages; these service containers can run alone (for dev/test use cases), or with Rancher and Kubernetes for dynamic configuration with Cattle or Kubernetes/etcd.

Zookeeper (repo here, 67 MB)

  • With Cattle (here)
  • With Kubernetes (here)

Kafka (repo here, 115 MB)

Conclusion

Orchestration engines and system discovery services can result in “hidden” microservice dependencies if not carefully decoupled from microservices themselves. This decoupling makes it easier to develop, test and deploy microservices across infrastructure stacks without refactoring, and in turn allows users to build wider, more open systems with better service agreements.

Raul Sanchez Liebana (@rawmindNet) is a DevOps engineer at Rancher.

Source

Codefresh versus Jenkins X – Codefresh

In a previous blog post, we saw how Codefresh compares to Jenkins. The major takeaway from that post is that Codefresh is a solution for both builds and deployments (CI/CD), while Jenkins does not support any deployments on its own (only CI).

Jenkins X has recently been announced, and it has been introduced as a native CI/CD solution for Kubernetes. It adds deployment capabilities to plain Jenkins and also makes entities such as environments and deployments first-class citizens.

In theory, Jenkins X is a step in the right direction especially for organizations that are moving into containers and Kubernetes clusters. It therefore makes sense to see how this new project stands up against Codefresh, the CI/CD solution that was developed with Docker/Helm/Kubernetes support right from its inception.

In practice, Jenkins X has some very strong opinions on the software lifecycle which might not always agree with the processes of your organization.

Jenkins X is still using Jenkins 2.x behind the scenes inheriting all its problems

This is probably the most important point to understand regarding Jenkins X. Jenkins X is NOT a new version of Jenkins, or even a rewrite. It is only a collection of existing services that still include Jenkins at its core. If you deploy Jenkins X on a cluster you can easily look at all the individual components:

Jenkins X components

Jenkins X is just Jenkins plus Chartmuseum, Nexus, Mongo, Monocular, etc. The main new addition is the jx executable which is responsible for installing and managing JX installations.

Fun fact: Chartmuseum is actually a Codefresh project!

Jenkins X uses a Codefresh project

This means that Jenkins X is essentially a superset of plain Jenkins. Of course it adds new deployment abilities to the mix but it also inherits all existing problems. Most points we have mentioned in the original comparison are still true:

  • Plugins and shared libraries are still present
  • Upgrading and configuring the Jenkins server is still an issue
  • Pipelines are still written in Groovy instead of declarative YAML

You can even visit the URL of Jenkins in a Jenkins X installation and see the familiar UI.

Jenkins 2 inside Jenkins X

What is more troubling is that all Jenkins configuration options are still valid. So if you want to change the default Docker registry, for example, you still need to manage it via the Jenkins UI.

Another downside of the inclusion of all these off-the-shelf tools is the high system requirements. Gone are the days where you could just download the Jenkins war on your laptop and try it out. I tried to install Jenkins X on my laptop and failed simply because of lack of resources.

The recommended installation mode is to use a cloud provider with 5 nodes of 20-30 GB of RAM each. I tried to get away with just 3 nodes and was not even able to compile/deploy the official quickstart application.
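
For reference, installation typically goes through the jx executable itself; a sketch (the exact flags depend on your provider and jx version):

$ jx create cluster gke    # create a GKE cluster and install Jenkins X on it
$ jx install               # or install onto an existing cluster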

Jenkins X deploys only on Kubernetes clusters, Codefresh can deploy anywhere

Kubernetes popularity is currently exploding and JenkinsX contains native support for Kubernetes deployments. The problem, however, is that JenkinsX can ONLY deploy on Kubernetes clusters and nothing else.

All new JenkinsX concepts such as environments, previews, and promotions are always targeting a namespace in a Kubernetes cluster. You can still use Jenkins X for compiling and packaging any kind of application, but the continuous delivery part is strictly constrained to Kubernetes clusters.

Codefresh, on the other hand, can deploy everywhere. Even though there is native support for Helm and Kubernetes dashboards, you can use Codefresh to deploy to Virtual machines, Docker Swarm, Bare Metal clusters, FTP sites, and any other deployment target you can imagine.

This means that Codefresh has a clear advantage for organizations that are migrating to Kubernetes while still having legacy applications around, as Codefresh can be used in a gradual way.

Jenkins X requires Helm, in Codefresh Helm is optional

Helm is the package manager for Kubernetes that allows you to group multiple microservices together, manage templates for Kubernetes manifests and also perform easy rollbacks to previous releases.

Codefresh has native support for Helm by offering a private Helm repository and a dedicated Helm dashboard.

Codefresh Helm Dashboard

We love Helm and we believe it is the future of Kubernetes deployments. However, you don’t have to adopt Helm in order to use Codefresh.

In Codefresh the usage of Helm is strictly optional. As mentioned in the previous section you can use Codefresh to deploy anywhere including plain Kubernetes clusters without Helm installed. We have several customers that are using Kubernetes without Helm and some of the facilities we offer such as blue/green and canary deployments are designed with that in mind.

Jenkins X, on the other hand, REQUIRES the adoption of Helm. All deployments happen via Helm as the only option.

Helm is also used to represent complete environments (the Helm umbrella pattern). The GIT repositories that back each environment are based on Helm charts.

We love the fact that Jenkins X has adopted Helm, but making it the only deployment option is a very aggressive stance for organizations that want to deploy legacy applications as well.

Representing an environment with a Helm umbrella chart is a good practice but in Codefresh this is just one of the many ways that you can do deployments.

Jenkins X enforces trunk based development, Codefresh allows any git workflow

From the previous sections, it should become clear that Jenkins X is a very opinionated solution that has a strong preference on how deployments are handled.

The problem is that these strong opinions also extend to how development happens during the integration phase. Jenkins X is designed around trunk based development. The mainline branch is what is always deployed and merging a pull request also implies a new release.

Trunk-based development is not bad on its own, but again there are several organizations that have selected other strategies which are better suited to their needs. The ever-popular gitflow paradigm might have lost some popularity in the last few years, but in some cases, it really is a better solution. Jenkins X does not support it at all.

There are several organizations where even the concept of a single “production” branch might not exist at all. In some cases, there are several production branches (i.e. where releases are happening from) and adopting them in Jenkins X would be a difficult if not impossible task.

Codefresh does not enforce a specific git methodology. You can use any git workflow you want.

Example of Codefresh pipeline

A similar situation occurs with versioning. Jenkins X is designed around semantic versioning of Git tags. Codefresh does not enforce any specific versioning pattern.

In summary, with Codefresh you are free to choose your own workflow. With Jenkins X there is only a single way of doing things.

Jenkins X has no Graphical interface, Codefresh offers built-in GUI dashboards

Jenkins X does not have a UI on its own. The only UI present is the one from Jenkins which, as we have already explained, knows only about jobs and builds. In the case of a headless install, not even that UI is available.

This means that all the new Jenkins X constructs such as deployments, applications, and environments are only available in the command line.

The command line is great for developers and engineers who want to manage Jenkins X, but very inflexible when it comes to getting a general overview of everything that is happening. If Jenkins X is installed in a big organization, several non-developers (e.g. project manager, QA lead) will need an easy way to see what their team is doing.

Unfortunately, at its present state, only the JX executable offers a view of the Jenkins X flows via the command line.

Codefresh has a full UI for both CI and CD parts of the software lifecycle that includes everything in a single place. There are graphical dashboards for:

  • Git Repos
  • Pipelines
  • Builds
  • Docker images
  • Helm repository
  • Helm releases
  • Kubernetes services

It is very easy to get the full story of a feature from commit until it reaches production as well as understand what is deployed where.

Enterprise Helm promotion board

The UI offered from Codefresh is targeted at all stakeholders that take part in the software delivery process.

Jenkins X uses Groovy/JX pipelines, Codefresh uses declarative YAML

We already mentioned that Jenkins X is using plain Jenkins under the hood. This means that pipelines in Jenkins X are also created with Groovy and shared libraries. Codefresh, on the other hand, uses declarative YAML.

The big problem here is that the jx executable (which is normally used to manage Jenkins X installation) can also be injected into Jenkins pipelines by extending their pipeline steps. This means that pipelines are now even more complicated as one must also learn how the jx executable works and how it affects the pipelines it takes part in.

Here is an official example from the quick start of Jenkins X (this is just a segment of the full pipeline):

steps {
  container('maven') {
    // ensure we're not on a detached head
    sh "git checkout master"
    sh "git config --global credential.helper store"
    sh "jx step git credentials"
    // so we can retrieve the version in later steps
    sh "echo \$(jx-release-version) > VERSION"
    sh "mvn versions:set -DnewVersion=\$(cat VERSION)"
    sh "jx step tag --version \$(cat VERSION)"
    sh "mvn clean deploy"
    sh "export VERSION=`cat VERSION` && skaffold build -f skaffold.yaml"
    sh "jx step post build --image $DOCKER_REGISTRY/$ORG/$APP_NAME:\$(cat VERSION)"
  }
}

You can see in the example above that jx now takes part in the Jenkins X pipelines, requiring developers to learn yet another tool.

The problem here is that for Jenkins X to do its magic your pipelines must behave as they are expected to. Therefore modifying Jenkins X pipelines becomes extra difficult as you must honor the assumptions already present in the pipeline (that match the opinionated design of Jenkins X).

This problem does not exist in Codefresh. You can have a single pipeline that does everything with conditionals or multiple independent pipelines, or linked-pipelines or parallel pipelines, or any other pattern that you wish for your project.

Jenkins X does not cache dependencies by default, Codefresh has automatic caching

Jenkins X scales by creating additional builders on the Kubernetes cluster where it is installed. When you start a new pipeline, a new builder is created dynamically on the cluster that contains a Jenkins slave which is automatically connected to the Jenkins master. Jenkins X also supports a serverless installation where there is no need for a Jenkins master at all (which normally consumes resources even when no builds are running).

While this approach is great for scalability it is also ineffective when it comes to build caching. Each build node starts in a fresh state without any knowledge of previous builds. This means that all module dependencies needed by your programming language are downloaded again and again all the time.

A particularly bad example of this is the quick start demo offered by Jenkins X. It is a Java application that downloads its Maven dependencies:

  1. Whenever a branch is built
  2. When a pull request is created
  3. When a pull request is merged back to master

Basically any Jenkins X build will download all dependencies each time it runs.

The builders in all 3 cases are completely isolated. For big projects (think also node modules, pip packages, ruby gems etc) where the dependencies actually dominate the compile time, this problem can quickly get out of hand with very slow Jenkins X builds.

Codefresh solves this problem by attaching a Docker volume in all steps on the pipeline. This volume is cached between subsequent builds so that dependencies are only downloaded once.

Codefresh caches the shared volume

In summary, Codefresh has much better caching than JenkinsX resulting in very fast builds and quick feedback times from commit to result.

Conclusion

In this article, we have seen some of the shortcomings of Jenkins X. Most of them center around the opinionated workflow & design decisions behind JenkinsX.

The biggest drawback, of course, is that JenkinsX also inherits all problems of Jenkins and in some cases (e.g. pipeline syntax) it makes things more complicated.

Finally, at its present state, Jenkins X has only a command line interface, making the visualization of its environments and applications a very difficult process.

In the following table, we summarize the characteristics of JenkinsX vs Codefresh:

Feature | Jenkins X | Codefresh
Git repository dashboard | No | Yes
Git support | Yes | Yes
Quay/ACR/JFrog/Dockerhub triggers | No | Yes
GIT flow | Trunk-based | Any
Versioning | Git tags/semantic | Any
Pipeline Management | CLI | Graphical and CLI
Built-in dynamic test environments | Yes | Yes
Docker-based builds | Yes | Yes
Native build caching | No | Yes
Pipelines as code | Groovy | Yaml
Native Monorepo support | No | Yes
Extension mechanism | Groovy shared libraries | Docker images
Installation | Cloud/On-prem | Cloud/On-prem/Hybrid
Internal Docker registry | Yes | Yes
Docker Registry dashboard | No | Yes
Custom Docker image metadata | No | Yes
Native Kubernetes deployment | Yes | Yes
Kubernetes Release Dashboard | No | Yes
Deployment mode | Helm only | Helm or plain K8s
Integrated Helm repository | Yes | Yes
Helm app dashboard | Yes | Yes
Helm release dashboard | No | Yes
Helm releases history and management (UI) | No | Yes
Helm Rollback to any previous version (UI) | No | Yes
Deploy to Bare Metal/VM | No | Yes
Deployment management | CLI | GUI and CLI

New to Codefresh? Create Your Free Account Today!

Source

Securing a Containerized Instance of MongoDB

MongoDB, the popular open source NoSQL database, has been in the news a lot recently—and not for reasons that are good for MongoDB admins. Early this year, reports began appearing of MongoDB databases being “taken hostage” by attackers who delete all of the data stored inside the databases, then demand ransoms to restore it. Security is always important, no matter which type of database you’re using. But the recent spate of MongoDB attacks makes it especially crucial to secure any MongoDB databases that you may use as part of your container stack. This article explains what you need to know to keep MongoDB secure when it is running as a container. We’ll go over how to close the vulnerability behind the recent ransomware attacks using a MongoDB container while the container is running—as well as how to modify a MongoDB Dockerfile to change the default behavior permanently.

Why the Ransomware Happened: MongoDB’s Default Security Configuration


The ransomware attacks against MongoDB weren’t enabled by a flaw inherent in MongoDB itself, per se, but rather by some weaknesses that result from default configuration parameters in an out-of-the-box installation of MongoDB. By default, MongoDB databases, unlike most other popular database platforms, don’t require login credentials. That means anyone can log into the database and start modifying or removing data.

Securing a MongoDB Container

In order to mitigate that vulnerability and run a secure containerized instance of MongoDB, follow the steps below.

Start a MongoDB instance

First of all, start a MongoDB instance in Docker using the most up-to-date image available. Since Docker uses the most recent image by default, a simple command like this will start a MongoDB instance based on an up-to-date image:

docker run --name mongo-database -d mongo

Create a secure MongoDB account

Before disabling password-less authentication to MongoDB, we need to create an account that we can use to log in after we change the default settings. To do this, first log into the MongoDB container with:

docker exec -it mongo-database bash

Then, from inside the container, log into the MongoDB admin interface:
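
On the stock mongo image, this is done with the bundled mongo shell; running it with no arguments connects to the local instance (a sketch, assuming the official image):

mongo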

Now, enter this stanza of code and press Enter:

use admin
db.createUser(
  {
    user: "db_user",
    pwd: "your_super_secure_password",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  }
)

This creates a MongoDB user with username db_user and password your_super_secure_password (feel free to change this, of course, to something more secure!). The user has admin privileges.

Changing default behavior in Dockerfile

If you want to make the MongoDB process start with authentication required by default all of the time, you can do so by editing the Dockerfile used to build the container. To do this locally, we’ll first pull the MongoDB Dockerfile from GitHub with:

git clone https://github.com/dockerfile/mongodb

Now, cd into the mongodb directory that Git just created and open the
Dockerfile inside using your favorite text editor. Look for the
following section of the Dockerfile:

# Define default command.
CMD ["mongod"]

Change this to:

# Define default command.
CMD ["mongod", "--auth"]

This way, when mongod is called as the container starts, it will run
with the --auth flag by default.
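
As a quick, optional check once a container is running from the rebuilt image (the container name and credentials below are the ones used earlier in this article), you can confirm that the account is required by authenticating explicitly with the mongo shell:

docker exec -it mongo-database mongo -u db_user -p your_super_secure_password --authenticationDatabase admin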

Conclusion

If you follow the steps above, you’ll be able to run MongoDB as a Docker
container without becoming one of the tens of thousands of admins whose
MongoDB databases were wiped out and held for ransom by attackers. And
there really is not much to it, other than being aware of the
vulnerabilities inherent in a default MongoDB installation and the steps
for resolving them.

Chris Riley (@HoardingInfo) is a technologist who
has spent 12 years helping organizations transition from traditional
development practices to a modern set of culture, processes and tooling.
In addition to being a research analyst, he is an O’Reilly author,
regular speaker, and subject matter expert in the areas of DevOps
strategy and culture. Chris believes the biggest challenges faced in the
tech market are not tools, but rather people and planning.

Source

Top 5 Blog Posts of 2018: Introducing the New Docker Hub

Today, we’re excited to announce that Docker Store and Docker Cloud are now part of Docker Hub, providing a single experience for finding, storing and sharing container images. This means that:

  • Docker Certified and Verified Publisher Images are now available for discovery and download on Docker Hub
  • Docker Hub has a new user experience

Millions of individual users and more than a hundred thousand organizations use Docker Hub, Store and Cloud for their container content needs. We’ve designed this Docker Hub update to bring together the features that users of each product know and love the most, while addressing known Docker Hub requests around ease of use, repository and team management.

Here’s what’s new:

Repositories

  • View recently pushed tags and automated builds on your repository page
  • Pagination added to repository tags
  • Improved repository filtering when logged in on the Docker Hub home page

Organizations and Teams

  • As an organization Owner, see team permissions across all of your repositories at a glance.
  • Add existing Docker Hub users to a team via their email (if you don’t remember their Docker ID)

New Automated Builds

  • Speed up builds using Build Caching
  • Add environment variables and run tests in your builds
  • Add automated builds to existing repositories

Note: For Organizations, GitHub & BitBucket account credentials will need to be re-linked to your organization to leverage the new automated builds. Existing Automated Builds will be migrated to this new system over the next few months. Learn more

Improved Container Image Search

  • Filter by Official, Verified Publisher and Certified images, guaranteeing a level of quality in the Docker images listed by your search query
  • Filter by categories to quickly drill down to the type of image you’re looking for

Existing URLs will continue to work, and you’ll automatically be redirected where appropriate. No need to update any bookmarks.

Verified Publisher Images and Plugins

Verified Publisher Images are now available on Docker Hub. Similar to Official Images, these images have been vetted by Docker. While Docker maintains the Official Images library, Verified Publisher and Certified Images are provided by our third-party software vendors. Interested vendors can sign up at https://goto.docker.com/Partner-Program-Technology.html.

Certified Images and Plugins

Certified Images are also now available on Docker Hub. Certified Images are a special category of Verified Publisher images that pass additional Docker quality, best practice, and support requirements.

  • Tested and supported on Docker Enterprise platform by verified publishers
  • Adhere to Docker’s container best practices
  • Pass a functional API test suite
  • Complete a vulnerability scanning assessment
  • Provided by partners with a collaborative support relationship
  • Display a unique quality mark “Docker Certified”

Source

DevOps and Containers, On-Prem or in the Cloud

The cloud vs.
on-premises debate is an old one. It goes back to the days when the
cloud was new and people were trying to decide whether to keep workloads
in on-premises datacenters or migrate to cloud hosts. But the Docker
revolution has introduced a new dimension to the debate. As more and
more organizations adopt containers, they are now asking themselves
whether the best place to host containers is on-premises or in the
cloud. As you might imagine, there’s no single answer that fits
everyone. In this post, we’ll consider the pros and cons of both cloud
and on-premises container deployment and consider which factors can make
one option or the other the right choice for your organization.

DevOps, Containers, and the Cloud

First, though, let’s take a quick look at the basic relationship
between DevOps, containers, and the cloud. In many ways, the combination
of DevOps and containers can be seen as one way—if not the native
way—of doing IT in the cloud. After all, containers maximize
scalability and flexibility, which are key goals of the DevOps
movement—not to mention the primary reasons for many people in
migrating to the cloud. Things like virtualization and continuous
delivery seem to be perfectly suited to the cloud (or to a cloud-like
environment), and it is very possible that if DevOps had originated in
the Agile world, it would have developed quite naturally out of the
process of adapting IT practices to the cloud.

DevOps and On-Premises

Does that mean, however, that containerization, DevOps, and continuous
delivery are somehow unsuited or even alien to on-premises deployment?
Not really. On-premises deployment itself has changed; it now has many
of the characteristics of the cloud, including a high degree of
virtualization, and relative independence from hardware constraints
through abstraction. Today’s on-premises systems generally fit the
definition of “private cloud,” and they lend themselves well to the
kind of automated development and operations cycle that lies at the
heart of DevOps. In fact, many of the major players in the
DevOps/container world, including AWS and Docker, provide strong support
for on-premises deployment, and sophisticated container management tools
such as Rancher are designed to work seamlessly across the
public/private cloud boundary. It is no exaggeration to say that
containers are now as native to the on-premises world as they are to the
cloud.

Why On-premises?

Why would you want to deploy containers on-premises?

Local Resources

Perhaps the most obvious reason is the need to directly access and use hardware features, such as storage, or processor-specific operations. If, for example, you are using an array of graphics chips for matrix-intensive computation, you are likely to be tied to local hardware. Containers, like virtual machines, always require some degree of abstraction, but running containers on-premises reduces the number of layers of abstraction between the application and the underlying metal to a minimum. You can go from the container to the underlying OS’s hardware access more or less directly, something which is not practical with VMs on bare metal, or with containers in the public cloud.

Local Monitoring

In a similar vein, you may also need containers to monitor, control, and manage local devices. This may be an important consideration in an industrial setting or a research facility, for example. It is, of course, possible to perform monitoring and control functions with more traditional types of software. The combination of containerization and continuous delivery, however, allows you to quickly update and adapt software in response to changes in manufacturing processes or research procedures.

Local Control Over Security

Security may also be a major consideration when it comes to deploying containers on-premises. Since containers access resources from the underlying OS, they have potential security vulnerabilities; in order to make containers secure, it is necessary to take positive steps to add security features to container systems. Most container-deployment systems have built-in security features, but on-premises deployment may be a useful strategy for adding extra layers of security. In addition to the extra security that comes with controlling access to physical facilities, an on-premises container deployment may be able to make use of the built-in security features of the underlying hardware.

Legacy Infrastructure and Cloud Migration

What if you’re not in a position to abandon existing on-premises infrastructure? If a company has a considerable amount of money invested in hardware, or is simply not willing or able to migrate away from a large and complex set of interconnected legacy applications all at once, staying on-premises for the time being may be the most practical (or the most politically prudent) short-to-medium-term choice. By introducing containers (and DevOps practices) on-premises, you can lay out a relatively painless path for gradual migration to the cloud.

Test Locally, Deploy in the Cloud

Finally, you may want to develop and test containerized applications locally, then deploy them in the cloud. On-premises development allows you to closely monitor the interaction between your software and the deployment platform, and observe its operation under controlled conditions. This can make it easier to isolate unanticipated post-deployment problems by comparing the application’s behavior in the cloud with its behavior in a known, controlled environment. It also allows you to deploy and test container-based software in an environment where you can be confident that information about new features or capabilities will not be leaked to your competitors.
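
As a minimal sketch of that last workflow (the image name, tag, and port below are placeholders, and the registry is assumed to be reachable from your cloud environment), the local build-test-push cycle looks roughly like this:

# Build and test the image locally
docker build -t myorg/webapp:1.0 .
docker container run --rm -p 8080:8080 myorg/webapp:1.0

# Push the same image to a registry, from which the cloud environment pulls it
docker push myorg/webapp:1.0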

Public/Private Hybrid

Here’s another point to consider when you’re comparing cloud and
on-premises container deployment: public and private cloud deployment
are not fundamentally incompatible, and in many ways, there is really no
sharp line between them. This is, of course, true for traditional,
monolithic applications (which can, for example, also reside on private
servers while being accessible to remote users via a cloud-based
interface), but with containers, the public/private boundary can be made
even more fluid and indistinct when it is appropriate to do so. You can,
for example, deploy an application largely by means of containers in the
public cloud, with some functions running on on-premises containers.
This gives you granular control over things such as security or
local-device access, while at the same time allowing you to take
advantage of the flexibility, broad reach, and cost advantages of
public-cloud deployment.
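
One hedged way to picture this split, assuming a single Docker swarm spanning both environments, a hypothetical location node label, and placeholder node and image names, is to pin the on-premises-only service with a placement constraint while letting everything else schedule anywhere:

# Label the on-premises node
docker node update --label-add location=onprem onprem-node-1

# Keep the device-facing service on on-premises nodes
docker service create --name device-gateway \
  --constraint 'node.labels.location==onprem' \
  myorg/device-gateway:latest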

The Right Mix for Your Organization

Which type of deployment is better for your company? In general,
startups and small-to-medium-size companies without a strong need to tie
in closely to hardware find it easy to move into (or start in) the
cloud. Larger (i.e. enterprise-scale) companies and those with a need to
manage and control local hardware resources are more likely to prefer
on-premises infrastructure. In the case of enterprises, on-premises
container deployment may serve as a bridge to full public-cloud
deployment, or hybrid private/public deployment. The bottom line,
however, is that the answer to the public cloud vs. on-premises question
really depends on the specific needs of your business. No two
organizations are alike, and no two software deployments are alike, but
whatever your software/IT goals are, and however you plan to achieve
them, between on-premises and public-cloud deployment, there’s more
than enough flexibility to make that plan work.

Source

Top 5 Post: Improved Docker Container Integration with Java 10

As 2018 comes to a close, we looked back at the top five blogs that were most popular with our readers. For those of you who had difficulties with memory and CPU sizing and usage when running a Java Virtual Machine (JVM) in a container, we are kicking off the week with a blog that explains how to get improved Docker container integration with Java 10 in Docker Desktop (Mac or Windows) and Docker Enterprise environments.

Docker and Java

Many applications that run in a Java Virtual Machine (JVM), including data services such as Apache Spark and Kafka and traditional enterprise applications, are run in containers. Until recently, running the JVM in a container presented problems with memory and CPU sizing and usage that led to performance loss. This was because Java didn’t recognize that it was running in a container. With the release of Java 10, the JVM now recognizes constraints set by container control groups (cgroups). Both memory and CPU constraints can now be used to manage Java applications directly in containers. These include:

  • adhering to memory limits set in the container
  • setting available CPUs in the container
  • setting CPU constraints in the container

Java 10 improvements are realized in both Docker Desktop (Mac or Windows) and Docker Enterprise environments.
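
Container awareness in Java 10 is governed by a JVM flag that is enabled by default on Linux. As a quick sanity check (a hedged sketch assuming the openjdk:10-jdk image; output abbreviated), you can confirm the flag is active before relying on it:

docker container run -it --entrypoint bash openjdk:10-jdk

$ java -XX:+PrintFlagsFinal -version | grep UseContainerSupport
bool UseContainerSupport = true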

Container Memory Limits

Before Java 10, the JVM did not recognize memory or CPU limits set on the container; such limits had to be passed to the JVM explicitly using flags (for example, -Xmx). In Java 10, memory limits are automatically recognized and enforced.

Java defines a server-class machine as having 2 CPUs and 2GB of memory, and the default heap size is ¼ of the physical memory. Consider, for example, a Docker Enterprise Edition installation with 2GB of memory and 4 CPUs. Compare the difference between containers running Java 8 and Java 10. First, Java 8:

docker container run -it -m512M --entrypoint bash openjdk:latest

$ java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
uintx MaxHeapSize := 524288000
openjdk version "1.8.0_162"

The max heap size is about 500M, or ¼ of the 2GB allocated to the Docker EE installation, rather than being derived from the 512M limit set on the container. In comparison, running the same commands on Java 10 shows that the JVM respects the memory limit set on the container and reports a max heap size close to the expected 128M (¼ of 512M):

docker container run -it -m512M --entrypoint bash openjdk:10-jdk

$ java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
size_t MaxHeapSize = 134217728
openjdk version "10" 2018-03-20
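
If a quarter of the container’s memory is too conservative for a particular workload, Java 10 also introduces a flag for raising the fraction of container memory the JVM may use for heap. The flag is real, but the 75% value below is only an illustrative assumption to be tuned per application:

docker container run -it -m512M --entrypoint bash openjdk:10-jdk

$ java -XX:MaxRAMPercentage=75.0 -XX:+PrintFlagsFinal -version | grep MaxHeapSize

MaxHeapSize should now report roughly ¾ of the 512M container limit instead of ¼.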

Setting Available CPUs

By default, each container’s access to the host machine’s CPU cycles is unlimited. Various constraints can be set to limit a given container’s access to the host machine’s CPU cycles. Java 10 recognizes these limits:

docker container run -it --cpus 2 openjdk:10-jdk
jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 2

By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container’s CPU share weighting relative to the weighting of all other running containers. The proportion only applies when CPU-intensive processes are running; when tasks in one container are idle, other containers can use the leftover CPU time. The actual amount of CPU time will vary depending on the number of containers running on the system. Java 10 recognizes CPU shares as well:

docker container run -it --cpu-shares 2048 openjdk:10-jdk
jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 2

The cpuset constraint restricts which CPUs a container may execute on, and Java 10 recognizes it as well:

docker run -it --cpuset-cpus="1,2,3" openjdk:10-jdk
jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 3
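
For testing, or to deliberately override what the JVM detects, Java 10 also accepts an explicit processor count via -XX:ActiveProcessorCount. This is a hedged sketch; passing the flag through jshell’s -R option is simply one convenient way to demonstrate it:

docker container run -it --cpus 2 openjdk:10-jdk jshell -R-XX:ActiveProcessorCount=4
jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 4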

Allocating memory and CPU

With Java 10, container settings can be used to estimate the allocation of memory and CPUs needed to deploy an application. Let’s assume that the memory heap and CPU requirements for each process running in a container have already been determined and JAVA_OPTS has been set. For example, suppose an application is distributed across 10 nodes: five nodes require 512MB of memory with 1,024 CPU shares each, and the other five nodes require 256MB of memory with 512 CPU shares each. Note that one full CPU corresponds to 1,024 CPU shares.

For memory, the application would need at least 3.84GB allocated in total (in practice, round up to roughly 4GB to leave headroom):

512MB x 5 = 2.56GB

256MB x 5 = 1.28GB

For CPU, the application would require 8 CPUs to run efficiently:

(1024 / 1024 CPU shares) x 5 = 5 CPUs

(512 / 1024 CPU shares) x 5 = 2.5, rounded up to 3 CPUs
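
Putting those numbers to work, a hedged sketch of the two groups as Docker swarm services might look like the following. The service and image names are placeholders; the reservation and limit flags are standard docker service create options:

docker service create --name app-large --replicas 5 \
  --reserve-memory 512M --limit-memory 512M \
  --reserve-cpu 1 --limit-cpu 1 \
  myorg/app:latest

docker service create --name app-small --replicas 5 \
  --reserve-memory 256M --limit-memory 256M \
  --reserve-cpu 0.5 --limit-cpu 0.5 \
  myorg/app:latest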

Best practice suggests profiling the application to determine the memory and CPU allocations for each process running in the JVM. However, Java 10 removes much of the guesswork when sizing containers, helping to prevent out-of-memory errors in Java applications as well as to allocate sufficient CPU to process workloads.

Source