Resizing Persistent Volumes using Kubernetes

Author: Hemant Kumar (Red Hat)

Editor’s note: this post is part of a series of in-depth articles on what’s new in Kubernetes 1.11

In Kubernetes v1.11 the persistent volume expansion feature is being promoted to beta. This feature allows users to easily resize an existing volume by editing the PersistentVolumeClaim (PVC) object. Users no longer have to manually interact with the storage backend or delete and recreate PV and PVC objects to increase the size of a volume. Shrinking persistent volumes is not supported.

Volume expansion was introduced in v1.8 as an Alpha feature, and versions prior to v1.11 required enabling the feature gate, ExpandPersistentVolumes, as well as the admission controller, PersistentVolumeClaimResize (which prevents expansion of PVCs whose underlying storage provider does not support resizing). In Kubernetes v1.11+, both the feature gate and admission controller are enabled by default.

Although the feature is enabled by default, a cluster admin must opt-in to allow users to resize their volumes. Kubernetes v1.11 ships with volume expansion support for the following in-tree volume plugins: AWS-EBS, GCE-PD, Azure Disk, Azure File, Glusterfs, Cinder, Portworx, and Ceph RBD. Once the admin has determined that volume expansion is supported for the underlying provider, they can make the feature available to users by setting the allowVolumeExpansion field to true in their StorageClass object(s). Only PVCs created from that StorageClass will be allowed to trigger volume expansion.

~> cat standard.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
parameters:
  type: pd-standard
provisioner: kubernetes.io/gce-pd
allowVolumeExpansion: true
reclaimPolicy: Delete
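
The same field can also be flipped on in place for an existing StorageClass; a minimal sketch using kubectl patch against the class above:

~> kubectl patch storageclass standard -p '{"allowVolumeExpansion": true}'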

Any PVC created from this StorageClass can be edited (as illustrated below) to request more space. Kubernetes will interpret a change to the storage field as a request for more space, and will trigger automatic volume resizing.

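For example, the request can be raised with kubectl patch (or kubectl edit); a minimal sketch, assuming a PVC named myclaim as in the output shown further down:

~> kubectl patch pvc myclaim -p '{"spec": {"resources": {"requests": {"storage": "14Gi"}}}}'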

File System Expansion

Block storage volume types such as GCE-PD, AWS-EBS, Azure Disk, Cinder, and Ceph RBD typically require a file system expansion before the additional space of an expanded volume is usable by pods. Kubernetes takes care of this automatically whenever the pod(s) referencing your volume are restarted.

Network attached file systems (like Glusterfs and Azure File) can be expanded without having to restart the referencing Pod, because these systems do not require special file system expansion.

File system expansion must be triggered by terminating the pod using the volume. More specifically:

  • Edit the PVC to request more space.
  • Once the underlying volume has been expanded by the storage provider, the PersistentVolume object will reflect the updated size, and the PVC will have the FileSystemResizePending condition.

You can verify this by running kubectl get pvc <pvc_name> -o yaml

~> kubectl get pvc myclaim -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
  namespace: default
  uid: 02d4aa83-83cd-11e8-909d-42010af00004
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 14Gi
  storageClassName: standard
  volumeName: pvc-xxx
status:
  capacity:
    storage: 9G
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2018-07-11T14:51:10Z
    message: Waiting for user to (re-)start a pod to finish file system resize of
      volume on node.
    status: "True"
    type: FileSystemResizePending
  phase: Bound

  • Once the PVC has the condition FileSystemResizePending, the pod that uses the PVC can be restarted to finish file system resizing on the node. The restart can be achieved by deleting and recreating the pod, or by scaling the deployment down and then back up again (see the sketch after this list).
  • Once file system resizing is done, the PVC will automatically be updated to reflect the new size.
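
A minimal sketch of the scale-down/scale-up option, assuming a Deployment named web mounts the claim:

~> kubectl scale deployment web --replicas=0
~> kubectl scale deployment web --replicas=1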

Any errors encountered while expanding the file system should be available as events on the pod.

Online File System Expansion

Kubernetes v1.11 also introduces an alpha feature called online file system expansion. This feature enables file system expansion while a volume is still in use by a pod. Because this feature is alpha, it requires enabling the feature gate, ExpandInUsePersistentVolumes. It is supported by the in-tree volume plugins GCE-PD, AWS-EBS, Cinder, and Ceph RBD. When this feature is enabled, pods referencing the resized volume do not need to be restarted. Instead, the file system will automatically be resized while in use as part of volume expansion. File system expansion does not happen until a pod references the resized volume, so if no pods referencing the volume are running, file system expansion will not happen.
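
Being alpha, the gate has to be switched on explicitly via the standard --feature-gates flag. Which components need it depends on your installer; setting it consistently across the control plane components and the kubelet is the safe assumption:

--feature-gates=ExpandInUsePersistentVolumes=true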

How can I learn more?

Check out additional documentation on this feature here: http://k8s.io/docs/concepts/storage/persistent-volumes.

Source

The Role of Enterprise Container Platforms

As container technology adoption continues to advance and mature, companies now recognize the importance of an enterprise container platform. More than just a runtime for applications, a container platform provides a complete management solution for securing and operationalizing applications in containers at scale over the entire software lifecycle.

While containers may have revolutionized the way developers package applications, container platforms are changing the way enterprises manage and secure both mission-critical legacy applications and microservices both on prem and across multiple clouds. Enterprises are beginning to see that container runtime and orchestration technologies alone don’t address these critical questions:

  • Where did this application come from?
  • Was the application built with company and/or industry best practices in mind?
  • Has this application undergone a security review?
  • Is my cluster performing as expected?
  • If my application is failing or underperforming, where should I look?
  • Will this environment run the same on the new hardware/cloud that we’re using?
  • Can I use my existing infrastructure and/or tools with this container environment?

Leading Industry Analysts Highlight Container Platforms for Enterprise Adoption

For some time, there was a lot of confusion in the market between orchestration solutions and container platforms. But in 2018, we are seeing more alignment across major industry analyst firms over the definition of container platforms. Today, Forrester published the Forrester New Wave™: Enterprise Container Platform Software Suites, Q4 2018, in which Docker was named a Leader.

This report is based on a multi-dimensional review of enterprise container platform solutions that go beyond runtime and orchestration, including:

  • Image management
  • Operations management
  • Security features
  • User experience
  • Application lifecycle management
  • Integrations and APIs
  • And more….

Download the full report here.

Enterprise Container Platforms: The Docker Approach

Docker is committed to delivering a container platform that is built on the values of choice, agility, and security. Docker Enterprise is now being used in over 650 organizations around the world, supporting a wide range of use cases and running on a variety of infrastructures, including both private data centers and public clouds. Each of these customers is recognizing significant infrastructure cost reductions, operational efficiencies, and increased security as a result of containerization.

One key area of focus for us is being an enterprise solution for all applications – including Windows Server applications. While containers did originate with Linux, Windows Server applications represent more than half of all enterprise applications in use today. By partnering with Microsoft since our early days, we have been helping organizations containerize and operate Windows Server applications in production for over two years and counting. More importantly, the Docker container platform addresses both Windows Server and Linux applications – from image management to operations management, integrated security features to user experience and more.

We are honored to be recognized as a Leader in the Forrester New Wave report and look forward to working with more companies as they build out their container platform strategy.

To learn more about Docker Enterprise and the importance of container platforms, download the full Forrester New Wave report linked above.


Source

How the sausage is made: the Kubernetes 1.11 release interview, from the Kubernetes Podcast

Author: Craig Box (Google)

At KubeCon EU, my colleague Adam Glick and I were pleased to announce the Kubernetes Podcast from Google. In this weekly conversation, we focus on all the great things that are happening in the world of Kubernetes and Cloud Native. From the news of the week, to interviews with people in the community, we help you stay up to date on everything Kubernetes.

We recently had the pleasure of speaking to the release manager for Kubernetes 1.11, Josh Berkus from Red Hat, and the release manager for the upcoming 1.12, Tim Pepper from VMware.

In this conversation we learned about the release process, the impact of quarterly releases on end users, and how Kubernetes is like baking.

I encourage you to listen to the podcast version if you have a commute, or a dog to walk. If you like what you hear, we encourage you to subscribe! In case you’re short on time, or just want to browse quickly, we are delighted to share the transcript with you.

CRAIG BOX: First of all, congratulations both, and thank you.

JOSH BERKUS: Well, thank you. Congratulations for me, because my job is done.

[LAUGHTER]

Congratulations and sympathy for Tim.

[LAUGH]

TIM PEPPER: Thank you, and I guess thank you?

[LAUGH]

ADAM GLICK: For those that don’t know a lot about the process, why don’t you help people understand — what is it like to be the release manager? What’s the process that a release goes through to get to the point when everyone just sees, OK, it’s released — 1.11 is available? What does it take to get up to that?

JOSH BERKUS: We have a quarterly release cycle. So every three months, we’re releasing. And ideally and fortunately, this is actually now how we are doing things. Somewhere around two, three weeks before the previous release, somebody volunteers to be the release lead. That person is confirmed by SIG Release. So far, we’ve never had more than one volunteer, so there hasn’t been really a fight about it.

And then that person starts working with others to put together a team called the release team. Tim’s just gone through this with Stephen Augustus and picking out a whole bunch of people. And then after or a little before— probably after, because we want to wait for the retrospective from the previous release— the release lead then sets a schedule for the upcoming release, as in when all the deadlines will be.

And this is a thing, because we’re still tinkering with relative deadlines, and how long should code freeze be, and how should we track features? Because we don’t feel that we’ve gotten down that sort of cadence perfectly yet. I mean, like, we’ve done pretty well, but we don’t feel like we want to actually set [in stone], this is the schedule for each and every release.

Also, we have to adjust the schedule because of holidays, right? Because you can’t have the code freeze deadline starting on July 4 or in the middle of design or sometime else when we’re going to have a large group of contributors who are out on vacation.

TIM PEPPER: This is something I’ve had to spend some time looking at, thinking about 1.12. Going back to early June as we were tinkering with the code freeze date, starting to think about, well, what are the implications going to be on 1.12? When would these things start falling on the calendar? And then also for 1.11, we had one complexity. If we slipped the release past this week, we start running into the US 4th of July holiday, and we’re not likely to get a lot done.

So much of a slip would mean slipping into the middle of July before we’d really know that we were successfully triaging things. And worst case maybe, we’re quite a bit later into July.

So instead of quarterly with a three-month sort of cadence, well, maybe we’ve accidentally ended up chopping out one month out of the next release or pushing it quite a bit into the end of the year. And that made the deliberation around things quite complex, but thankfully this week, everything’s gone smoothly in the end.

CRAIG BOX: All the releases so far have been one quarter — they’ve been a 12-week release cycle, give or take. Is that something that you think will continue going forward, or is the release team thinking about different ways they could run releases?

TIM PEPPER: The whole community is thinking about this. There are voices who’d like the cadence to be faster, and there are voices who’d like it to be slower. And there’s good arguments for both.

ADAM GLICK: Because it’s interesting. It sounds like it is a date-driven release cycle versus a feature-driven release cycle.

JOSH BERKUS: Yeah, certainly. I really honestly think everybody in the world of software recognizes that feature-driven release cycles just don’t work. And a big part of the duties of the release team collectively— several members of the team do this— is yanking things out of the release that are not ready. And the hard part of that is figuring out which things aren’t ready, right? Because the person who’s working on it tends to be super optimistic about what they can get done and what they can get fixed before the deadline.

ADAM GLICK: Of course.

TIM PEPPER: And this is one of the things I think that’s useful about the process we have in place on the release team for having shadows who spend some time on the release team, working their way up into more of a lead position and gaining some experience, starting to get some exposure to see that optimism and see the processes for vetting.

And it’s even an overstatement to say the process. Just see the way that we build the intuition for how to vet and understand and manage the risk, and really go after and chase things down proactively and early to get resolution in a timely way versus continuing to just all be optimistic and letting things maybe languish and put a release at risk.

CRAIG BOX: I’ve been reading this week about the introduction of feature branches to Kubernetes. The new server-side apply feature, for example, is being built in a branch so that it didn’t have to be half-built in master and then ripped out again as the release approached, if the feature wasn’t ready. That seems to me like something that’s a normal part of software development? Is there a reason it’s taken so long to bring that to core Kubernetes?

JOSH BERKUS: I don’t actually know the history of why we’re not using feature branches. I mean, the reason why we’re not using feature branches pervasively now is that we have to transition from a different system. And I’m not really clear on how we adopted that linear development system. But it’s certainly something we discussed on the release team, because there were issues of features that we thought were going to be ready, and then developed major problems. And we’re like, if we have to back this out, that’s going to be painful. And we did actually have to back one feature out, which involved not pulling out a Git commit, but literally reversing the line changes, which is really not how you want to be doing things.

CRAIG BOX: No.

TIM PEPPER: The other big benefit, I think, to the release branches if they are well integrated with the CI system for continuous integration and testing, you really get the feedback, and you can demonstrate, this set of stuff is ready. And then you can do deferred commitment on the master branch. And what comes in to a particular release on the timely cadence that users are expecting is stuff that’s ready. You don’t have potentially destabilizing things, because you can get a lot more proof and evidence of readiness.

ADAM GLICK: What are you looking at in terms of the tool chain that you’re using to do this? You mentioned a couple of things, and I know it’s obviously run through GitHub. But I imagine you have a number of other tools that you’re using in order to manage the release, to make sure that you understand what’s ready, what’s not. You mentioned balancing between people who are very optimistic about the feature they’re working on making it in versus the time-driven deadline, and balancing those two. Is that just a manual process, or do you have a set of tools that help you do that?

JOSH BERKUS: Well, there’s code review, obviously. So just first of all, process was somebody wants to actually put in a feature, commit, or any kind of merge really, right? So that has to be assigned to one of the SIGs, one of these Special Interest Groups. Possibly more than one, depending on what areas it touches.

And then two generally overlapping groups of people have to approve that. One would be the SIG that it’s assigned to, and the second would be anybody represented in the OWNERS files in the code tree of the directories which get touched.

Now sometimes those are the same group of people. I’d say often, actually. But sometimes they’re not completely the same group of people, because sometimes you’re making a change to the network, but that also happens to touch GCP support and OpenStack support, and so they need to review it as well.

So the first part is the human part, which is a bunch of other people need to look at this. And possibly they’re going to comment “Hey. This is a really weird way to do this. Do you have a reason for it?”

Then the second part of it is the automated testing that happens, the automated acceptance testing that happens via webhook on there. And actually, one of the things that we did that was a significant advancement in this release cycle— and by we, I actually mean not me, but the great folks at SIG Scalability did— was add an additional acceptance test that does a mini performance test.

Because one of the problems we’ve had historically is our major performance tests are large and take a long time to run, and so by the time we find out that we’re failing the performance tests, we’ve already accumulated, you know, 40, 50 commits. And so now we’re having to do git bisect to find out which of those commits actually caused the performance regression, which can make them very slow to address.

And so adding that performance pre-submit, the performance acceptance test really has helped stabilize performance in terms of new commits. So then we have that level of testing that you have to get past.

And then when we’re done with that level of testing, we run a whole large battery of larger tests— end-to-end tests, performance tests, upgrade and downgrade tests. And one of the things that we’ve added recently and we’re integrating to the process something called conformance tests. And the conformance test is we’re testing whether or not you broke backwards compatibility, because it’s obviously a big deal for Kubernetes users if you do that when you weren’t intending to.

One of the busiest roles in the release team is a role called CI Signal. And it’s that person’s job just to watch all of the tests for new things going red and then to try to figure out why they went red and bring it to people’s attention.

ADAM GLICK: I’ve often heard what you’re referring to kind of called a breaking change, because it breaks the existing systems that are running. How do you identify those to people so when they see, hey, there’s a new version of Kubernetes out there, I want to try it out, is that just going to release notes? Or is there a special way that you identify breaking changes as opposed to new features?

JOSH BERKUS: That goes into release notes. I mean, keep in mind that one of the things that happens with Kubernetes’ features is we go through this alpha, beta, general availability phase, right? So a feature’s alpha for a couple of releases and then becomes beta for a release or two, and then it becomes generally available. And part of the idea of having this that may require a feature to go through that cycle for a year or more before its general availability is by the time it’s general availability, we really want it to be, we are not going to change the API for this.

However, stuff happens, and we do occasionally have to do those. And so far, our main way to identify that to people actually is in the release notes. If you look at the current release notes, there are actually two things in there right now that are sort of breaking changes.

One of them is the bit with priority and preemption in that preemption being on by default now allows badly behaved users of the system to cause trouble in new ways. I’d actually have to look at the release notes to see what the second one was…

TIM PEPPER: The JSON capitalization case sensitivity.

JOSH BERKUS: Right. Yeah. And that was one of those cases where you have to break backwards compatibility, because due to a library switch, we accidentally enabled people using JSON in a case-insensitive way in certain APIs, which was never supposed to be the case. But because we didn’t have a specific test for that, we didn’t notice that we’d done it.

And so for three releases, people could actually shove in malformed JSON, and Kubernetes would accept it. Well, we have to fix that now. But that does mean that there are going to be users out in the field who have malformed JSON in their configuration management that is now going to break.

CRAIG BOX: But at least the good news is Kubernetes was always outputting correct formatted JSON during this period, I understand.

JOSH BERKUS: Mm-hmm.

TIM PEPPER: I think that also kind of reminds of one of the other areas— so kind of going back to the question of, well, how do you share word of breaking changes? Well, one of the ways you do that is to have as much quality CI that you can to catch these things that are important. Give the feedback to the developer who’s making the breaking change, such that they don’t make the breaking change. And then you don’t actually have to communicate it out to users.

So some of this is bound to happen, because you always have test escapes. But it’s also a reminder of the need to ensure that you’re also really building and maintaining your test cases and the quality and coverage of your CI system over time.

ADAM GLICK: What do you mean when you say test escapes?

TIM PEPPER: So I guess it’s a term in the art, but for those who aren’t familiar with it, you have intended behavior that wasn’t covered by test, and as a result, an unintended change happens to that. And instead of your intended behavior being shipped, you’re shipping something else.

JOSH BERKUS: The JSON change is a textbook example of this, which is we were testing that the API would continue to accept correct JSON. We were not testing adequately that it wouldn’t accept incorrect JSON.

TIM PEPPER: A test escape, another way to think of it as you shipped a bug because there was not a test case highlighting the possibility of the bug.

ADAM GLICK: It’s the classic, we tested to make sure the feature worked. We didn’t test to make sure that breaking things didn’t work.

TIM PEPPER: It’s common for us to focus on “I’ve created this feature and I’m testing the positive cases”. And this also comes to thinking about things like secure by default and having a really robust system. A harder piece of engineering often is to think about the failure cases and really actively manage those well.

JOSH BERKUS: I had a conversation with a contributor recently where it became apparent that contributor had never worked on a support team, because their conception of a badly behaved user was, like, a hacker, right? An attacker who comes from outside.

And I’m like, no, no, no. Your stable of badly behaved users is your own staff. You know, they will do bad things, not necessarily intending to do bad things, but because they’re trying to take a shortcut. And that is actually your primary concern in terms of preventing breaking the system.

CRAIG BOX: Josh, what was your preparation to be release manager for 1.11?

JOSH BERKUS: I was on the release team for two cycles, plus I was kind of auditing the release team for half a cycle before that. So in 1.9, I originally joined to be the shadow for bug triage, except I ended up not being the shadow, because the person who was supposed to be the lead for bug triage then dropped out. Then I ended up being the bug triage lead, and had to kind of improvise it because there wasn’t documentation on what was involved in the role at the time.

And then I was bug triage lead for a second cycle, for the 1.10 cycle, and then took over as release lead for the cycle. And one of the things on my to-do list is to update the requirements to be release lead, because we actually do have written requirements, and to say that the expectation now is that you spend at least two cycles on the release team, one of them either as a lead or as a shadow to the release lead.

CRAIG BOX: And is bug triage lead just what it sounds like?

JOSH BERKUS: Yeah. Pretty much. There’s more tracking involved than triage. Part of it is just deficiencies in tooling, something we’re looking to address. But things like GitHub API limitations make it challenging to build automated tools that help us intelligently track issues. And we are actually working with GitHub on that. Like, they’ve been helpful. It’s just, they have their own scaling problems.

But then beyond that, you know, a lot of that, it’s what you would expect it to be in terms of what triage says, right? Which is looking at every issue and saying, first of all, is this a real issue? Second, is it a serious issue? Third, who needs to address this?

And that’s a lot of the work, because for anybody who is a regular contributor to Kubernetes, the number of GitHub notifications that they receive per day means that most of us turn our GitHub notifications off.

CRAIG BOX: Indeed.

JOSH BERKUS: Because it’s just this fire hose. And as a result, when somebody really needs to pay attention to something right now, that generally requires a human to go and track them down by email or Slack or whatever they prefer. Twitter in some cases. I’ve done that. And say, hey. We really need you to look at this issue, because it’s about to hold up the beta release.

ADAM GLICK: When you look at the process that you’re doing now, what are the changes that are coming in the future that will make the release process even better and easier?

JOSH BERKUS: Well, we just went through this whole retro, and I put in some recommendations for things. Obviously, some additional automation, which I’m going to be looking at doing now that I’m cycling off of the release team for a quarter and can actually look at more longer term goals, will help, particularly now that we’ve addressed actually some of our GitHub data flow issues.

Beyond that, I put in a whole bunch of recommendations in the retro, but it’s actually up to Tim which recommendations he’s going to try to implement. So I’ll let him [comment].

TIM PEPPER: I think one of the biggest changes that happened in the 1.11 cycle is this emphasis on trying to keep our continuous integration test status always green. That is huge for software development and keeping velocity. If you have this more, I guess at this point antiquated notion of waterfall development, where you do feature development for a while and are accepting of destabilization, and somehow later you’re going to come back and spend a period on stabilization and fixing, that really elongates the feedback loop for developers.

And they don’t realize what was broken, and the problems become much more complex to sort out as time goes by. One, developers aren’t thinking about what it was that they’d been working on anymore. They’ve lost the context to be able to efficiently solve the problem.

But then you start also getting interactions. Maybe a bug was introduced, and other people started working around it or depending on it, and you get complex dependencies then that are harder to fix. And when you’re trying to do that type of complex resolution late in the cycle, it becomes untenable over time. So I think continuing on that and building on it, I’m seeing a little bit more focus on test cases and meaningful test coverage. I think that’s a great cultural change to have happening.

And maybe because I’m following Josh into this role from a bug triage position and in his mentions earlier of just the communications and tracking involved with that versus triage, I do have a bit of a concern that at times, email and Slack are relatively quiet. Some of the SIG meeting notes are a bit sparse or YouTube videos slow to upload. So the general artifacts around choice making I think is an area where we need a little more rigor. So I’m hoping to see some of that.

And that can be just as subtle as commenting on issues like, hey, this commit doesn’t say what it’s doing. And for that reason on the release team, we can’t assess its risk versus value. So could you give a little more information here? Things like that give more information both to the release team and the development community as well, because this is open source. And to collaborate, you really do need to communicate in depth.

CRAIG BOX: Speaking of cultural changes, professional baker to Kubernetes’ release lead sounds like quite a journey.

JOSH BERKUS: There was a lot of stuff in between.

CRAIG BOX: Would you say there are a lot of similarities?

JOSH BERKUS: You know, believe it or not, there actually are similarities. And here’s where it’s similar, because I was actually thinking about this earlier. So when I was a professional baker, one of the things that I had to do was morning pastry. Like, I was actually in charge of doing several other things for custom orders, but since I had to come to work at 3:00 AM anyway— which also distressingly has similarities with some of this process. Because I had to come to work at 3:00 AM anyway, one of my secondary responsibilities was traying the morning pastry.

And one of the parts of that is you have this great big gas-fired oven with 10 rotating racks in it that are constantly rotating. Like, you get things in and out in the oven by popping them in and out while the racks are moving. That takes a certain amount of skill. You get burn marks on your wrists for your first couple of weeks of work. And then different pastries require a certain number of rotations to be done.

And there’s a lot of similarities to the release cadence, because what you’re doing is you’re popping something in the oven or you’re seeing something get kicked off, and then you have a certain amount of time before you need to check on it or you need to pull it out. And you’re doing that in parallel with a whole bunch of other things. You know, with 40 other trays.

CRAIG BOX: And with presumably a bunch of colleagues who are all there at the same time.

JOSH BERKUS: Yeah. And the other thing is that these deadlines are kind of absolute, right? You can’t say, oh, well, I was reading a magazine article, and I didn’t have time to pull that tray out. It’s too late. The pastry is burned, and you’re going to have to throw it away, and they’re not going to have enough pastry in the front case for the morning rush. And the customers are not interested in your excuses for that.

So from that perspective, from the perspective of saying, hey, we have a bunch of things that need to happen in parallel, they have deadlines and those deadlines are hard deadlines, there it’s actually fairly similar.

CRAIG BOX: Tim, do you have any other history that helped get you to where you are today?

TIM PEPPER: I think in some ways I’m more of a traditional journey. I’ve got a computer engineering bachelor’s degree. But I’m also maybe a bit of an outlier. In the late ‘90s, I found a passion for open source and Linux. Maybe kind of an early adopter, early believer in that.

And was working in the industry in the Bay Area for a while. Got involved in the Silicon Valley and Bay Area Linux users groups a bit, and managed to find work as a Linux sysadmin, and then doing device driver and kernel work and on up into distro. So that was all kind of standard in a way. And then I also did some other work around hardware enablement, high-performance computing, non-uniform memory access. Things that are really, really systems work.

And then about three years ago, my boss was really bending my ear and trying to get me to work on this cloud-related project. And that just felt so abstract and different from the low-level bits type of stuff that I’d been doing.

But kind of grudgingly, I eventually came around to the realization that the cloud is interesting, and it’s so much more complex than local machine-only systems work, the type of things that I’d been doing before. It’s massively distributed and you have a high-latency, low-reliability interconnect on all the nodes in the distributed network. So it’s wildly complex engineering problems that need solved.

And so that got me interested. Started working then on this open source orchestrator for virtual machines and containers. It was written in Go and was having a lot of fun. But it wasn’t Kubernetes, and it was becoming clear that Kubernetes was taking off. So about a year ago, I made the deliberate choice to move over to Kubernetes work.

ADAM GLICK: Previously, Josh, you spoke a little bit about your preparation for becoming a release manager. For other folks that are interested in getting involved in the community and maybe getting involved in release management, should they follow the same path that you did? Or what are ways that would be good for them to get involved? And for you, Tim, how you’ve approached the preparation for taking on the next release.

JOSH BERKUS: The great thing with the release team is that we have this formal mentorship path. And it’s fast, right? That’s the advantage of releasing quarterly, right? Is that within six months, you can go from joining the team as a shadow to being the release lead if you have the time. And you know, by the time you work your way up to release time, you better have support from your boss about this, because you’re going to end up spending a majority of your work time towards the end of the release on release management.

So the answer is to sign up to look when we’re getting into the latter half of release cycle, to sign up as a shadow. Or at the beginning of a release cycle, to sign up as a shadow. Some positions actually can reasonably use more than one shadow. There’s some position that just require a whole ton of legwork like release notes. And as a result, could actually use more than one shadow meaningfully. So there’s probably still places where people could sign up for 1.12. Is that true, Tim?

TIM PEPPER: Definitely. I think— gosh, right now we have 34 volunteers on the release team, which is—

ADAM GLICK: Wow.

JOSH BERKUS: OK. OK. Maybe not then.

[LAUGH]

TIM PEPPER: It’s potentially becoming a lot of cats to herd. But I think even outside of that formal volunteering to be a named shadow, anybody is welcome to show up to the release team meetings, follow the release team activities on Slack, start understanding how the process works. And really, this is the case all across open source. It doesn’t even have to be the release team. If you’re passionate about networking, start following what SIG Network is doing. It’s the same sort of path, I think, into any area on the project.

Each of the SIGs [has] a channel. So it would be #SIG-whatever the name is. [In our] case, #SIG-Release.

I’d also maybe give a plug for a talk I did at KubeCon in Copenhagen this spring, talking about how the release team specifically can be a path for new contributors coming in. And had some ideas and suggestions there for newcomers.

CRAIG BOX: There’s three questions in the Google SRE postmortem template that I really like. And I’m sure you will have gone through these in the retrospective process as you released 1.11, so I’d like to ask them now one at a time.

First of all, what went well?

JOSH BERKUS: Two things, I think, really improved things, both for contributors and for the release team. Thing number one was putting a strong emphasis on getting the test grid green well ahead of code freeze.

TIM PEPPER: Definitely.

JOSH BERKUS: Now partly that went well because we had a spectacular CI lead, Aish Sundar, who’s now in training to become the release lead.

TIM PEPPER: And I’d count that partly as one of the “Where were you lucky?” areas. We happened upon a wonderful person who just popped up and volunteered.

JOSH BERKUS: Yes. And then but part of that was also that we said, hey. You know, we’re not going to do what we’ve done before which is not really care about these tests until code slush. We’re going to care about these tests now.

And importantly— this is really important to the Kubernetes community— when we went to the various SIGs, the SIG Cluster Lifecycle and SIG Scalability and SIG Node and the other ones who were having test failures, and we said this to them. They didn’t say, get lost. I’m busy. They said, what’s failing?

CRAIG BOX: Great.

JOSH BERKUS: And so that made a big difference. And the second thing that was pretty much allowed by the first thing was to shorten the code freeze period. Because the code freeze period is frustrating for developers, because if they don’t happen to be working on a 1.11 feature, even if they worked on one before, and they delivered it early in the cycle, and it’s completely done, they’re kind of paralyzed, and they can’t do anything during code freeze. And so it’s very frustrating for them, and we want to make that period as short as possible. And we did that this time, and I think it helped everybody.

CRAIG BOX: What went poorly?

JOSH BERKUS: We had a lot of problems with flaky tests. We have a lot of old tests that are not all that well maintained, and they’re testing very complicated things like upgrading a cluster that has 40 nodes. And as a result, these tests have high failure rates that have very little to do with any change in the code.

And so one of the things that happened, and the reason we had a one-day delay in the release is, you know, we’re a week out from release, and just by random luck of the draw, a bunch of these tests all at once got a run of failures. And it turned out that run of failures didn’t actually mean anything, having anything to do with Kubernetes. But there was no way for us to tell that without a lot of research, and we were not going to have enough time for that research without delaying the release.

So one of the things we’re looking to address in the 1.12 cycle is to actually move some of those flaky tests out. Either fix them or move them out of the release blocking category.

TIM PEPPER: In a way, I think this also highlights one of the things that Josh mentioned that went well, the emphasis early on getting the test results green, it allows us to see the extent to which these flakes are such a problem. And then the unlucky occurrence of them all happening to overlap on a failure, again, highlights that these flakes have been called out in the community for quite some time. I mean, at least a year. I know one contributor who was really concerned about them.

But they became a second order concern versus just getting things done in the short term, getting features and proving that the features worked, and kind of accepting in a risk management way on the release team that, yes, those are flakes. We don’t have time to do something about them, and it’s OK. But because of the emphasis on keeping the test always green now, we have the luxury maybe to focus on improving these flakes, and really get to where we have truly high quality CI signal, and can really believe in the results that we have on an ongoing basis.

JOSH BERKUS: And having solved some of the more basic problems, we’re now seeing some of the other problems like coordination between related features. Like we right now have a feature where— and this is one of the sort of backwards compatibility release notes— where the feature went into beta, and is on by default.

And the second feature that was supposed to provide access control for the first feature did not go in as beta, and is not on by default. And the team for the first feature did not realize the second feature was being held up until two days before the release. So it’s going to result in us actually patching something in 1.11.1.

And so like, we put that into something that didn’t go well. But on the other hand, as Tim points out, a few release cycles ago, we wouldn’t even have identified that as a problem, because we were still struggling with just individual features having a clear idea well ahead of the release of what was going in and what wasn’t going in.

TIM PEPPER: I think something like this also is a case that maybe advocates for the use of feature branches. If these things are related, we might have seen it and done more pre-testing within that branch and pre-integration, and decide maybe to merge a couple of what initially had been disjoint features into a single feature branch, and really convince ourselves that together they were good. And cross all the Ts, dot all the Is on them, and not have something that’s gated on an alpha feature that’s possibly falling away.

CRAIG BOX: And then the final question, which I think you’ve both touched on a little. Where did you get lucky, or unlucky perhaps?

JOSH BERKUS: I would say number one where I got lucky is truly having a fantastic team. I mean, we just had a lot of terrific people who were very good and very energetic and very enthusiastic about taking on their release responsibilities including Aish and Tim and Ben and Nick and Misty who took over Docs four weeks into the release. And then went crazy with it and said, well, I’m new here, so I’m going to actually change a bunch of things we’ve been doing that didn’t work in the first place. So that was number one. I mean, that really made honestly all the difference.

And then the second thing, like I said, is that we didn’t have sort of major, unexpected monkey wrenches thrown at us. So in the 1.10 cycle, we actually had two of those, which is why I still count Jace as heroic for pulling off a release that was only a week late.

You know, number one was having the scalability tests start failing for unrelated reasons for a long period, which then masked the fact that they were actually failing for real reasons when we actually got them working again. And as a result, ending up debugging a major and super complicated scalability issue within days of what was supposed to be the original release date. So that was monkey wrench number one for the 1.10 cycle.

Monkey wrench number two for the 1.10 cycle was we got a security hole that needed to be patched. And so again, a week out from what was supposed to be the original release date, we were releasing a security update, and that security update required patching the release branch. And it turns out that patch against the release branch broke a bunch of incoming features. And we didn’t get anything of that magnitude in the 1.11 release, and I’m thankful for that.

TIM PEPPER: Also, I would maybe argue in a way that a portion of that wasn’t just luck. The extent to which this community has a good team, not just the release team but beyond, some of this goes to active work that folks all across the project, but especially in the contributor experience SIG are doing to cultivate a positive and inclusive culture here. And you really see that. When problems crop up, you’re seeing people jump on and really try to constructively tackle them. And it’s really fun to be a part of that.

Thanks to Josh Berkus and Tim Pepper for talking to the Kubernetes Podcast from Google.

Josh Berkus hangs out in #sig-release on the Kubernetes Slack. He maintains a newsletter called “Last Week in Kubernetes Development”, with Noah Kantrowitz. You can read him on Twitter at @fuzzychef, but he does warn you that there’s a lot of politics there as well.

Tim Pepper is also on Slack – he’s always open to folks reaching out with a question, looking for help or advice. On Twitter you’ll find him at @pythomit, which is “Timothy P” backwards. Tim is an avid soccer fan and season ticket holder for the Portland Timbers and the Portland Thorns, so you’ll get all sorts of opinions on soccer in addition to technology!

You can find the Kubernetes Podcast from Google at @kubernetespod on Twitter, and you can subscribe so you never miss an episode.

Source

Automated certificate provisioning in Kubernetes using kube-lego – Jetstack Blog

By Christian Simon

In this blog post, we are pleased to introduce Kube-Lego, an open source tool that automates Let’s Encrypt TLS certificates for web services running in Kubernetes.

TLS has become increasingly important for production deployment of web services. This has been driven by revelations of surveillance post-Snowden, as well as the fact that Google now favours secure HTTPS sites in search result rankings.

An important step towards increased adoption of TLS has been the availability of
Let’s Encrypt. It provides an easy, free-of-charge way to obtain certificates. Certificates are limited to a 90-day lifetime, and so the free certificate authority (CA) encourages full automation for ease-of-use. At the time of writing, Let’s Encrypt is approaching 3.5 million unexpired certificates, so adoption has certainly been strong.

Kube-Lego automates the process in Kubernetes by watching ingress resources and automatically requesting missing or expired TLS certificates from Let’s Encrypt.

In order to automate the process of verification and certificate issuance for
Let’s Encrypt’s CA, the ACME (Automated Certificate Management Environment)
protocol is used. It specifies an API that can be integrated into many products
that require publicly trusted certificates.

To interact with the CA’s ACME server, clients are required to
authenticate with a private/public key pair (account). This helps to identify
the user later for actions like extension or revocation of certificates. Let’s
Encrypt supports only domain validation and requires you to specify every
valid domain individually, so while a certificate can be valid for multiple
hostnames using SAN, there is currently no support for wildcard certificates.

Validation methods

Let’s Encrypt allows you to prove the validity of a certificate request
with four methods. They all use a so-called ‘key auth challenge response’, which
is derived from the account’s key pair.

  • Simple HTTP: The CA connects to the specified URL
    (http://<domain>/.well-known/acme-challenge/<token>) to verify the
    authenticity of a certificate request. The response of the HTTP server has to
    contain the key auth.
  • TLS-SNI: With this method, the CA connects to the requested domain
    name via HTTPS and selects a verification hostname of the form
    <digest-part-1>.<digest-part-2>.acme.invalid via SNI. The returned certificate
    is not verified; it only has to contain the verification hostname.
  • DNS: A TXT record at _acme-challenge.<domain> has to be published
    to verify the authenticity of your request via the DNS method. The content of
    this record has to include the key auth.
  • Proof of Possession of a Prior Key: If you already have a valid
    certificate for the domain name for which you want to request another
    certificate, you can use this method to get validated.

Kube-Lego brings fully automated TLS management to a Kubernetes cluster.
To achieve this it interfaces with the Kubernetes API on one side and an ACME
enabled CA on the other. Kube-Lego is written in Go and uses xenolf’s
ACME client implementation Lego for communicating with Let’s
Encrypt (this explains the project name). Currently, the only
implemented validation method is Simple HTTP.

Pre-requisites

To use Kube-Lego you need a working Kubernetes cluster. The minimum
version supported is 1.2, as this includes TLS support for ingress resources.
There are plenty of ways of getting Kubernetes bootstrapped; for instance, take
a look at this Getting Started Guide from the Kubernetes project.

Note: Jetstack will also soon open source its cluster provisioner tool.

Another requirement for using Kube-Lego is a supported ingress controller. The
only supported controller at the moment is the nginx-ingress-controller from
Kubernetes’ contrib project. The current release of the upstream controller
needs a simple modification to fully support Kube-Lego. There is already a pull
request filed to integrate this change into the next upstream release.
Meanwhile, you can use a modified build of the nginx-ingress-controller.

Before you can use Kube-Lego you have to make the nginx-ingress-controller pods
publicly accessible. This usually happens with a service resource of type
LoadBalancer. Depending on the environment the cluster is running in, this will
create an ELB/Forwarding Rule, and you can point the domains you wish to use at
that entry point into your cluster.
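
A sketch of such a Service is below; the selector label app: nginx-ingress-controller is an assumption and should match whatever labels your controller pods actually carry:

apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-lb
spec:
  type: LoadBalancer
  selector:
    app: nginx-ingress-controller   # assumption: match your controller's pod labels
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443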

Validity check

After starting up, Kube-Lego looks at all ingress objects in all namespaces in the Kubernetes cluster. If the ingress is annotated with kubernetes.io/tls-acme: "true", Kube-Lego will check the TLS configuration and make sure that the specified secret:

  • Exists and contains a valid private/public key pair;
  • Contains a certificate that has not expired;
  • Contains a certificate that covers all domain names specified in the ingress config (a manual version of these checks is sketched below).
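
A small sketch of that manual check, using the hello-world-tls secret from the example that follows:

~> kubectl get secret hello-world-tls -o jsonpath='{.data.tls\.crt}' \
     | base64 -d | openssl x509 -noout -subject -enddate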

Let’s take a look at the following example of an ingress resource:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: hello-world
  annotations:
    # enable kube-lego for this ingress
    kubernetes.io/tls-acme: "true"
spec:
  # this enables tls for the specified domain names
  tls:
  - hosts:
    - demo.kube-lego.jetstack.net
    secretName: hello-world-tls
  rules:
  - host: demo.kube-lego.jetstack.net
    http:
      paths:
      - path: /
        backend:
          serviceName: hello-world
          servicePort: 80

Certificate Request

Let’s assume we haven’t run Kube-Lego before, so neither the certificate nor
the user account exists. The Kube-Lego validity check comes to the conclusion
that it needs to request a certificate for the domain
demo.kube-lego.jetstack.net.

Before requesting the certificate, Kube-Lego sets up the challenge endpoint
(/.well-known/acme-challenge/) in a separate ingress resource named
kube-lego. This resource is meant to only be used by Kube-Lego and the endpoint
will be reachable over the public URL. This
makes sure that actual traffic can reach the cluster and we do not
unnecessarily try to validate with Let’s Encrypt.

Kube-Lego looks for the secret kube-lego-account; if it does not
exist, Kube-Lego creates it by registering with Let’s Encrypt. Finally, the
request for the certificate can be made with Let’s Encrypt. Kube-Lego
responds to the HTTP validation via the challenge endpoint and then finally
receives the certificate, which is stored into the secret hello-world-tls.

The following diagram illustrates this flow and the various interfaces.

Kube-Lego process
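
For completeness, a rough sketch of how Kube-Lego itself might be deployed is below; the image tag, the LEGO_EMAIL/LEGO_URL environment variable names, and the container port are assumptions to verify against the project README:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-lego
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kube-lego
    spec:
      containers:
      - name: kube-lego
        image: jetstack/kube-lego:0.1.3   # assumption: pin to a current release tag
        env:
        - name: LEGO_EMAIL                # assumption: verify variable names in the README
          value: you@example.com
        - name: LEGO_URL
          value: https://acme-v01.api.letsencrypt.org/directory
        ports:
        - containerPort: 8080             # assumption: default challenge-response port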

Demo

If you want to run these examples, you can always find the latest
version on GitHub.

A short demo was also part of the Kubernetes Community Hangout on June 2nd. See the recording here.

A screencast of an extended demo can be found here:

Screencast

Future work

This is a very early project and does not cover all common use cases.
Feel free to report any issues and enhancements via GitHub Issues. You can also
see some already identified issues there.

Source

How to Deploy Datadog on Rancher 2.0


Datadog is a popular hosted monitoring solution for aggregating and analyzing metrics and events for distributed systems. From infrastructure integrations to collaborative dashboards, Datadog gives you a clean single pane view into the information that is most important to you. Leveraging Datadog with Rancher can then give you a full stack view of all of your applications running on Kubernetes clusters, wherever they are hosted. To make Datadog easy to use with Rancher 2.0, we have modified the Datadog Helm chart to make it a simple deployment through Rancher’s catalog feature that will function across Rancher projects within a cluster.

Prerequisites

  1. Datadog API Key (you can use an existing secret with your API key, or let the chart make one for you).
  2. By default, Rancher Kubernetes Engine (RKE) does not allow unauthenticated access to the kubelet API, which Datadog relies on for many of its metrics. When installing the cluster with RKE, we need to provide extra arguments to the kubelet service:

    services:
      kubelet:
        extra_args:
          read-only-port: 10255

    NOTE: You should make sure this port is properly firewalled.

  3. A Kubernetes 1.8 cluster attached to a Rancher installation.

Setup & Configuration

  1. The Datadog Rancher Chart is available by default in the Rancher library; there is also a Datadog chart in Helm stable, but we suggest using the Rancher library chart for ease of use. The Rancher library is enabled by default; if disabled, this setting can be modified under Global->Catalogs.

Catalog

  2. The chart's configuration options have been made available through the UI in Rancher by adding a questions.yaml file. To learn more about them, please refer to the values.yaml file, which has additional information and links describing the variables.

Catalog

Dashboards

If you plan to send multiple clusters of data to the same Datadog endpoint, it’s useful to add the cluster name as a host tag (e.g. kube-cluster-name:CLUSTERNAME) when configuring the Helm chart. This will allow you to scope data to a specific cluster, as well as group data by cluster within a dashboard. In the dashboard below we have grouped node data by cluster in a few of the default widgets for the clusters ‘dash-1’ and ‘dash-2’.
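
For example, a values snippet along these lines could be used when launching the chart; the datadog.apiKey and datadog.tags keys are assumptions based on the upstream chart, and their exact names and formats should be checked against the values.yaml mentioned above:

datadog:
  apiKey: <DATADOG_API_KEY>
  # assumption: the tag key and its format may differ between chart versions
  tags: "kube-cluster-name:dash-1"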

Dashboard

Conclusion

Using Helm to deploy applications provides a tested, standardized deployment method. With the Rancher Catalog UI, Helm charts are even easier to use and configure. With the addition of the Datadog chart to the Rancher library, users can now leverage this workflow for one of the top enterprise ready solutions for monitoring and alerting with Kubernetes.

Kyle Rome

Kyle Rome is a Field Engineer for Rancher and has been working with Kubernetes for the past two years. He has a background in Distributed Systems Architecture and as a Java Software Engineer.

Source

11 Ways (Not) to Get Hacked

Author: Andrew Martin (ControlPlane)

Kubernetes security has come a long way since the project’s inception, but still contains some gotchas. Starting with the control plane, building up through workload and network security, and finishing with a projection into the future of security, here is a list of handy tips to help harden your clusters and increase their resilience if compromised.

The control plane is Kubernetes’ brain. It has an overall view of every container and pod running on the cluster, can schedule new pods (which can include containers with root access to their parent node), and can read all the secrets stored in the cluster. This valuable cargo needs protecting from accidental leakage and malicious intent: when it’s accessed, when it’s at rest, and when it’s being transported across the network.

1. TLS Everywhere

TLS should be enabled for every component that supports it to prevent traffic sniffing, verify the identity of the server, and (for mutual TLS) verify the identity of the client.

Note that some components and installation methods may enable local ports over HTTP and administrators should familiarize themselves with the settings of each component to identify potentially unsecured traffic.


This network diagram by Lucas Käldström demonstrates some of the places TLS should ideally be applied: between every component on the master, and between the Kubelet and API server. Kelsey Hightower‘s canonical Kubernetes The Hard Way provides detailed manual instructions, as does etcd’s security model documentation.

Autoscaling Kubernetes nodes was historically difficult, as each node requires a TLS key to connect to the master, and baking secrets into base images is not good practice. Kubelet TLS bootstrapping provides the ability for a new kubelet to create a certificate signing request so that certificates are generated at boot time.

2. Enable RBAC with Least Privilege, Disable ABAC, and Monitor Logs

Role-based access control provides fine-grained policy management for user access to resources, such as access to namespaces.

Kubernetes’ ABAC (Attribute Based Access Control) has been superseded by RBAC since release 1.6, and should not be enabled on the API server. Use RBAC instead:

--authorization-mode=RBAC

Or use this flag to disable it in GKE:

--no-enable-legacy-authorization

There are plenty of good examples of RBAC policies for cluster services, as well as in the docs. And it doesn’t have to stop there – fine-grained RBAC policies can be extracted from audit logs with audit2rbac.

Incorrect or excessively permissive RBAC policies are a security threat in case of a compromised pod. Maintaining least privilege, and continuously reviewing and improving RBAC rules, should be considered part of the “technical debt hygiene” that teams build into their development lifecycle.
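
As a minimal illustration of least privilege, the following namespace-scoped Role and RoleBinding (names are illustrative) grant a single user read-only access to pods and nothing else:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader              # illustrative name
rules:
- apiGroups: [""]               # "" refers to the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: default
  name: read-pods               # illustrative name
subjects:
- kind: User
  name: jane                    # illustrative user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io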

Audit Logging (beta in 1.10) provides customisable API logging at the payload level (e.g. request and response bodies) as well as at the metadata level. Log levels can be tuned to your organisation’s security policy – GKE provides sane defaults to get you started.

For read requests such as get, list, and watch, only the request object is saved in the audit logs; the response object is not. For requests involving sensitive data such as Secret and ConfigMap, only the metadata is exported. For all other requests, both request and response objects are saved in audit logs.
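
A minimal audit policy sketch along those lines might look like this (using the audit.k8s.io/v1beta1 API that was current at the time; tune the rules to your own policy):

apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
# Secrets and ConfigMaps: record metadata only, never the payload.
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
# Read requests: record the request object but not the response.
- level: Request
  verbs: ["get", "list", "watch"]
# Everything else: record both request and response objects.
- level: RequestResponse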

Don’t forget: keeping these logs inside the cluster is a security threat in case of compromise. These, like all other security-sensitive logs, should be transported outside the cluster to prevent tampering in the event of a breach.

3. Use Third Party Auth for API Server

Centralising authentication and authorisation across an organisation (aka Single Sign On) helps onboarding, offboarding, and consistent permissions for users.

Integrating Kubernetes with third party auth providers (like Google or GitHub) uses the remote platform’s identity guarantees (backed up by things like 2FA) and prevents administrators from having to reconfigure the Kubernetes API server to add or remove users.

Dex is an OpenID Connect Identity (OIDC) and OAuth 2.0 provider with pluggable connectors. Pusher takes this a stage further with some custom tooling, and there are some other helpers available with slightly different use cases.

4. Separate and Firewall your etcd Cluster

etcd stores information on state and secrets, and is a critical Kubernetes component – it should be protected differently from the rest of your cluster.

Write access to the API server’s etcd is equivalent to gaining root on the entire cluster, and even read access can be used to escalate privileges fairly easily.

The Kubernetes scheduler will search etcd for pod definitions that do not have a node. It then sends the pods it finds to an available kubelet for scheduling. Validation for submitted pods is performed by the API server before it writes them to etcd, so malicious users writing directly to etcd can bypass many security mechanisms – e.g. PodSecurityPolicies.

etcd should be configured with peer and client TLS certificates, and deployed on dedicated nodes. To mitigate against private keys being stolen and used from worker nodes, the etcd cluster can also be firewalled so that only the API server can reach it.

5. Rotate Encryption Keys

A security best practice is to regularly rotate encryption keys and certificates, in order to limit the “blast radius” of a key compromise.

Kubernetes will rotate some certificates automatically (notably, the kubelet client and server certs) by creating new CSRs as its existing credentials expire.

However, the symmetric encryption keys that the API server uses to encrypt etcd values are not automatically rotated – they must be rotated manually. Master access is required to do this, so managed services (such as GKE or AKS) abstract this problem from an operator.
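
For reference, here is a sketch of the encryption provider configuration the API server consumed at the time of writing (passed to the API server via its encryption provider config flag). Rotation means adding a new key at the head of the list, restarting the API server(s), re-encrypting the stored secrets, and only then removing the old key:

kind: EncryptionConfig
apiVersion: v1
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key2                         # new key, listed first so it is used for writes
        secret: <base64-encoded 32-byte key>
      - name: key1                         # old key, kept so existing data can still be read
        secret: <base64-encoded 32-byte key>
  - identity: {}                           # fallback for data written before encryption was enabled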

With minimum viable security on the control plane the cluster is able to operate securely. But, like a ship carrying potentially dangerous cargo, the ship’s containers must be protected to contain that cargo in the event of an unexpected accident or breach. The same is true for Kubernetes workloads (pods, deployments, jobs, sets, etc.) – they may be trusted at deployment time, but if they’re internet-facing there’s always a risk of later exploitation. Running workloads with minimal privileges and hardening their runtime configuration can help to mitigate this risk.

6. Use Linux Security Features and PodSecurityPolicies

The Linux kernel has a number of overlapping security extensions (capabilities, SELinux, AppArmor, seccomp-bpf) that can be configured to provide least privilege to applications.

Tools like bane can help to generate AppArmor profiles, and docker-slim for seccomp profiles, but beware – a comprehensive test suite is required to exercise all code paths in your application when verifying the side effects of applying these policies.

PodSecurityPolicies can be used to mandate the use of security extensions and other Kubernetes security directives. They provide a minimum contract that a pod must fulfil to be submitted to the API server – including security profiles, the privileged flag, and the sharing of host network, process, or IPC namespaces.

These directives are important, as they help to prevent containerised processes from escaping their isolation boundaries, and Tim Allclair‘s example PodSecurityPolicy is a comprehensive resource that you can customise to your use case.
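
A cut-down sketch of such a policy is shown below; the example linked above is far more complete and should be preferred as a starting point:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-example               # illustrative name
spec:
  privileged: false                      # no privileged containers
  allowPrivilegeEscalation: false
  hostNetwork: false                     # no sharing of host namespaces
  hostPID: false
  hostIPC: false
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:                               # only non-host volume types
  - configMap
  - secret
  - emptyDir
  - persistentVolumeClaim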

7. Statically Analyse YAML

Where PodSecurityPolicies deny access to the API server, static analysis can also be used in the development workflow to model an organisation’s compliance requirements or risk appetite.

Sensitive information should not be stored in pod-type YAML resources (deployments, pods, sets, etc.), and sensitive configmaps and secrets should be encrypted with tools such as vault (with CoreOS’s operator), git-crypt, sealed secrets, or cloud provider KMS.

Static analysis of YAML configuration can be used to establish a baseline for runtime security. kubesec generates risk scores for resources:

{
  "score": -30,
  "scoring": {
    "critical": [{
      "selector": "containers[] .securityContext .privileged == true",
      "reason": "Privileged containers can allow almost completely unrestricted host access"
    }],
    "advise": [{
      "selector": "containers[] .securityContext .runAsNonRoot == true",
      "reason": "Force the running image to run as a non-root user to ensure least privilege"
    }, {
      "selector": "containers[] .securityContext .capabilities .drop",
      "reason": "Reducing kernel capabilities available to a container limits its attack surface",
      "href": "https://kubernetes.io/docs/tasks/configure-pod-container/security-context/"
    }]
  }
}
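
For illustration, a Deployment whose containers follow that advice might set a securityContext like this (names and image are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                            # illustrative
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myorg/myapp:1.0.0         # illustrative image
        securityContext:
          privileged: false
          runAsNonRoot: true
          readOnlyRootFilesystem: true
          capabilities:
            drop: ["ALL"]                # drop all kernel capabilities not explicitly needed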

And kubetest is a unit test framework for Kubernetes configurations:

#// vim: set ft=python:
def test_for_team_label():
    if spec["kind"] == "Deployment":
        labels = spec["spec"]["template"]["metadata"]["labels"]
        assert_contains(labels, "team", "should indicate which team owns the deployment")

test_for_team_label()

These tools “shift left” (moving checks and verification earlier in the development cycle). Security testing in the development phase gives users fast feedback about code and configuration that may be rejected by a later manual or automated check, and can reduce the friction of introducing more secure practices.

8. Run Containers as a Non-Root User

Containers that run as root frequently have far more permissions than their workload requires which, in case of compromise, could help an attacker further their attack.

Containers still rely on the traditional Unix security model (called discretionary access control or DAC) – everything is a file, and permissions are granted to users and groups.

User namespaces are not enabled in Kubernetes. This means that a container’s user ID table maps to the host’s user table, and running a process as the root user inside a container runs it as root on the host. Although we have layered security mechanisms to prevent container breakouts, running as root inside the container is still not recommended.

Many container images use the root user to run PID 1 – if that process is compromised, the attacker has root in the container, and any mis-configurations become much easier to exploit.

Bitnami has done a lot of work moving their container images to non-root users (especially as OpenShift requires this by default), which may ease a migration to non-root container images.
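
On the workload side, a pod can opt in to a non-root identity explicitly; here is a minimal sketch (the UID and image are illustrative, and the image must be built to run as a non-root user):

apiVersion: v1
kind: Pod
metadata:
  name: myapp                            # illustrative
spec:
  securityContext:
    runAsUser: 1000                      # any non-zero UID the image supports
    fsGroup: 2000
  containers:
  - name: myapp
    image: myorg/myapp:1.0.0             # illustrative image
    securityContext:
      runAsNonRoot: true                 # the kubelet refuses to start the container as UID 0
      allowPrivilegeEscalation: false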

This PodSecurityPolicy snippet prevents running processes as root inside a container, and also escalation to root:

# Required to prevent escalations to root.
allowPrivilegeEscalation: false
runAsUser:
  # Require the container to run without root privileges.
  rule: 'MustRunAsNonRoot'

Non-root containers cannot bind to the privileged ports under 1024 (this is gated by the CAP_NET_BIND_SERVICE kernel capability), but services can be used to disguise this fact. In this example the fictional MyApp application is bound to port 8443 in its container, but the service exposes it on 443 by proxying the request to the targetPort:

kind: Service
apiVersion: v1
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 443
    targetPort: 8443

This need to run workloads as a non-root user is not going to change until user namespaces are usable, or until the ongoing work to run containers without root lands in container runtimes.

9. Use Network Policies

By default, Kubernetes networking allows all pod-to-pod traffic; this can be restricted using a NetworkPolicy.

Traditional services are restricted with firewalls, which use static IP and port ranges for each service. As these IPs very rarely change they have historically been used as a form of identity. Containers rarely have static IPs – they are built to fail fast, be rescheduled quickly, and use service discovery instead of static IP addresses. These properties mean that firewalls become much more difficult to configure and review.

As Kubernetes stores all its system state in etcd it can configure dynamic firewalling – if it is supported by the CNI networking plugin. Calico, Cilium, kube-router, Romana, and Weave Net all support network policy.

It should be noted that these policies fail-closed, so the absence of a podSelector here defaults to a wildcard:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector:
Here’s an example NetworkPolicy that denies egress to destinations outside the cluster: the selected pods may still send DNS queries (UDP 53) and talk to pods in any namespace, but all other external egress is denied. Because it sets only the Egress policy type, it does not restrict inbound connections to the application. NetworkPolicies are stateful, so the replies to outbound requests still reach the application.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: myapp-deny-external-egress
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: 53
      protocol: UDP
  - to:
    - namespaceSelector: {}

Kubernetes network policies cannot be applied to DNS names. This is because DNS can resolve round-robin to many IPs, or dynamically based on the calling IP, so network policies can be applied only to a fixed IP or to a podSelector (for dynamic Kubernetes IPs).

Best practice is to start by denying all traffic for a namespace and incrementally add routes to allow an application to pass its acceptance test suite. This can become complex, so ControlPlane hacked together netassert – network security testing for DevSecOps workflows with highly parallelised nmap:

k8s: # used for Kubernetes pods
  deployment: # only deployments currently supported
    test-frontend: # pod name, defaults to `default` namespace
      test-microservice: 80 # `test-microservice` is the DNS name of the target service
      test-database: -80 # `test-frontend` should not be able to access test-database's port 80
      169.254.169.254: -80, -443 # AWS metadata API
      metadata.google.internal: -80, -443 # GCP metadata API

    new-namespace:test-microservice: # `new-namespace` is the namespace name
      test-database.new-namespace: 80 # longer DNS names can be used for other namespaces
      test-frontend.default: 80
      169.254.169.254: -80, -443 # AWS metadata API
      metadata.google.internal: -80, -443 # GCP metadata API

Cloud provider metadata APIs are a constant source of escalation (as the recent Shopify bug bounty demonstrates), so specific tests confirming that the APIs are blocked on the container network help to guard against accidental misconfiguration.

10. Scan Images and Run IDS

Web servers present an attack surface to the network they’re attached to: scanning an image’s installed files helps to ensure the absence of known vulnerabilities that an attacker could exploit to gain remote access to the container. An IDS (Intrusion Detection System) detects intrusions if an attacker gets in anyway.

Kubernetes permits pods into the cluster through a series of admission controller gates, which are applied to pods and other resources like deployments. These gates can validate each pod for admission or change its contents, and they now support backend webhooks.

These webhooks can be used by container image scanning tools to validate images before they are deployed to the cluster. Images that have failed checks can be refused admission.
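
As a sketch, the registration for such a validating webhook might look like the following (the webhook name, service, and path are hypothetical placeholders for whatever your scanning tool provides):

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: image-scan-policy                # illustrative name
webhooks:
- name: image-scan.example.com           # hypothetical webhook name
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  failurePolicy: Fail                    # reject pods if the scanner cannot be reached
  clientConfig:
    service:
      namespace: image-scanning          # hypothetical namespace and service
      name: image-scan-webhook
      path: /validate
    caBundle: <base64-encoded CA certificate>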

Scanning container images for known vulnerabilities can reduce the window of time that an attacker can exploit a disclosed CVE. Free tools such as CoreOS’s Clair and Aqua’s Micro Scanner should be used in a deployment pipeline to prevent the deployment of images with critical, exploitable vulnerabilities.

Tools such as Grafeas can store image metadata for constant compliance and vulnerability checks against a container’s unique signature (a content addressable hash). This means that scanning a container image with that hash is the same as scanning the images deployed in production, and can be done continually without requiring access to production environments.

Unknown Zero Day vulnerabilities will always exist, and so intrusion detection tools such as Twistlock, Aqua, and Sysdig Secure should be deployed in Kubernetes. IDS detects unusual behaviours in a container and pauses or kills it – Sysdig’s Falco is an Open Source rules engine, and an entrypoint to this ecosystem.

The next stage of security’s “cloud native evolution” looks to be the service mesh, although adoption may take time – migration involves shifting complexity from applications to the mesh infrastructure, and organisations will be keen to understand best-practice.

11. Run a Service Mesh

A service mesh is a web of encrypted persistent connections, made between high performance “sidecar” proxy servers like Envoy and Linkerd. It adds traffic management, monitoring, and policy – all without microservice changes.

Offloading microservice security and networking code to a shared, battle tested set of libraries was already possible with Linkerd, and the introduction of Istio by Google, IBM, and Lyft, has added an alternative in this space. With the addition of SPIFFE for per-pod cryptographic identity and a plethora of other features, Istio could simplify the deployment of the next generation of network security.

In “Zero Trust” networks there may be no need for traditional firewalling or Kubernetes network policy, as every interaction occurs over mTLS (mutual TLS), ensuring that both parties are not only communicating securely, but that the identity of both services is known.

This shift from traditional networking to Cloud Native security principles is not one we expect to be easy for those with a traditional security mindset, and the Zero Trust Networking book from SPIFFE’s Evan Gilman is a highly recommended introduction to this brave new world.

Istio 0.8 LTS is out, and the project is rapidly approaching a 1.0 release. Its stability versioning is the same as the Kubernetes model: a stable core, with individual APIs identifying themselves under their own alpha/beta stability namespace. Expect to see an uptick in Istio adoption over the coming months.

Cloud Native applications have a more fine-grained set of lightweight security primitives to lock down workloads and infrastructure. The power and flexibility of these tools is both a blessing and a curse – with insufficient automation it has become easier to expose insecure workloads which permit breakouts from the container or its isolation model.

There are more defensive tools available than ever, but caution must be taken to reduce attack surfaces and the potential for misconfiguration.

However, if security slows down an organisation’s pace of feature delivery, it will never be a first-class citizen. Applying Continuous Delivery principles to the software supply chain allows an organisation to achieve compliance, continuous audit, and enforced governance without impacting the business’s bottom line.

Iterating quickly on security is easiest when supported by a comprehensive test suite. This is achieved with Continuous Security – an alternative to point-in-time penetration tests, with constant pipeline validation ensuring an organisation’s attack surface is known, and the risk constantly understood and managed.

This is ControlPlane’s modus operandi: if we can help kickstart a Continuous Security discipline, deliver Kubernetes security and operations training, or co-implement a secure cloud native evolution for you, please get in touch.

Andrew Martin is a co-founder at @controlplaneio and tweets about cloud native security at @sublimino

Source

Introducing Navigator – Jetstack Blog

By James Munnelly

Today we are proud to introduce Navigator, a centralised controller for managing the lifecycle of complex distributed applications. It intends to be the central control point for creating, updating, managing and monitoring stateful databases and services with Kubernetes.

Navigator is open source and extensible from day one. We launch today with support for Elasticsearch in alpha, with Couchbase support soon to land in the next few weeks, and more planned.

We’ll also be working closely with the Service Catalog Special Interest Group (SIG) to make it even easier to bring the power of Navigator to your applications by implementing the Open Service Broker API. You can read more about service catalog on GitHub.

Containerisation is quickly becoming the standard in modern enterprises. At Jetstack, we’ve seen a massive influx of organisations wishing to embrace containers and Kubernetes, to reduce costs, improve reliability and standardise the way in which they deploy their systems.

So far, this has not been so true for databases. A combination of mistrust and immaturity in today’s orchestration systems leads teams to utilise vendor-provided solutions such as RDS, Cloudant and Cloud SQL (to name but a few). These services work well, and push the onus of database management to a third-party. However, it can also mean that you are now locked-in to a particular cloud vendor, data interoperability is restricted to the feature set provided, and you are susceptible to business decisions which may not best align with your interests.

There have been a number of projects that attempt to simplify deployment of stateful services in containers, but few attempt to deliver an end-to-end solution that can deploy, manage and monitor your services. We want to make common databases and distributed applications first-class citizens in Kubernetes. We’ve developed Navigator to assume the role of database administrator within your own organisation, building on the tried-and-tested foundations of Kubernetes.

Navigator is the product of Jetstack’s experiences deploying stateful services on Kubernetes. For over a year, we have worked closely with database vendors and customers alike, and evolved our approach. We build heavily on the Operator model, but with a number of differences, including support for multiple applications, whilst offering the same rich feature set for each.

In order to allow Navigator to support many different applications, it’s designed to remove application-specific knowledge from the Navigator itself. Instead, Pilots are programmed to directly interact with your system in order to trigger application-level actions in response to scale-up, scale-down or any other cluster-level action. Navigator is then able to communicate directly with these Pilots in order to instruct them as to any actions that must be taken.

Navigator architecture

Because the Pilots are able to communicate with Navigator, as well as control the database process, they are able to interrupt scale events and inform Navigator of any state changes within the application that may occur. Without this model, accurate and precise management of the application’s state would not be possible.

You can get started with Navigator by following our quick-start guide on GitHub.

We’d love to hear your feedback, and we welcome contributions to either Navigator itself or the Pilots.

Source

Deploying Istio on a Kubernetes Cluster using Rancher 2.0

 

Service mesh is a new technology stack aimed at solving the connectivity problem between cloud native applications. If you want to build a cloud native application, you need a service mesh. One of the big players in the service mesh world is Istio. Istio is best described on its own about page. It’s a very promising service mesh solution, based on Envoy Proxy, with multiple tech giants contributing to it.

Below is an overview of how you can deploy Istio using Rancher 2.0.

Istio currently works best with Kubernetes, but its maintainers are working to bring support for other platforms too. To deploy Istio and demonstrate some of its capabilities, you need a Kubernetes cluster; setting one up is pretty easy using Rancher 2.0.

Prerequisites

To perform this demo, you will need the following:

  • a Google Cloud Platform account; the free tier provided is more than enough;
  • one Ubuntu 16.04 instance (this is where the Rancher instance will be running);
  • a Kubernetes cluster deployed to Google Cloud Platform, using Google Kubernetes Engine. This demo uses version 1.10.5-gke.2, which is the latest available at the time of writing;
  • Istio version 0.8.0, the latest available at the time of this writing.

Normally the steps provided should be valid with newer versions, too.

Starting a Rancher 2.0 instance

To begin, start a Rancher 2.0 instance. There’s a very intuitive getting started guide for this purpose here. Just to be sure you’ll get the information you need, the steps will be outlined below as well.

This example will use Google Cloud Platform, so let’s start an Ubuntu instance there and allow HTTP and HTTPS traffic to it, either via the Console or the CLI. Here are example commands to achieve the above:

gcloud compute --project=rancher-20 instances create rancher-20 \
  --zone=europe-west2-a --machine-type=n1-standard-1 \
  --tags=http-server,https-server --image=ubuntu-1604-xenial-v20180627 \
  --image-project=ubuntu-os-cloud

gcloud compute --project=rancher-20 firewall-rules create default-allow-http \
  --direction=INGRESS --priority=1000 --network=default --action=ALLOW \
  --rules=tcp:80 --source-ranges=0.0.0.0/0 --target-tags=http-server

gcloud compute --project=rancher-20 firewall-rules create default-allow-https \
  --direction=INGRESS --priority=1000 --network=default --action=ALLOW \
  --rules=tcp:443 --source-ranges=0.0.0.0/0 --target-tags=https-server

Make sure you have at least 1 vCPU and about 4GB of RAM available for the Rancher instance.

The next step is to ssh into the instance and install Docker. Once Docker is installed, start Rancher and verify that it’s running:

$ sudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher
Unable to find image ‘rancher/rancher:latest’ locally
latest: Pulling from rancher/rancher
6b98dfc16071: Pull complete
4001a1209541: Pull complete
6319fc68c576: Pull complete
b24603670dc3: Pull complete
97f170c87c6f: Pull complete
c5880aba2145: Pull complete
de3fa5ee4e0d: Pull complete
c973e0300d3b: Pull complete
d0f63a28838b: Pull complete
b5f0c036e778: Pull complete
Digest: sha256:3f042503cda9c9de63f9851748810012de01de380d0eca5f1f296d9b63ba7cd5
Status: Downloaded newer image for rancher/rancher:latest
2f496a88b82abaf28e653567d8754b3b24a2215420967ed9b817333ef6d6c52f
$ sudo docker ps
CONTAINER ID   IMAGE             COMMAND                  CREATED              STATUS          PORTS                                      NAMES
2f496a88b82a   rancher/rancher   "rancher --http-list…"   About a minute ago   Up 59 seconds   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   elegant_volhard

Get the public IP address of the instance and point your browser to it:

$ gcloud compute instances describe rancher-20 --project=rancher-20 --format="value(networkInterfaces[0].accessConfigs[0].natIP)"
35.189.72.39

You should be redirected to an HTTPS page of Rancher and you should see a warning from your browser, because Rancher uses a self-signed certificate. Ignore the warnings, because this is the instance that you have started (never do that on untrusted sites!), and proceed to set up Rancher 2.0 by setting the admin password and server URL. That’s it – you now have Rancher 2.0 running. Now it’s time to start your Kubernetes cluster.

Starting a Kubernetes Cluster

To start a Kubernetes cluster, you’ll need a Google Cloud Service Account with the following Roles attached to it: Compute Viewer, Kubernetes Engine Admin, Service Account User, Project Viewer. Afterwards you need to generate service account keys, as described here.

Now get your service account keys (it’s safe to use the default Compute Engine service account); you will need your service account keys to start a Kubernetes cluster using Rancher 2.0:

gcloud iam service-accounts keys create ./key.json \
  --iam-account <SA-NAME>@developer.gserviceaccount.com

Note the <SA-NAME>@developer.gserviceaccount.com value; you will need it later.

Now you’re ready to start your cluster. Go to the Rancher dashboard and click on Add Cluster. Make sure you do the following:
* select Google Container Engine for the hosted Kubernetes provider;
* give your cluster a name, for example rancher-demo;
* import or copy/paste the service account key details from the key.json file (generated above) into the Service Account field;

Proceed with Configure Nodes option and select the following:
* for Kubernetes Version, it should be safe to select the latest available version; this test was done with version 1.10.5-gke.2;
* select the zone that is closest to you;
* Machine Type needs to be at least n1-standard-1;
* for Istio Demo, the Node Count should be at least 4;

Once these are selected, your setup would look like the image below:

Rancher add cluster

Click with confidence on Create

After several minutes you should see your cluster as active in the Rancher dashboard. Remember that <SA-NAME>@developer.gserviceaccount.com value? You need it now to grant cluster admin permissions to the current user (admin permissions are required to create the necessary RBAC rules for Istio). To do that, click on the rancher-demo cluster name in the Rancher dashboard; this will take you to the rancher-demo cluster dashboard, which should look similar to the image below:

rancher-demo Cluster Dashboard

Now click Launch kubectl; this will open a kubectl command line for this particular cluster. You can also export the Kubeconfig File to use with your locally installed kubectl, but for this purpose it’s enough to use the command line provided by Rancher. Once you have the command line open, run the following command there:

> kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole=cluster-admin \
    --user=<SA-NAME>@developer.gserviceaccount.com

clusterrolebinding "cluster-admin-binding" created
>

Deploying Istio on Rancher

Istio has a Helm package and Rancher can consume that Helm package and install Istio. To get the official Istio Helm package, it’s best to add Istio’s repository to the Rancher Apps Catalog. To do that, go to the Rancher Global View, then to the Catalogs option, and select Add Catalog. Fill it in as follows:
* for the name, let’s use istio-github;
* in Catalog URL, paste the following URL: https://github.com/istio/istio.git (Rancher works with anything git clone can handle);
* the Branch field should now let you enter a branch name; set it to master.
It should look like the screenshot below:

Rancher add Istio Helm Catalog

Hit Create

At this stage, you should be able to deploy Istio from Rancher’s Catalog. To do that, go to the Default project of the rancher-demo cluster and select Catalog Apps there. Once you click on Launch, you will be presented with a number of default available applications. As this demo is about Istio, select the istio-github catalog that you’ve just created from All Catalogs. This will present you with two options: istio and istio-remote. Select View Details for the istio one. You’ll be presented with the options to deploy Istio. Select the following:
* let’s set the name to istio-demo;
* leave the template version at 0.8.0;
* the default namespace used for Istio is istio-system, so set the namespace to istio-system;
* by default, Istio doesn’t encrypt traffic between its components; that’s a very nice feature to have, so let’s enable it. On the same topic, Istio’s Helm chart doesn’t install Grafana by default, which is very useful to have, so let’s add that too. This is done by setting the global.controlPlaneSecurityEnabled and grafana.enabled variables to true (the equivalent plain chart values are shown after the screenshot below). To do this:
  - click Add Answer;
  - enter the variable name global.controlPlaneSecurityEnabled;
  - set its Value to true;
  - do the same for grafana.enabled;

All of the above should look like the screenshot below:

Deploy Istio from Rancher Catalog
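
If you prefer to think of those two answers as plain chart values (for example in a values.yaml passed to Helm directly, shown here as a hedged equivalent), they map to:

global:
  controlPlaneSecurityEnabled: true   # mutual TLS between Istio's own components
grafana:
  enabled: true                       # install the bundled Grafana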

Everything looks good, click on Launch

Now if you look at the Workloads tab, you should see all the components of Istio spinning up in your cluster. Make sure all of the workloads are green. Also check the Load Balancing tab; you should have istio-ingress and istio-ingressgateway there, both in the Active state.

In case istio-ingressgateway is stuck in the Pending state, you need to apply the istio-ingressgateway service once again. To do that:
* click on Import Yaml;
* for Import Mode, select Cluster: Direct import of any resources into this cluster;
* copy/paste istio-demo-ingressgateway.yaml Service into the Import Yaml editor and hit Import:

This step should solve the Pending problem with istio-ingressgateway.

You should now check that all of Istio’s Workloads, Load Balancing and Service Discovery parts are green in the Rancher dashboard.

One last thing: so that the Istio sidecar container is injected automatically into your pods, run the following kubectl command (you can launch kubectl from inside Rancher, as described above) to add the istio-injection label to your default namespace:

> kubectl label namespace default istio-injection=enabled
namespace “default” labeled
> kubectl get namespace -L istio-injection
NAME STATUS AGE ISTIO-INJECTION
cattle-system Active 1h
default Active 1h enabled
istio-system Active 37m
kube-public Active 1h
kube-system Active 1h
>

This label will make sure that the Istio sidecar injector automatically injects Envoy containers into your application pods.

Deploying Bookinfo sample app

Now you can deploy a test application and try out the power of Istio. To do that, let’s deploy the Bookinfo sample application. The interesting part of this application is that it has three versions of the reviews app running at the same time. Here’s where we can see some of Istio’s features.
Go to the rancher-demo Default project workloads to deploy the Bookinfo app:
* click on Import Yaml;
* download the following bookinfo.yaml to your local computer;
* upload it to Rancher by using the Read from file option, after you enter the Import Yaml menu;
* for the Import Mode select Cluster: Direct import of any resources into this cluster;
* click on Import

This should add 6 more workloads to your rancher-demo Default project. Just like in the screenshot below:

Rancher Bookinfo Workloads

Now to expose the Bookinfo app via Istio, you need to apply this bookinfo-gateway.yaml the same way as the bookinfo.yaml.
At this moment, you can access the Bookinfo app with your browser. Get the external IP address of the istio-ingressgateway load balancer. There are several ways to find this IP address. From Rancher, you can go to Load Balancing, and from the right-hand side menu select View in API, just like in the screenshot below:

View Load Balancer in API

It should open in a new browser tab; search there for publicEndpoints -> addresses and you should see the public IP address.
Another way is via kubectl:

> export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
> echo $INGRESS_HOST

Point your browser to http://${INGRESS_HOST}/productpage and you should see the Bookinfo app. If you refresh your page multiple times, you should see three different versions of the Book Reviews part:
- the first one with no stars;
- the second one with black stars;
- the third one with red stars.

Using Istio, you can limit your app to route only to the first version of the reviews app. To do that, import the route-rule-all-v1.yaml into Rancher, wait for a couple of seconds, and then refresh the page multiple times. You should no longer see any stars on the reviews.

Another example is to route traffic only to a set of users. If you import route-rule-reviews-test-v2.yaml into Rancher and log in to the Bookinfo app with the username jason (no password needed), you should see only version 2 of the reviews (the one with the black stars). Logging out will again show you only version 1 of the reviews app.

The power provided by Istio can already be seen. Of course, there are many more possibilities with Istio. With this setup created, you can play around with the tasks provided in Istio’s documentation.

Istio’s telemetry

Now it’s time to dive into the even more useful features of Istio – the metrics provided by default.

Let’s start with Grafana. The grafana.enabled variable, which was set to true when we deployed Istio, created a Grafana instance configured to collect Istio’s metrics and display them in several dashboards. By default Grafana’s service isn’t exposed publicly, so to view the metrics you first need to expose the service on a public IP address. There’s also the option to expose the service using a NodePort, but this would require you to open that NodePort on all of the nodes in the Google Cloud Platform firewall, and that’s one more task to deal with, so it’s simpler to just expose it via a public IP address.

To do this, go to the Workloads under the rancher-demo Default project and select the Service Discovery tab. After all the work already done on the cluster, there should be about 5 services in the default namespace and 12 services in the istio-system namespace, all in the Active state. Select the grafana service and, from the right-hand side menu, select View/Edit YAML, just like in the image below:

Rancher change grafana service

Find the line that says type: ClusterIP, change it to type: LoadBalancer, and confidently click Save. Rancher will now provision a load balancer in Google Cloud Platform and expose Grafana there on its default port, 3000.
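
After the edit, the relevant part of the grafana service should look roughly like the trimmed sketch below (the generated service carries more fields, and the exact ports and selector may differ between Istio chart versions):

apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: istio-system
spec:
  selector:
    app: grafana
  ports:
  - port: 3000                        # Grafana's default port
    protocol: TCP
    targetPort: 3000
  type: LoadBalancer                  # changed from ClusterIP to expose Grafana publicly

To get the public IP address of Grafana, repeat the process used to find the IP address for the bookinfo example: either view the grafana service in the API, where you can find the IP address, or get it via kubectl: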

export GRAFANA_HOST=$(kubectl -n istio-system get service grafana -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $GRAFANA_HOST

Point your browser to http://${GRAFANA_HOST}:3000/ and select one of the dashboards, for example the Istio Service Dashboard. With the previously applied configuration, we limited traffic to version 1 of the reviews app only. To see that on the graphs, select reviews.default.svc.cluster.local from the Service dropdown. Now generate some traffic from Rancher’s kubectl, using the following commands:

export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
for i in {1..100}; do curl -o /dev/null -s -w "%{http_code}\n" http://${INGRESS_HOST}/productpage; sleep 0.2; done  # loop count and output format are illustrative

Wait about 5 minutes for the traffic to show up in Grafana; after that, the dashboard should look like this:

Grafana Istio Service Dashboard

If you scroll down a little on the dashboard, under SERVICE WORKLOADS you can clearly see on the Incoming Requests by Destination And Response Code graph that requests for the reviews app end up only on the v1 endpoint. Now generate some requests to version 2 of the app with the following command (remember that the user jason has access to v2 of the reviews app):

export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
for i in {1..100}; do curl -o /dev/null -s -w "%{http_code}\n" --cookie "user=jason" http://${INGRESS_HOST}/productpage; sleep 0.2; done  # loop count and output format are illustrative

You should see requests appearing for the v2 app too, just like in the below screenshot:

Grafana Istio Services Graph for v1 and v2

In the same manner, you can expose and view the other default metrics available from Istio, such as Prometheus, Tracing and ServiceGraph.

Some final thoughts

As you have already seen, Istio is a very powerful and useful service mesh platform. It will surely become an essential tool in the Cloud Native world. The main problem at the moment is that it is not yet production ready. To quote the one and only @kelseyhightower: “Don’t run out of here and deploy it in production. You’ll be on the news.” Anyway, you should definitely keep an eye on it, as it won’t take long until it becomes production ready.

As for Rancher 2.0, it is very useful for seeing the whole Kubernetes cluster state: all the workloads, services and pods. It provides an easy way to manage the cluster via the web UI and to deploy apps via Helm charts, even for someone who isn’t very familiar with Kubernetes. With Rancher 2.0 you have everything you need to manage a Kubernetes cluster and get a great overview of its state, and I’m sure the team at Rancher will keep adding more useful features.

Roman Doroschevici

github

Source

Kubernetes Wins the 2018 OSCON Most Impact Award

Authors: Brian Grant (Principal Engineer, Google) and Tim Hockin (Principal Engineer, Google)

We are humbled to be recognized by the community with this award.

We had high hopes when we created Kubernetes. We wanted to change the way cloud applications were deployed and managed. Whether we’d succeed or not was very uncertain. And look how far we’ve come in such a short time.

The core technology behind Kubernetes was informed by lessons learned from Google’s internal infrastructure, but nobody can deny the enormous role of the Kubernetes community in the success of the project. The community, of which Google is a part, now drives every aspect of the project: the design, development, testing, documentation, releases, and more. That is what makes Kubernetes fly.

While we actively sought partnerships and community engagement, none of us anticipated just how important the open-source community would be, how fast it would grow, or how large it would become. Honestly, we really didn’t have much of a plan.

We looked to other open-source projects for inspiration and advice: Docker (now Moby), other open-source projects at Google such as Angular and Go, the Apache Software Foundation, OpenStack, Node.js, Linux, and others. But it became clear that there was no clear-cut recipe we could follow. So we winged it.

Rather than rehashing history, we thought we’d share two high-level lessons we learned along the way.

First, in order to succeed, community health and growth needs to be treated as a top priority. It’s hard, and it is time-consuming. It requires attention to both internal project dynamics and outreach, as well as constant vigilance to build and sustain relationships, be inclusive, maintain open communication, and remain responsive to contributors and users. Growing existing contributors and onboarding new ones is critical to sustaining project growth, but that takes time and energy that might otherwise be spent on development. These things have to become core values in order for contributors to keep them going.

Second, start simple with how the project is organized and operated, but be ready to adopt more scalable approaches as it grows. Over time, Kubernetes has transitioned from what was effectively a single team and git repository to many subgroups (Special Interest Groups and Working Groups), sub-projects, and repositories; from manual processes to fully automated ones; from informal policies to formal governance.

We certainly didn’t get everything right or always adapt quickly enough, and we constantly struggle with scale. At this point, Kubernetes has more than 20,000 contributors and is approaching one million comments on its issues and pull requests, making it one of the fastest moving projects in the history of open source.

Thank you to all our contributors and to all the users who’ve stuck with us on the sometimes bumpy journey. This project would not be what it is today without the community.

Source

Couchbase on OpenShift and Kubernetes – Jetstack Blog

By Matthew Bates

Jetstack are pleased to open source a proof-of-concept sidecar for deployment of managed Couchbase clusters on OpenShift. The project is the product of a close engineering collaboration with Couchbase, Red Hat and Amadeus, and a demo was presented at the recent Red Hat Summit in Boston, MA.

This project provides a sidecar container that can be used alongside official Couchbase images to provide a scalable and flexible Couchbase deployment for OpenShift and Kubernetes. The sidecars manage cluster lifecycle, including registering new nodes into the Couchbase cluster, automatically triggering cluster rebalances, and handling migration of data given a scale-down or node failure event.

Couchbase Server is a NoSQL document database with a distributed architecture for performance, scalability, and availability. It enables developers to build applications easier and faster by leveraging the power of SQL with the flexibility of JSON.

In recent versions of OpenShift (and the upstream Kubernetes project), there has been significant advancement in a number of the building blocks required for deployment of distributed applications. Notably:

  • StatefulSet: (nee PetSet, and now in technical preview as of OpenShift 3.5) provides unique and stable identity and storage to pods, and guarantees deployment order and scaling. This is in contrast to a Deployment or ReplicaSet, where pod replicas do not maintain identity across restart/rescheduling and may have the same volume storage properties – hence, these resources are suited to stateless applications.
  • Dynamic volume provisioning: first introduced in technology preview in 3.1.1, and now GA in 3.3, this feature enables storage to be dynamically provisioned ‘on-demand’ in a supported cloud environment (e.g. AWS, GCP, OpenStack). The StatefulSet controller automatically creates requests for storage (PersistentVolumeClaim – PVC) per pod, and the storage is provisioned (PersistentVolume – PV). The unique 1-to-1 binding between PV and PVC ensures a pod is always reunited with its same volume, even if it is scheduled on another node after a failure. (A trimmed StatefulSet example follows this list.)
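
To make that concrete, a trimmed StatefulSet sketch for a Couchbase-style workload with a per-pod volume claim template might look like this (the apps/v1beta1 API matches the OpenShift 3.5 era; names, image tag and sizes are illustrative):

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: couchbase                     # illustrative name
spec:
  serviceName: couchbase              # headless service providing stable network identity
  replicas: 3
  template:
    metadata:
      labels:
        app: couchbase
    spec:
      containers:
      - name: couchbase
        image: couchbase/server:5.0.1 # illustrative tag
        volumeMounts:
        - name: data
          mountPath: /opt/couchbase/var
  volumeClaimTemplates:               # one dynamically provisioned PV/PVC pair per pod
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard      # illustrative storage class
      resources:
        requests:
          storage: 10Gi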

Whilst OpenShift (or Kubernetes), by utilising its generic concepts of StatefulSet and dynamic volume provisioning, will make sure the right pods are scheduled and running, it cannot account for Couchbase-specific requirements in its decision making process. For example, registering new nodes when scaling up, rebalancing and also handling migration of data at a scale-down or on node failure. The pod and node events are well-known to OpenShift/Kubernetes, but the actions required are very much database-specific.

In this PoC, we’ve codified the main Couchbase cluster lifecycle operations into a sidecar container that sits alongside a standard Couchbase container in a pod. The sidecar uses the APIs of OpenShift/Kubernetes and Couchbase to determine cluster state, and it will safely and appropriately respond to Couchbase cluster events, such as scale-up/down and node failure.

For instance, the sidecar can respond to the following events (a sketch of how these hooks might be wired into a pod spec follows the list):

  • Scale-up: the sidecar determines the node is new to the cluster and it is initialized and joined to the cluster. This prompts a rebalance.
  • Scale-down: a preStop hook (pre-container shutdown) is executed and the sidecar safely removes the node from the cluster, rebalancing as necessary.
  • Readiness: the sidecar connects to the local Couchbase container and determines its health. The result of the readiness check is used to determine service availability in OpenShift.
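
The sketch below shows how such hooks might be attached; the sidecar image name, preStop command and readiness endpoint are hypothetical placeholders, so consult the project’s README for the real values:

apiVersion: v1
kind: Pod
metadata:
  name: couchbase-node                # illustrative name
spec:
  containers:
  - name: couchbase
    image: couchbase/server:5.0.1     # illustrative tag
  - name: couchbase-sidecar
    image: jetstack/couchbase-sidecar:poc                # hypothetical image name
    lifecycle:
      preStop:
        exec:
          command: ["/couchbase-sidecar", "remove-node"] # hypothetical command
    readinessProbe:
      httpGet:
        path: /readiness              # hypothetical endpoint served by the sidecar
        port: 8080
      periodSeconds: 10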

Experiment with the open source sidecar

The proof-of-concept sidecar has now been open sourced at https://github.com/jetstack-experimental/couchbase-sidecar. At this repository, find instructions on how to get started with OpenShift (and Kubernetes too with a Helm chart). Feedback and contributions are welcome, but please note that this is strictly a proof-of-concept and should not be used in production. We look forward to future versions, in which the sidecar will be improved and extended, and battle-tested at scale, in a journey to a production release. Let us know what you think!

Source