Announcing Submariner, Multi-Cluster Network Connectivity for Kubernetes

Today we are proud to announce Submariner, a new open-source project enabling network connectivity between Kubernetes clusters. We launched the project to provide network connectivity for microservices deployed in multiple Kubernetes clusters that need to communicate with each other. This new solution overcomes barriers to connectivity between Kubernetes clusters and allows for a host of new multi-cluster implementations, such as database replication within Kubernetes across geographic regions and deploying service mesh across clusters.

Organizations are looking to Kubernetes as the standard computing platform across all public and private cloud infrastructure. Submariner allows these organizations to seamlessly connect, scale, and migrate workloads across Kubernetes clusters deployed on any cloud.

Network Connectivity Across Clusters with Submariner

Historically, Kubernetes deployments have implemented network virtualization, enabling containers running on multiple nodes within the same cluster to communicate with each other. Containers running in different Kubernetes clusters, however, have had to communicate with each other through ingress controllers or node ports. Submariner creates the tunnels and routes needed to enable containers in different Kubernetes clusters to connect directly. Key features of Submariner include:

  • Compatibility and connectivity with existing clusters: Users can deploy Submariner into existing Kubernetes clusters, with the addition of Layer-3 network connectivity between pods in different clusters.
  • Secure paths: Encrypted network connectivity is implemented using IPsec tunnels.
  • Various connectivity mechanisms: While IPsec is the default connectivity mechanism out of the box, Rancher will enable different interconnectivity plugins in the near future.
  • Centralized broker: Users can register and maintain a set of healthy gateway nodes.
  • Flexible service discovery: Submariner provides service discovery across multiple Kubernetes clusters.
  • CNI compatibility: Works with popular CNI drivers such as Flannel and Calico.

Developers who are interested in downloading, installing and playing with this new networking solution should visit https://submariner.io or follow the project on https://github.com/rancher/submariner. Enterprises who need assistance in deploying and managing Submariner can contact info@rancher.com.

Source

Considerations When Designing Distributed Systems

Today’s applications are marvels of distributed systems development. Each function or service that makes up an application may execute on a different system, based upon a different system architecture, housed in a different geographic location, and written in a different computer language. Components of today’s applications might be hosted on a powerful system carried in the owner’s pocket and communicating with application components or services that are replicated in data centers all over the world.

What’s amazing about this is that individuals using these applications typically are not aware of the complex environment that responds to their request for the local time, the local weather, or directions to their hotel.

Let’s pull back the curtain and look at the industrial sorcery that makes this all possible and contemplate the thoughts and guidelines developers should keep in mind when working with this complexity.

The Evolution of System Design

Figure 1: Evolution of system design over time

Source: Interaction Design Foundation, The Social Design of Technical Systems: Building technologies for communities

Application development has come a long way from the time that programmers wrote out applications, hand compiled them into the language of the machine they were using, and then entered individual machine instructions and data directly into the computer’s memory using toggle switches.

As processors became more and more powerful, system memory and online storage capacity increased, and computer networking capability dramatically increased, approaches to development also changed. Data can now be transmitted from one side of the planet to the other faster than it used to be possible for early machines to move data from system memory into the processor itself!

Let’s look at a few highlights of this amazing transformation.

Monolithic Design

Early computer programs were based upon a monolithic design, with all of the application components architected to execute on a single machine. This meant that functions such as the user interface (if users were actually able to interact with the program), application rules processing, data management, storage management, and network management (if the computer was connected to a network) were all contained within the program.

While simpler to write, these programs became increasingly complex, difficult to document, and hard to update or change. At the time, the machines themselves represented the biggest cost to the enterprise, so applications were designed to make the best possible use of the machines.

Client/Server Architecture

As processors became more powerful, system and online storage capacity increased, and data communications became faster and more cost-efficient, application design evolved to keep pace. Application logic was refactored or decomposed into components, each of which could execute on a different machine, with the ever-improving networking inserted between them. This allowed some functions to migrate to the lowest-cost computing environment available at the time. The evolution flowed through the following stages:

Terminals and Terminal Emulation

Early distributed computing relied on special-purpose user access devices called terminals. Applications had to understand the communications protocols they used and issue commands directly to the devices. When inexpensive personal computing (PC) devices emerged, the terminals were replaced by PCs running a terminal emulation program.

At this point, all of the components of the application were still hosted on a single mainframe or minicomputer.

Light Client

As PCs became more powerful, supported larger internal and online storage, and network performance increased, enterprises segmented or factored their applications so that the user interface was extracted and executed on a local PC. The rest of the application continued to execute on a system in the data center.

Often these PCs were less costly than the terminals that they replaced. They also offered additional benefits. These PCs were multi-functional devices. They could run office productivity applications that weren’t available on the terminals they replaced. This combination drove enterprises to move to client/server application architectures when they updated or refreshed their applications.

Midrange Client

PC evolution continued at a rapid pace. Once more powerful systems with larger storage capacities were available, enterprises took advantage of them by moving even more processing away from the expensive systems in the data center out to the inexpensive systems on users’ desks. At this point, the user interface and some of the computing tasks were migrated to the local PC.

This allowed the mainframes and minicomputers (now called servers) to have a longer useful life, thus lowering the overall cost of computing for the enterprise.

Heavy client

As PCs became more and more powerful, more application functions were migrated from the backend servers. At this point, everything but the data and storage management functions had been migrated.

Enter the Internet and the World Wide Web

The public internet and the World Wide Web emerged at this time. Client/server computing continued to be used. In an attempt to lower overall costs, some enterprises began to re-architect their distributed applications so they could use standard internet protocols to communicate and substituted a web browser for the custom user interface function. Later, some of the application functions were rewritten in JavaScript so that they could execute locally on the client’s computer.

Server Improvements

Industry innovation wasn’t focused solely on the user side of the communications link. A great deal of improvement was made to the servers as well. Enterprises began to harness together the power of many smaller, less expensive industry standard servers to support some or all of their mainframe-based functions. This allowed them to reduce the number of expensive mainframe systems they deployed.

Soon, remote PCs were communicating with a number of servers, each supporting their own component of the application. Special-purpose database and file servers were adopted into the environment. Later, other application functions were migrated into application servers.

Networking was another area of intense industry focus. Enterprises began deploying special-purpose networking servers: firewalls and other security functions, file-caching servers to accelerate data access for their applications, email servers, web servers, web application servers, and distributed name servers that kept track of and controlled user credentials for data and application access. The list of networking services that have been encapsulated in appliance servers grows all the time.

Object-Oriented Development

The rapid change in PC and server capabilities, combined with the dramatic price reduction for processing power, memory, and networking, had a significant impact on application development. No longer were hardware and software the biggest IT costs. The largest costs were communications, IT services (the staff), power, and cooling.

Software development, maintenance, and IT operations took on a new importance and the development process was changed to reflect the new reality that systems were cheap and people, communications, and power were increasingly expensive.

Figure 2: Worldwide IT spending forecast

Source: Gartner Worldwide IT Spending Forecast, Q1 2018

Enterprises looked to improved data and application architectures as a way to make the best use of their staff. Object-oriented applications and development approaches were the result. Many programming languages such as the following supported this approach:

  • C++
  • C#
  • COBOL
  • Java
  • PHP
  • Python
  • Ruby

Application developers were forced to adapt by becoming more systematic when defining and documenting data structures. This approach also made maintaining and enhancing applications easier.

Open-Source Software

Opensource.com offers the following definition for open-source software: “Open source software is software with source code that anyone can inspect, modify, and enhance.” It goes on to say that, “some software has source code that only the person, team, or organization who created it — and maintains exclusive control over it — can modify. People call this kind of software ‘proprietary’ or ‘closed source’ software.”

Only the original authors of proprietary software can legally copy, inspect, and alter that software. And in order to use proprietary software, computer users must agree (often by accepting a license displayed the first time they run this software) that they will not do anything with the software that the software’s authors have not expressly permitted. Microsoft Office and Adobe Photoshop are examples of proprietary software.

Although open-source software has been around since the very early days of computing, it came to the forefront in the 1990s when complete open-source operating systems, virtualization technology, development tools, database engines, and other important functions became available. Open-source technology is often a critical component of web-based and distributed computing. Among others, the open-source offerings in the following categories are popular today:

  • Development tools
  • Application support
  • Databases (flat file, SQL, No-SQL, and in-memory)
  • Distributed file systems
  • Message passing/queueing
  • Operating systems
  • Clustering

Distributed Computing

The combination of powerful systems, fast networks, and the availability of sophisticated software has driven major application development away from monolithic towards more highly distributed approaches. Enterprises have learned, however, that sometimes it is better to start over than to try to refactor or decompose an older application.

When enterprises undertake the effort to create distributed applications, they often discover a few pleasant side effects. A properly designed application, that has been decomposed into separate functions or services, can be developed by separate teams in parallel.

Rapid application development and deployment, also known as DevOps, emerged as a way to take advantage of the new environment.

Service-Oriented Architectures

As the industry evolved beyond client/server computing models to an even more distributed approach, the phrase “service-oriented architecture” emerged. This approach was built on distributed systems concepts, standards in message queuing and delivery, and XML messaging as a standard approach to sharing data and data definitions.

Individual application functions are repackaged as network-oriented services: each receives a message requesting a specific service, performs that service, and sends the response back to the function that requested it.

This approach offers another benefit, the ability for a given service to be hosted in multiple places around the network. This offers both improved overall performance and improved reliability.

Workload management tools were developed that receive requests for a service, review the available capacity, forward the request to the service with the most available capacity, and then send the response back to the requester. If a specific service doesn’t respond in a timely fashion, the workload manager simply forwards the request to another instance of the service. It would also mark the service that didn’t respond as failed and wouldn’t send additional requests to it until it received a message indicating that it was still alive and healthy.
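
To make that flow concrete, here is a minimal Python sketch of the dispatch-and-failover logic described above. The instance names and the capacity check are placeholders, not a real workload manager:

```python
import random

class WorkloadManager:
    """Toy dispatcher: forward each request to the healthy instance with the
    most spare capacity, and stop routing to instances that fail until they
    are marked healthy again."""

    def __init__(self, instances):
        # instances: mapping of name -> callable(request) that may raise on failure
        self.instances = instances
        self.failed = set()

    def spare_capacity(self, name):
        # Placeholder: a real manager would query the instance's own metrics.
        return random.random()

    def mark_healthy(self, name):
        self.failed.discard(name)

    def dispatch(self, request):
        candidates = [n for n in self.instances if n not in self.failed]
        # Try instances in order of reported spare capacity.
        for name in sorted(candidates, key=self.spare_capacity, reverse=True):
            try:
                return self.instances[name](request)
            except Exception:
                # Unresponsive or failing: mark it failed and try the next one.
                self.failed.add(name)
        raise RuntimeError("no healthy instance available")

# Two fake service instances standing in for remote services.
manager = WorkloadManager({
    "svc-a": lambda req: f"svc-a handled {req}",
    "svc-b": lambda req: f"svc-b handled {req}",
})
print(manager.dispatch("get-report"))
```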

What Are the Considerations for Distributed Systems

Now that we’ve walked through over 50 years of computing history, let’s consider some rules of thumb for developers of distributed systems. There’s a lot to think about, because a distributed solution is likely to have components or services executing in many places and on different types of systems, and messages must be passed back and forth to perform work. Care and consideration are absolute requirements for success in creating these solutions. Expertise must also be available for each type of host system, development tool, and messaging system in use.

Nailing Down What Needs to Be Done

One of the first things to consider is what needs to be accomplished! While this sounds simple, it’s incredibly important.

It’s amazing how many developers start building things before they know, in detail, what is needed. Often, this means that they build unnecessary functions and waste their time. To quote Yogi Berra, “If you don’t know where you are going, you’ll end up someplace else.”

A good place to start is knowing what needs to be done, what tools and services are already available, and what people using the final solution should see.

Interactive Versus Batch

Since fast responses and low latency are often requirements, it would be wise to consider what should be done while the user is waiting and what can be put into a batch process that executes on an event-driven or time-driven schedule.

After the initial segmentation of functions has been considered, it is wise to plan when background batch processes need to execute, what data those functions manipulate, and how to make sure the functions are reliable and available when needed and that no data is lost.

Where Should Functions Be Hosted?

Only after the “what” has been planned in fine detail, should the “where” and “how” be considered. Developers have their favorite tools and approaches and often will invoke them even if they might not be the best choice. As Bernard Baruch was reported to say, “if all you have is a hammer, everything looks like a nail.”

It is also important to be aware of corporate standards for enterprise development. It isn’t wise to select a tool simply because it is popular at the moment. That tool just might do the job, but remember that everything that is built must be maintained. If you build something that only you can understand or maintain, you may just have tied yourself to that function for the rest of your career. I have personally created functions that worked properly and were small and reliable. I received telephone calls regarding these for ten years after I left that company because later developers could not understand how the functions were implemented. The documentation I wrote had been lost long earlier.

Each function or service should be considered separately in a distributed solution. Should the function be executed in an enterprise data center, in the data center of a cloud services provider, or perhaps in both? Consider that there are regulatory requirements in some industries that direct the selection of where and how data must be maintained and stored.

Other considerations include:

  • What type of system should host that function? Is one system architecture better for that function? Should the system be based upon ARM, X86, SPARC, Precision, Power, or even a mainframe?
  • Does a specific operating system provide a better computing environment for this function? Would Linux, Windows, UNIX, System I, or even System Z be a better platform?
  • Is a specific development language better for that function? What about a specific type of data management tool? Would a flat file, SQL database, No-SQL database, or non-structured storage mechanism be better?
  • Should the function be hosted in a virtual machine or a container to facilitate function mobility, automation and orchestration?

Virtual machines executing Windows or Linux were frequently the choice in the early 2000s. While they offered significant isolation for functions and made it easily possible to restart or move them when necessary, their processing, memory and storage requirements were rather high. Containers, another approach to processing virtualization, are the emerging choice today because they offer similar levels of isolation, the ability to restart and migrate functions and consume far less processing power, memory or storage.

Performance

Performance is another critical consideration. While defining the functions or services that make up a solution, developers should be aware of which ones have significant processing, memory, or storage requirements. It might be wise to look at these functions closely to learn if they can be further subdivided or decomposed.

Further segmentation would allow an increase in parallelization, which could offer performance improvements. The trade-off, of course, is that this approach also increases complexity and, potentially, makes the functions harder to manage and to secure.

Reliability

In high stakes enterprise environments, solution reliability is essential. The developer must consider when it is acceptable to force people to re-enter data, re-run a function, or when a function can be unavailable.

Database developers ran into this issue in the 1960s and developed the concept of an atomic function. That is, the function must complete or the partial updates must be rolled back leaving the data in the state it was in before the function began. This same mindset must be applied to distributed systems to ensure that data integrity is maintained even in the event of service failures and transaction disruptions.

Functions must be designed to totally complete or roll back intermediate updates. In critical message passing systems, messages must be stored until an acknowledgement that a message has been received comes in. If such a message isn’t received, the original message must be resent and a failure must be reported to the management system.
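
As an illustration, here is a small Python sketch of the complete-or-roll-back idea, using a hypothetical in-memory account transfer. It is not a real transaction manager, just the pattern described above:

```python
class AtomicOperation:
    """Collect undo steps as work proceeds; if anything fails before the
    operation completes, run the undo steps in reverse so the data returns
    to the state it was in before the operation began."""

    def __init__(self):
        self._undo = []

    def do(self, action, undo):
        action()
        self._undo.append(undo)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            for undo in reversed(self._undo):
                undo()  # roll back the partial updates
        return False    # re-raise the original error, if any

# Hypothetical usage: move credit between two in-memory accounts.
accounts = {"a": 100, "b": 0}
try:
    with AtomicOperation() as tx:
        tx.do(lambda: accounts.__setitem__("a", accounts["a"] - 150),
              lambda: accounts.__setitem__("a", accounts["a"] + 150))
        if accounts["a"] < 0:
            raise ValueError("insufficient funds")
        tx.do(lambda: accounts.__setitem__("b", accounts["b"] + 150),
              lambda: accounts.__setitem__("b", accounts["b"] - 150))
except ValueError:
    pass
print(accounts)  # {'a': 100, 'b': 0} -- the partial debit was rolled back
```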

Manageability

Although not as much fun to consider as the core application functionality, manageability is a key factor in the ongoing success of the application. All distributed functions must be fully instrumented to allow administrators to both understand the current state of each function and to change function parameters if needed. Distributed systems, after all, are constructed of many more moving parts than the monolithic systems they replace. Developers must be constantly aware of making this distributed computing environment easy to use and maintain.

Security

Distributed system security is an order of magnitude more difficult than security in a monolithic environment. Each function must be made secure separately, and the communication links between and among the functions must also be made secure. As the network grows in size and complexity, developers must consider how to control access to functions, how to make sure that only authorized users can access these functions, and how to isolate services from one another.

Security is a critical element that must be built into every function, not added on later. Unauthorized access to functions and data must be prevented and reported.
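
As a simple illustration of building the check into the function itself, here is a hedged sketch using Flask; the route, token, and logging details are placeholders for whatever identity system your environment actually uses:

```python
from flask import Flask, request, jsonify, abort

app = Flask(__name__)

# Hypothetical allow-list; a real service would validate tokens against a
# central identity provider rather than a hard-coded set.
AUTHORIZED_TOKENS = {"token-for-billing-service"}

def require_authorized_caller():
    token = request.headers.get("Authorization", "")
    if token.startswith("Bearer "):
        token = token[len("Bearer "):]
    if token not in AUTHORIZED_TOKENS:
        # Report and reject the unauthorized access attempt.
        app.logger.warning("unauthorized call to %s", request.path)
        abort(403)

@app.route("/orders/<order_id>")
def get_order(order_id):
    require_authorized_caller()  # each function enforces its own access check
    return jsonify({"order_id": order_id, "status": "shipped"})

if __name__ == "__main__":
    app.run()
```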

Privacy

Privacy is the subject of an increasing number of regulations around the world. Examples like the European Union’s GDPR and the U.S. HIPAA regulations are important considerations for any developer of customer-facing systems.

Mastering Complexity

Developers must take the time to consider how all of the pieces of a complex computing environment fit together. It is hard to maintain the discipline that a service should encapsulate a single function or, perhaps, a small number of tightly interrelated functions. If a given function is implemented in multiple places, maintaining and updating that function can be hard. What would happen when one instance of a function doesn’t get updated? Finding that error can be very challenging.

This means it is wise for developers of complex applications to maintain a visual model that shows where each function lives so it can be updated if regulations or business requirements change.

Often this means that developers must take the time to document what they did, when changes were made, as well as what the changes were meant to accomplish so that other developers aren’t forced to decipher mounds of text to learn where a function is or how it works.

To be successful as an architect of distributed systems, a developer must be able to master complexity.

Approaches Developers Must Master

Developers must master decomposing and refactoring application architectures, thinking in terms of teams, and growing their skill in approaches to rapid application development and deployment (DevOps). After all, they must be able to think systematically about which functions are independent of one another and which functions rely on the output of other functions to work. Functions that rely upon one another may be best implemented as a single service. Implementing them as independent functions might create unnecessary complexity, result in poor application performance, and impose an unnecessary burden on the network.

Virtualization Technology Covers Many Bases

Virtualization is a far bigger category than just virtual machine software or containers. Both of these functions are considered processing virtualization technology. There are at least seven different types of virtualization technology in use in modern applications today. Virtualization technology is available to enhance how users access applications, where and how applications execute, where and how processing happens, how networking functions, where and how data is stored, how security is implemented, and how management functions are accomplished. The following model of virtualization technology might be helpful to developers when they are trying to get their arms around the concept of virtualization:

Figure 3: Architecture of virtualized systems

Source: 7 Layer Virtualization Model, VirtualizationReview.com

Think of Software-Defined Solutions

It is also important for developers to think in terms of “software defined” solutions. That is, to segment the control from the actual processing so that functions can be automated and orchestrated.

Developers shouldn’t feel like they are on their own when wading into this complex world. Suppliers and open-source communities offer a number of powerful tools. Various forms of virtualization technology can be a developer’s best friend.

Virtualization Technology Can Be Your Best Friend

  • Containers make it possible to easily develop functions that can execute without interfering with one another and can be migrated from system to system based upon workload demands.
  • Orchestration technology makes it possible to control many functions to ensure they are performing well and are reliable. It can also restart or move them in a failure scenario.
  • Virtualization supports incremental development: functions can be developed in parallel and deployed as they are ready. They can also be updated with new features without requiring changes elsewhere.
  • Virtualization supports highly distributed systems: functions can be deployed locally in the enterprise data center or remotely in the data center of a cloud services provider.

Think In Terms of Services

This means that developers must think in terms of services and how services can communicate with one another.

Well-Defined APIs

Well-defined APIs mean that multiple teams can work simultaneously and still know that everything will fit together as planned. This typically means a bit more work up front, but it is well worth it in the end. Why? Because overall development can be faster. It also makes documentation easier.
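
For example, a well-defined contract can be as simple as a shared set of typed request and response shapes. The names below are illustrative only; the point is that both teams code against the contract rather than against each other’s internals:

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass(frozen=True)
class QuoteRequest:
    customer_id: str
    product_id: str
    quantity: int

@dataclass(frozen=True)
class QuoteResponse:
    unit_price: float
    currency: str

class PricingService(Protocol):
    """The agreed-upon contract: any implementation must expose quote()."""
    def quote(self, request: QuoteRequest) -> QuoteResponse: ...

# Team A builds the real implementation later; Team B can start immediately
# against a stub that satisfies the same contract.
class StubPricingService:
    def quote(self, request: QuoteRequest) -> QuoteResponse:
        return QuoteResponse(unit_price=9.99, currency="USD")
```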

Support Rapid Application Development

This approach is also perfect for rapid application development and rapid prototyping, also known as DevOps. Properly executed, DevOps also produces rapid time to deployment.

Think In Terms of Standards

Rather than relying on a single vendor, the developer of distributed systems would be wise to think in terms of multi-vendor, international standards. This approach avoids vendor lock-in and makes finding expertise much easier.

Summary

It’s interesting to note how guidelines for rapid application development and deployment of distributed systems start with “take your time.” It is wise to plan out where you are going and what you are going to do; otherwise, you are likely to end up somewhere else, having burned through your development budget with little to show for it.

Source

Microservices vs. Monolithic Architectures

Enterprises are increasingly pressured by competitors and their own customers to get applications working and online more quickly while also minimizing development costs. These divergent goals have forced enterprise IT organizations to evolve rapidly. After undergoing one forced evolution after another since the 1960s, many are prepared to take the step away from monolithic application architectures and embrace the microservices approach.

Figure 1: Architecture differences between traditional monolithic applications and microservices

Higher Expectations and More Empowered Customers

Customers that are used to having worldwide access to products and services now expect enterprises to quickly respond to whatever other suppliers are doing.

CIO magazine, in reporting upon Ovum’s research, pointed out:

“Customers now have the upper hand in the customer journey. With more ways to shop and less time to do it, they don’t just gather information and complete transactions quickly. They often want to get it done on the go, preferably on a mobile device, without having to engage in drawn-out conversations.”

IT Under Pressure

This intense worldwide competition also forces enterprises to find new ways to cut costs and to be more efficient. Developers have seen this all before. This is just the newest iteration of the perennial call to “do more with less” that enterprise IT has faced for more than a decade. They have learned that even when IT budgets grow, the investments are often in new IT services or better communications.

Figure 2: Forecasted 2018 worldwide IT spending growth

Source: Gartner Market Databook, 4Q17

As enterprise IT organizations face pressure to respond, they have had to revisit their development processes. The traditional two-year development cycle, previously acceptable, is no longer satisfactory. There is simply no time for that now.

A Confluence of Trends

Enterprise IT has also been forced to respond to a confluence of trends that are divergent and contradictory.

  • The introduction of inexpensive but high-performance network connectivity that allows distributed functions to communicate with one another across the network as fast as processes previously could communicate with one another inside of a single system.
  • The introduction of powerful microprocessors that offer mainframe-class performance in inexpensive and small packages. After standardizing on the X86 microprocessor architecture, enterprises are now being forced to consider other architectures to address their need for higher performance, lower cost, and both lower power consumption and heat production.
  • Internal system memory capacity continues to increase making it possible to deploy large-scale applications or application components in small systems.
  • External storage use is evolving away from the use of rotating media to solid state devices to increase capability, reduce latency, decrease overall cost, and deliver enormous capacity.
  • The evolution of open-source software and distributed computing functions make it possible for the enterprise to inexpensively add a herd of systems when new capabilities are needed rather than facing an expensive and time-consuming forklift upgrade to expand a central host system.
  • Customers demand instant and easy access to applications and data.

As enterprises address these trends, they soon discover that the approach that they had been relying on — focusing on making the best use of expensive systems and networks — needs to change. The most significant costs are now staffing, power, and cooling. This is in addition to the evolution they made nearly two decades ago when their focus shifted from monolithic mainframe computing to distributed, X86-based midrange systems.

The Next Steps in a Continuing Saga

Here’s what enterprise IT has done to respond to all of these trends.

They are choosing to move from using the traditional waterfall development approach to various forms of rapid application development. They also are moving away from compiled languages to interpreted or incrementally compiled languages such as Java, Python, or Ruby to improve developer productivity.

IDC, for example, predicts that:

“By 2021 65% of CIOs will expand agile/DevOps practices into the wider business to achieve the velocity necessary for innovation, execution, and change.”

Complex applications are increasingly designed as independent functions or “services” that can be hosted in several places on the network to improve both performance and application reliability. This approach means that it is possible to address changing business requirements as well as to add new features in one function without having to change anything else in parallel. NetworkWorld’s Andy Patrizio pointed out in his predictions for 2019 that he expects “Microservices and serverless computing take off.”

Another important change is that these services are being hosted in geographically distributed enterprise data centers, in the cloud, or both. Furthermore, functions can now reside in a customer’s pocket or in some combination of cloud-based or corporate systems.

What Does This Mean for You?

Addressing these trends means that enterprise developers and operations staff have to make some serious changes to their traditional approach including the following:

  • Developers must be willing to learn technologies that better fit today’s rapid application development methodology. An experienced “student” can learn quickly through online schools. For example, Learnpython.org offers free courses in Python, while Codecademy offers free courses in Ruby, Java, and other languages.
  • They must also be willing to learn how to decompose application logic from a monolithic, static design to a collection of independent, but cooperating, microservices. Online courses are available for this too. One example of a course designed to help developers learn to “think in microservices” comes from IBM. Other courses are available from Lynda.com.
  • Developers must adopt new tools for creating and maintaining microservices that support quick and reliable communication between them. The use of various commercial and open-source messaging and management tools can help in this process. Rancher Labs, for example, offers open-source software for delivering Kubernetes-as-a-Service.
  • Operations professionals need to learn orchestration tools for containers and Kubernetes to understand how they allow teams to quickly develop and improve applications and services without losing control over data and security. Operations staff have long been the gatekeepers of enterprise data centers. After all, they may find their positions on the line if applications slow down or fail.
  • Operations staff must allow these functions to be hosted outside of the data centers they directly control. To make that point, analysts at Market Research Future recently published a report saying that, “the global cloud microservices market was valued at USD 584.4 million in 2017 and is expected to reach USD 2,146.7 million by the end of the forecast period with a CAGR of 25.0%”.
  • Application management and security issues must now be part of developers’ thinking. Once again, online courses are available to help individuals to develop expertise in this area. LinkedIn, for example, offers a course in how to become an IT Security Specialist.

It is important for both IT and operations staff to understand that the world of IT is moving rapidly and everyone must be focused on upgrading their skills and enhancing their expertise.

How Do Microservices Benefit the Enterprise?

This latest move to distributed computing offers a number of real and measurable benefits to the enterprise. Development time and cost can be sharply reduced after the IT organization incorporates this form of distributed computing. Afterwards, each service can be developed in parallel and refined as needed without requiring an entire application to be stopped or redesigned.

The development organization can focus on developer productivity and still bring new application functions or applications online quickly. The operations organization can focus on defining acceptable rules for application execution and allowing the orchestration and management tools to enforce them.

What New Challenges Do Enterprises Face?

Like any approach to IT, the adoption of a microservices architecture will include challenges as well as benefits.

Monitoring and managing many “moving parts” can be more challenging than dealing with a few monolithic applications. The adoption of an enterprise management framework can help address these challenges. Security in this type of distributed computing needs to be top of mind as well. As the number of independent functions grows on the network, each must be analyzed and protected.

Should All Monolithic Applications Migrate to Microservices?

Some monolithic applications can be difficult to change. This may be due to technological challenges or may be due to regulatory constraints. Some components in use today may have come from defunct suppliers, making changes difficult or impossible.

It can be both time consuming and costly for the organization to go through a complete audit process. Often, organizations continue investing in older applications much longer than is appropriate in the belief that they’re saving money.

It is possible to evaluate what a monolithic application does to learn if some individual functions can be separated and run as smaller, independent services. These can be implemented either as cloud-based services or as container-based microservices.

Rather than waiting and attempting to address older technology as a whole, it may be wise to undertake a series of incremental changes to make enhancing or replacing an established system more acceptable. This is very much like the old proverb, “the best time to plant a tree was 20 years ago. The second best time is now.”

Is the Change Worth It?

Enterprises that have made the move towards the adoption of microservices-based application architectures have commented that their IT costs are often reduced. They also often point out that once their team mastered this approach, it was far easier and quicker to add new features and functions when market demands changed.

Source

A Comparison of VMware and Docker

Servers are expensive. And in single-application installations, most servers spend the majority of their time waiting. Making the most of these expensive assets led to virtualization, and making the most of virtualization has led to multiple options for virtualizing applications.

VMware and Docker offer competing methods for virtualizing applications. Both technologies work to make the most of limited hardware resources, but they do so in significantly different ways. This post will help you understand how they differ and how those differences affect which scenarios each is best suited for. In particular, we’ll take a brief look at how each works, what the differences mean for the application and the deploying team, and how those differences can have an impact on operations, security, and application performance.

This article is aimed at both IT operations and application development leaders who want to expand the options in their deployment toolkit. The information will help those leaders make more informed decisions and explain those decisions to colleagues and executives.

The Limits of Virtualization

VMware is a company with a wide variety of products, from those that virtualize a single application to those that manage entire data centers or clouds. In this article, we use “VMware” to refer to VMware vSphere, used to virtualize entire operating systems; many different operating systems, from various Linux distributions to Windows Server can be virtualized on a single physical server.

VMware is a type-1 hypervisor, meaning it sits between the virtualized operating system and the server hardware; a number of different operating systems can run on a single VMware installation, with OS-specific applications running on each OS instance.

Docker is a system for orchestrating, or managing, application containers. An application container virtualizes an application and the software libraries, services, and operating system components required to run it. All of the Docker containers in a deployment will run on a single operating system because they’ll share commonly used resources from that operating system. Sharing the resources means that the application container is much smaller than the full virtualized operating system created in VMware. That smaller software image in a container can typically be created much more quickly than the VMware operating system image — on the scale of seconds rather than minutes.

The key question for the deployment team is why virtualization is being considered. If the point of the shift is at the operating system level — to provide each user or user population with its own operating environment while requiring as few physical servers as possible — then VMware is the logical choice. If the focus is on the application, with the operating system hidden or irrelevant to the user, then Docker containers become a realistic option for deployment.

The Scale of Reuse

How much of each application do you want to reuse? The methods and scales of resource sharing are different for VMware and Docker containers: one reuses images of entire operating systems, while the other shares functions and resources from a single operating system image. Those differences can translate into very different storage and memory requirements for applications.

Each time VMware creates an instance of an operating system, it creates a full copy of that operating system. All of the components of the operating system, and any resources used by applications running within the instance, are used only within that particular instance; there is no sharing among running operating systems. This means that there can be incredible customization of the environment within each operating system, and applications can be run without concern about affecting (or being affected by) applications running in other virtual operating systems.

When a Docker container is created, it is a unique instance of the application with all of the libraries and code the application depends on included. While the application code is bundled within the container image, the application relies on — and is managed by — the host system’s kernel. This reduces the resources required to run containers and allows them to start very quickly.

Docker’s speed at creating new instances of an application makes it a solution commonly used in the development environment, where quickly launching, testing, and deleting copies of an application can make for much greater efficiencies. VMware’s ability to author a single “golden copy” of a fully patched and updated operating system and then use that image to create every new instance makes it popular in enterprise production deployments.
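
To get a feel for that speed, here is a small sketch using the Docker SDK for Python (the `docker` package) against a local Docker daemon; the image choice and timings are illustrative and will vary with your environment:

```python
import time
import docker  # Docker SDK for Python: pip install docker

client = docker.from_env()

# Start a throwaway nginx container and time how long it takes to come up.
start = time.time()
container = client.containers.run("nginx:alpine", detach=True)
container.reload()  # refresh the container's status from the daemon
print(f"container {container.short_id} is {container.status} "
      f"after {time.time() - start:.2f}s")

# Tear it down just as quickly, which suits launch-test-delete workflows.
container.stop()
container.remove()
```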

In both VMware and Docker containers, a “master copy” of the original environment is created and used to deploy multiple copies. The question for the operations team is whether the resource efficiency of Docker matches the needs of the application and the user base, or whether those needs require a unique copy of the operating system to be launched and deployed for each instance.

Automation as a Principle

While the processes of creating and tearing down operating system images can be automated, automation is baked into the very heart of Docker. Orchestration, as part of the DevOps toolbox, is a major differentiator for Docker containers versus VMware.

Docker is itself the orchestration mechanism for creating new application instances on demand and then shutting them down when the requirement ends. There are API integrations that allow Docker to be controlled by a number of different automation systems. And for large computing environments that use Docker containers, additional layers of automation and management have been developed. One well-known platform is Kubernetes, which was developed to manage clusters of Docker containers that may be spread across many different servers.

VMware has a wide variety of automation tools as well, but those tools are, when discussing the vSphere family of products, responsible for creating new instances of operating systems, not applications. This means that the time to create an entirely new operating system image must be considered when planning rapid-response cloud and virtual system application environments. VMware can certainly work to support those environments; it’s used in many commercial operations to do just that. But it requires additional applications or frameworks to automate and orchestrate the process, adding complexity and expense to the solution.

It’s important to note that both Docker containers and VMware can operate quite successfully without automation. When it comes to a commercial installation, though, each becomes much more powerful when the tasks of creating and deleting new operating system and application instances are controlled by software rather than human hands. From rapid response to increased user demand, to large-scale automated application testing, system automation is important. Knowing what’s required for that automation is critical when deciding between technologies.

Separation — or Not

If speed of deployment and execution or limitations on resource usage aren’t critical differentiators for your deployments, then hard separation between applications and instances might be. Just as orchestration is baked into Docker, separation is baked into VMware.

Each instance of an operating system virtualized under VMware is a complete operating system image running on hardware resources that are not shared logically with any other instance of the operating system. VMware partitions the hardware resources in ways that make each operating system instance believe that it’s the only OS running on the server.

This means that, barring a critical hypervisor vulnerability, there is no realistic way for an application running on one virtual server to reach across into another virtual server for data or resources. It also means that things can go awfully, terribly wrong in one virtual server and it can be shut down without endangering the operation of any of the other virtual servers running under VMware.

While proponents of Docker have spoken of similar separation being part of the container system’s architecture, recent vulnerability reports (such as CVE-2019-5736) indicate that Docker’s separation might not be as complete as operational IT specialists would hope.

Separation is not as high of a priority for Docker containers as it is for VMware operating system instances. Application containers will share resources; and where there is sharing, there are limits on separation.

Conclusion

There are significant differences between virtualization and deployment with VMware and with Docker, and each has its uses. Readers should now have a basic understanding of the nature and capabilities of each platform and of the factors that could make each preferable in a given situation.

Where speed of deployment and most effective use of limited resources are the highest priorities, Docker containers show a great deal of strength. In situations like development groups or the rapid iteration of a fully functioning DevOps environment, containers can be tremendously valuable.

If security and stability are critically important in your production environment, VMware offers both. For both Docker containers and VMware, multiple products are available to extend their functionality through automation, orchestration, and other functions.

You can find more information on deploying Docker in this blog post. The article presents both best practices and hands-on details for putting the platforms in the field, as well as information on how to include each within a DevOps methodology.

Source

A Reflection on the Kubernetes Market

Running a young and growing company in the Kubernetes space means travelling at high speed in an ever-changing market. We are heading into our fourth year of business, and around this time of year I like to step back from the noise and figure out some of the larger trends I’m seeing develop.

I am not a technologist by background, so my thoughts tend to be more commercial in nature. If you’re interested, I wrote a similar post last year.

1.) Kubernetes is Dead, Long Live Kubernetes

Kubernetes continues to sweep the market. In fact, it’s gathering pace as a buzzword. I’m regularly told by tech leaders that if they didn’t use Kubernetes, they wouldn’t find engineers to work with them!

Although this fervour has been a bemusing aspect of the rise of Kubernetes, among the early adopters I recognise a sense that Kubernetes itself is no longer the ‘thing’. Their attitude is now one of focusing on how they unlock the value of running services on Kubernetes. Not for PoCs or test services, but as a kernel for their whole technology platform.

As Kubernetes matures, it will inevitably be commoditised, but crucially it will increase in importance as we rely on it to run entire businesses.

Matt Bates speaking at Google NEXT 2018

2.) We’re Moving up the Stack

Jetstack started as a way for companies to access high quality expertise around Kubernetes and Docker.

Entering the market early has given us a wonderful opportunity to grow with the community. Our company reflects the maturity of the ecosystem, just as an open source project strengthens with users and contributions.

At last, we are able to have conversations about the more holistic benefits of cloud native. Smart teams no longer worry about Kubernetes as a decision they should or shouldn’t take; they can now think about maximising the value their users will get out of it. They are also starting to understand how Kubernetes and related technologies can help them to build a platform that enables them to innovate and compete. In a number of cases, this means entering new markets and creating new product lines.

It’s taken longer than expected, but the thought that we would get to a standardised ‘stack’ with Kubernetes as the foundation was always the hope. We’re now consistently seeing certain ‘de-facto’ technologies in conversations with customers (e.g. Prometheus, Calico), and others that are being mentioned regularly (e.g. Istio, Spinnaker).

The idea that we may be able to offer a more standardised set of products formed around Kubernetes was even something we were considering back in 2014 when naming the company – the Jet ‘Stack’.

However, the exact mix of products you will need in your company will likely always need refinement based on requirements and developments in an ever-evolving ecosystem. In 2019 we’ll introduce a formalised approach to this discovery, and a reference architecture to help navigate the complexity. Our goal is to accelerate cloud native adoption and success in the same way we do so for Kubernetes.

Kubernetes Operational Wargaming at NEXT 2018

3.) The Trough of Despondency

No matter how good Kubernetes is, confusion still abounds on how to deploy and operate it.

A look at the Cloud Native Landscape shows the difficulties involved in choosing a path, and this plays out with customers at all stages of the journey.

Questions abound:

  • Is EKS mature enough?
  • How do I secure it?
  • Do I consider a GitOps approach?
  • Is Azure a viable platform on which to run K8s?
  • Does Google have the enterprise knowledge to support me?
  • Is bare metal too complex to try?
  • Is consistent multi-cloud operation actually feasible?
  • Should I go kops or is kubeadm worth trying now?

With more vendors entering the market and marketing efforts ramping up, this confusion seems only to be increasing. In some instances, we’ve noticed elements of a backlash to Kubernetes’ complexity of software and ecosystem.

If Kubernetes continues to grow as a buzzword, so will its propensity to be sold as a ‘silver bullet’, and frustration will abound as enterprises realise just how much goes into a production-ready cluster.

Fortunately, amongst business technologists, these developments seem to be taken as an inevitable part of the wider adoption and maturing of the technology.

Most people I see frustrated by the short-term issue of even finding Kubernetes talent recognise the long-term value that Kubernetes can provide. One thing I often hear is that Kubernetes has the opportunity to match the breadth of success seen by Linux or virtual machines, and that those technologies followed similar adoption patterns early on.

The advice for now? Just keep going.

Jetstack ping pong championships, Dublin 2018

Source

Challenges and Solutions for Scaling Kubernetes in the Hybrid Cloud

Introduction

Let’s assume you’re in business online: you have your own datacenter and a private cloud running your website. You’ll have a number of servers deployed to run applications and store their data.

The overall website traffic in this scenario is pretty constant, yet there are times when you expect traffic growth. How do you handle all this growth in traffic?

The first thing that comes to mind is that you need to be able to scale some of your applications in order to cope with the traffic increase. As you don’t want to spend money on new hardware that you’ll use only a few times per year, you think of moving to a hybrid cloud setup.

This can be a real time and cost saver. Scaling (parts of) your application to public cloud will allow you to pay for only the resources you use, for the time you use them.

But how do you choose that public cloud, and can you choose more than one?

The short answer is yes, you’ll most likely need to choose more than one public cloud provider. Because you have different teams, working on different applications, having different requirements, one cloud provider may not fit all your needs. In addition, many organizations need to follow certain laws, regulations and policies which dictate that their data must physically reside in certain locations. A strategy of using more than one public cloud can help organizations meet those stringent and varied requirements. They can also select from multiple data center regions or availability zones, to be as close to their end users as possible, providing them optimal performance and minimal latency.

Challenges of scaling across multiple cloud providers

You’ve decided now upon the cloud(s) to use, so let’s go back and think about the initial problem. You have an application with a microservices architecture, running containers that need to be scaled. Here is where Kubernetes comes into play. Essentially, Kubernetes is a solution that helps you manage and orchestrate containerized applications in a cluster of nodes. Although Kubernetes will help you manage and scale deployments, nodes, and clusters, it won’t help you easily manage and scale them across cloud providers. More on that later.

A Kubernetes cluster is a set of machines (physical or virtual) used by Kubernetes to run applications. The essential Kubernetes concepts you need to understand for our purposes are:

  • Pods are units that control one or more containers, scheduled as one application. Typically you should create one Pod per application, so you can scale and control them separately.
  • Node components are worker machines in Kubernetes. A node may be a virtual machine (VM) or physical machine, depending on the cluster. Each node contains the services necessary to run pods and is managed by the master components.
  • Master components manage the lifecycle of a Pod. If a Pod dies, the controller creates a new one; if you scale Pods up or down, the controller creates or destroys Pods to match. You can find more on the controller types in the Kubernetes documentation.

The role of these three components is to scale and schedule containers. The master component dictates the scheduling and scaling commands. The nodes then orchestrate the pods accordingly.
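
To make these concepts concrete, here is a hedged sketch using the official Kubernetes Python client that defines a Deployment of five httpd Pods; the names and namespace are illustrative and a working kubeconfig for the target cluster is assumed:

```python
from kubernetes import client, config  # official Kubernetes Python client

config.load_kube_config()  # assumes a working kubeconfig for the target cluster

# One Pod template for the application, wrapped in a Deployment so the
# master components can scale it and replace Pods that die.
deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="httpd"),
    spec=client.V1DeploymentSpec(
        replicas=5,
        selector=client.V1LabelSelector(match_labels={"app": "httpd"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "httpd"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="httpd", image="httpd:2.4")]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```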

These are only the basics of Kubernetes; for a more detailed understanding, you can check our Intro to Kubernetes article.

There are a few key challenges that come to mind when trying to use Kubernetes to solve our scaling problem across multiple clouds:

  • Difficult to manage multiple clouds, multiple clusters, set users, set policies
  • Complexity of installation and configuration
  • Different experiences for users/teams depending on environment

Here’s where Rancher can help you. Rancher is an open source container manager used to run Kubernetes in production. Below are some features that Rancher provides that help us manage and scale our applications regardless of whether the compute resources are hosted on-prem or across multiple clouds:

  • common infrastructure management across multiple clusters and clouds
  • easy-to-use interface for Kubernetes configuration and deployment
  • easy to scale Pods and clusters with a few simple clicks
  • access control and user management (LDAP, AD)
  • workload, RBAC, policy and project management

Rancher becomes your single point of control for multiple clusters, running on multiple clouds, on pretty much any infrastructure that can run Kubernetes.

Let’s see now how we can use Rancher in order to manage more than one cluster, in two different regions.

Starting a Rancher 2.0 instance

To begin, start a Rancher 2.0 instance. There is a very intuitive getting started guide for this purpose here.

Hands-on with Rancher and Kubernetes

Let’s create two hosted Kubernetes clusters in GCP, in two different regions. For this you will need a service account key.

In the Global tab, we can see all the available clusters and their state. They start in the Provisioning state and, when ready, should turn to Active.

A number of Pods are already deployed to each node of your Kubernetes cluster. Those Pods are used by Kubernetes and Rancher’s internal systems.

Let’s proceed by deploying workloads to both clusters. In each cluster, select the Default project; this will open the Workloads tab. Click Deploy and set the Name and the Docker image to httpd for the first cluster and nginx for the second one. Since we want to expose our web servers to internet traffic, in the Port mapping area select a Layer-4 Load Balancer.

If you click on the nginx/httpd workload, you will see that Rancher actually created a Deployment, just as Kubernetes recommends, to manage the ReplicaSet. You will also see the Pods created by that ReplicaSet.

Scaling Pods and clusters

Our Rancher instance is managing two clusters:

  • us-east1b-cluster, running 5 httpd Pods
  • europe-west4-a cluster, running 1 nginx Pod

Let’s scale down the httpd Pods by clicking – under the Scale column. In no time, we see the number of Pods decrease.

To scale up Pods, click + under the Scale column. Once you do that, you should instantly see Pods being created and ReplicaSet scaling events. Try deleting one of the Pods using the right-hand menu of the Pod, and notice how the ReplicaSet recreates it to match the desired state.

So, we went from 5 httpd Pods to 2 in the first cluster, and from 1 nginx Pod to 7 in the second one. The second cluster now looks close to running out of resources.

From Rancher we can also scale the cluster by adding extra nodes. Let’s try that by editing the node count to 5.

While Rancher shows us “reconciling cluster,” Kubernetes behind the scenes is actually upgrading the cluster master and resizing the node pool.

Give this action some time and eventually you should see 5 nodes up and running.

Let’s check the Global tab, so we can have an overview of all the clusters Rancher is managing.

Now that new resources are available, we can add more Pods if we want; let’s scale up to 13.

Most importantly, all of these operations are performed with no downtime. While scaling Pods up or down or resizing the cluster, requests to the public IP of the httpd/nginx Deployments returned HTTP status code 200 the entire time.

Conclusion

Let’s recap our hands-on scaling exercise:

  • we created two clusters using Rancher
  • we deployed workloads: a Deployment with 1 Pod (nginx) and a Deployment with 5 Pods (httpd)
  • we scaled those two Deployments in and out
  • we resized the cluster

All of these actions were done with a few simple clicks from Rancher, making use of its friendly and intuitive UI. Of course, you can do all of this from the API as well. In either case, you have a single central point from which you can manage all your Kubernetes clusters, observe their state, or scale Deployments as needed. If you are looking for a tool to help you with infrastructure management and container orchestration across hybrid or multi-cloud, multi-region clusters, then Rancher might be the perfect fit for you.

Source

Deploying JFrog Artifactory with Rancher, Part One

JFrog Artifactory is a universal artifact repository that supports all major packaging formats, build tools and continuous integration (CI) servers. It holds all of your binary content in a single location and presents an interface that makes it easy to upload, find, and use binaries throughout the application development and delivery process.

In this article we’ll walk through using Rancher to deploy and manage JFrog Artifactory on a Kubernetes cluster. When you have finished reading this article, you will have a fully functional installation of JFrog Artifactory OSS, and you can use the same steps to install the OSS or commercial version of Artifactory in any other Kubernetes cluster. We’ll also show you how to create a generic repository in Artifactory and upload artifacts into it.

Artifactory has many more features besides the ones presented in this article, and a future article will explore those in greater detail.

Let’s get started!

Software Used

This article uses the following software:

  • Rancher v2.0.8
  • Kubernetes cluster running on Google Kubernetes Engine version 1.10.7-gke.2
  • Artifactory helm chart version 7.4.2
  • Artifactory OSS version 6.3.2

If you’re working through the article at a future date, please use the versions current for that time.

As with all things Kubernetes, there are multiple ways to install Artifactory. We’re going to use the Helm chart. Helm provides a way to package application installation instructions and share them with others. You can think of it as a package manager for Kubernetes. Rancher integrates with Helm via the Rancher Catalog, and through the Catalog you can deploy any Helm-backed application with only a few clicks. Rancher has other features, including:

  • an easy and intuitive web interface
  • the ability to manage Kubernetes clusters deployed anywhere, on-premise or with any provider
  • a single view into all managed clusters
  • out of the box monitoring of the clusters
  • workload, role-based access control (RBAC), policy and project management
  • all the power of Kubernetes without the need to install any software locally

Installing Rancher

NOTE: If you already have a Rancher v2 server and Kubernetes cluster installed, skip ahead to the section titled Installing JFrog Artifactory.

We’re proud of Rancher’s ability to manage Kubernetes clusters anywhere, so we’re going to launch a Rancher Server in standalone mode on a GCE instance and use it to deploy a Kubernetes cluster in GKE.

Spinning up a Rancher Server in standalone mode is easy – it’s a Docker container. Before we can launch the container, we’ll need a compute instance on which to run it. Let’s launch that with the following command:

gcloud compute --project=rancher-20 instances create rancher-instance \
  --zone=europe-west2-c \
  --machine-type=g1-small \
  --tags=http-server,https-server \
  --image=ubuntu-1804-bionic-v20180911 \
  --image-project=ubuntu-os-cloud

Please change the project and zone parameters as appropriate for your deployment.

After a couple of minutes you should see that your instance is ready to go.

Created [https://www.googleapis.com/compute/v1/projects/rancher-20/zones/europe-west2-c/instances/rancher-instance].
NAME ZONE MACHINE_TYPE INTERNAL_IP EXTERNAL_IP STATUS
rancher-instance europe-west2-c g1-small 10.154.0.2 35.242.185.165 RUNNING

Make a note of the EXTERNAL_IP address, as you will need it in a moment to connect to the Rancher Server.

With the compute node up and running, let’s use the GCE CLI to SSH into it.

gcloud compute ssh \
  --project "rancher-20" \
  --zone "europe-west2-c" \
  "rancher-instance"

Again, be sure that you adjust the project and zone parameters to reflect your instance if you launched it in a different zone or with a different name.

Once connected, run the following commands to install some prerequisites and then install Docker CE. Because the Rancher Server is a Docker container, we need Docker installed in order to continue with the installation.

sudo apt-get update
sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get -y install docker-ce

With that out of the way, we’re ready to deploy the Rancher Server. When we launch the container for the first time, the Docker Engine will fetch the container image from Docker Hub and store it locally before launching a container from it. Future launches of the container, should we need to relaunch it, will use the local image store and be much faster.

Use the next command to instruct Docker to launch the Rancher Server container and have it listen on port 80 and 443 on the host.

sudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher:v2.0.8

If nothing goes awry, Docker will print the download status and then the new container ID before returning you to a prompt.

Unable to find image ‘rancher/rancher:latest’ locally
latest: Pulling from rancher/rancher
124c757242f8: Pull complete
2ebc019eb4e2: Pull complete
dac0825f7ffb: Pull complete
82b0bb65d1bf: Pull complete
ef3b655c7f88: Pull complete
437f23e29d12: Pull complete
52931d58c1ce: Pull complete
b930be4ed025: Pull complete
4a2d2c2e821e: Pull complete
9137650edb29: Pull complete
f1660f8f83bf: Pull complete
a645405725ff: Pull complete
Digest: sha256:6d53d3414abfbae44fe43bad37e9da738f3a02e6c00a0cd0c17f7d9f2aee373a
Status: Downloaded newer image for rancher/rancher:latest
454aa51a6f0ed21cbe47dcbb20a1c6a5684c9ddb2a0682076237aef5e0fdb3a4

Congratulations! You’ve successfully launched a Rancher Server instance.

Use the EXTERNAL_IP address that you saved above and connect to that address in a browser. You’ll be asked to accept the self-signed certificate that Rancher installs by default. After this, you’ll be presented with the welcome screen. Set a password (and remember it!), and continue to the next page.

Welcome to Rancher

On this page you’re asked to set the URL for the Rancher Server. In a production deployment this would be a hostname like rancher.yourcompany.com, but if you’re following along with a demo server, you can use the EXTERNAL_IP address from above.

Rancher Server URL

When you click Save URL on this page, you’ll be taken to the Clusters page, and from there we’ll deploy our Kubernetes cluster.

Using Rancher to Deploy a GKE Cluster

Rancher can deploy and manage Kubernetes clusters anywhere. They can be in Google, Amazon, Azure, on cloud nodes, in datacenters, or even running in a VM on your laptop. It’s one of the most powerful features of the product. For today we’ll be using GKE, so after clicking on Add Cluster, choose Google Container Engine as your provider.

Set the name to something appropriate for this demo, like jfrog-artifactory.

In order to create the cluster, Rancher needs permission to access the Google Cloud Platform. Those permissions are granted via a Service Account private key JSON file. To generate that, first find the service account name (replace the project name with yours if necessary):

gcloud iam service-accounts list --project rancher-20

NAME EMAIL
Compute Engine default service account <SA>-compute@developer.gserviceaccount.com

The output will have a service account number in place of <SA>. Copy this entire address and use it in the following command:

gcloud iam service-accounts keys create ./key.json \
  --iam-account <SA>-compute@developer.gserviceaccount.com

This will create a file named key.json in the current directory. This is the Service Account private key that Rancher needs to create the cluster:

Add Cluster Rancher

You can either paste the contents of that file into the text box, or you can click Read from a file and point it to the key.json file. Rancher will use this info to generate a page wherein you can configure your new cluster:

Add Cluster Rancher second step

Set your preferred Zone, Machine Type, Node Count and Root Disk Size. The values presented in the above screenshot are sane defaults that you can use for this demo.

When you click Create, the cluster will be provisioned in GKE, and when it’s ready, you’ll see it become active in the UI:

Rancher cluster view

Installing JFrog Artifactory

We’ll install Artifactory by using the Helm chart repository from JFrog. Helm charts, like OS package management systems, give you a stable way to deploy container applications into Kubernetes, upgrade them, or roll them back. The chart guarantees that you’re installing a specific version or tag for the container, and where applications have multiple components, a Helm chart assures that you’re getting the right version for all of them.
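
As a rough illustration of how a chart pins versions, the metadata file at the root of a Helm chart declares both the chart version and the application version it installs. The sketch below mirrors the chart and Artifactory versions used in this article, but it is not the actual file shipped by JFrog.

# Chart.yaml (illustrative sketch, not JFrog's actual chart file)
name: artifactory
version: 7.4.2            # version of the chart itself
appVersion: 6.3.2         # version of Artifactory the chart installs
description: Universal artifact repository manager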

Installing the JFrog Helm Repository

Rancher ships with a library of Helm charts in its Application Catalog, but in keeping with the Rancher objective of user flexibility, you can install any third-party Helm repository to have those applications available for deployment in your cluster. We’ll use this today by installing the JFrog repository.

In the Global Cluster view of Rancher click on Catalogs and then click on Add Catalog. In the window that opens, enter a name that makes sense, like jfrog-artifactory and then enter the location of the official JFrog repository.

Rancher add catalog

Click on Create, and the JFrog repository will appear in the list of custom catalogs.

Rancher list of Catalogs

Deploying Artifactory

We’re ready to deploy Artifactory. From the Global view, select the Default project under the jfrog-artifactory cluster:

Rancher default project

Once you are inside of the Default project, select Catalog Apps, and then click on Launch. Rancher will show you the apps available for installation from the Application Catalogs. You’ll notice that artifactory-ha shows up twice, once as a partner-provided chart within the default Library of apps that ship with Rancher, and again from the JFrog repository itself. We installed the Helm repository because we want to install the regular, non-HA Artifactory, which is just called artifactory. All catalog apps indicate which library they come from, so in a situation where a chart is present in multiple libraries, you can still choose which to install.

Rancher select app from Catalog

When you select View Details, you have the opportunity to change items about how the application is installed. By default this catalog item will deploy the licensed, commercial version of Artifactory, for which you need a license. If you have a license, then you can leave the default options as they are; however, because we want to install the OSS version, we’re going to change the image that the chart installs.

We do this under the Configuration Options pane, by selecting Add Answer. Set a variable name of artifactory.image.repository and a value of docker.bintray.io/jfrog/artifactory-oss.
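
For reference, this dotted answer maps to a nested Helm value. A sketch of the equivalent values override is below; only the keys implied by the answer are shown, not the chart’s full values file.

# Equivalent Helm values override (sketch)
artifactory:
  image:
    repository: docker.bintray.io/jfrog/artifactory-oss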

Catalog app set Answer

Now, when you click Launch, Rancher will deploy Artifactory into your cluster.

Rancher Deploying Artifactory

When the install completes, the red line will change to green. After this happens, if you click on artifactory, it will present you with the resources that Rancher created for you. In this case, it created three workloads, three services, one volume and one secret in Kubernetes.

If you select Workloads, you will see all of them running:

Rancher Artifactory workloads

Resolving a Pending Ingress

At the time of this article’s publication, there is a bug that results in the Ingress being stuck in a Pending state. If you see this when you click on Load Balancing, continue reading for the solution.

Rancher Pending LoadBalancer

To resolve the pending Ingress, we need to create the Service to which the Ingress is sending traffic. Click Import YAML in the top right, and in the window that opens, paste the following information and then click Import.

apiVersion: v1
kind: Service
metadata:
  labels:
    app: artifactory
    chart: artifactory-7.4.2
    component: nginx
    heritage: Tiller
    io.cattle.field/appId: artifactory
    release: artifactory
  name: artifactory-artifactory-nginx
  namespace: artifactory
spec:
  externalTrafficPolicy: Local
  ports:
  - name: nginxhttp
    port: 80
    protocol: TCP
    targetPort: 80
  - name: artifactoryhttps
    port: 443
    protocol: TCP
    targetPort: 443
  selector:
    app: artifactory
    component: nginx
    release: artifactory
  sessionAffinity: None
  type: LoadBalancer

Rancher import YAML

Accessing Artifactory

The Workloads pane will now show clickable links for ports 443/tcp and 80/tcp under the artifactory-artifactory-nginx workload:

Workloads clickable ports

When you select 443/tcp, it will open the Artifactory UI in a new browser tab. Because it’s using a self-signed certificate by default, your browser may give you a warning and ask you to accept the certificate before proceeding.

Welcome to JFrog Artifactory

Taking Artifactory for a Spin

You now have a fully-functional binary artifact repository available for use. That was easy! Before you can start using it, it needs a tiny bit of configuration.

First, set an admin password in the wizard. When it asks you about the proxy server, select Skip unless you’ve deployed this in a place that needs proxy configuration. Create a generic repository, and select Finish.

Now, let’s do a quick walkthrough of some basic usage.

First, we’ll upload the helm chart that you used to create the Artifactory installation.

Select Artifacts from the left-side menu. You will see the generic repository that you created above. Choose it, and then from the upper right corner, select Deploy. Upload the Helm chart zipfile (or any other file) to the repository.

Deploy file to Artifactory

After the deploy finishes, you will see it in the tree under the repository.

Artifactory Repository Browser

Although this is a simple test of Artifactory, it demonstrates that it can already be used to its full capacity.

You’re all set to use Artifactory for binary artifact storage and distribution and Rancher for easy management of the workloads, the cluster, and everything related to the deployment itself.

Cleanup

If you’ve gone through this article as a demo, you can delete the Kubernetes cluster from the Global Cluster view within Rancher. This will remove it from GKE. After doing so, you can delete the Rancher Server instance directly from GCE.

Closing

JFrog Artifactory is extremely powerful. More organizations use it every day, and being able to deploy it quickly and securely into a Kubernetes cluster is useful knowledge.

According to their own literature, Artifactory empowers you to “release fast or die.” Similarly, Rancher allows you to deploy fast while keeping control of the resources and the security around them. You can build, deploy, tear down, secure, monitor, and interact with Kubernetes clusters anywhere in the world, all from a single, convenient, secure interface.

It doesn’t get much easier than that.

Source

Raw Block Volume support to Beta

Kubernetes v1.13 moves raw block volume support to beta. This feature allows persistent volumes to be exposed inside containers as a block device instead of as a mounted file system.

What are block devices?

Block devices enable random access to data in fixed-size blocks. Hard drives, SSDs, and CD-ROM drives are all examples of block devices.

Typically, persistent storage is implemented in a layered manner, with a file system (like ext4) on top of a block device (like a spinning disk or SSD). Applications then read and write files instead of operating on blocks. The operating system takes care of reading and writing files, using the specified filesystem, to the underlying device as blocks.

It’s worth noting that while whole disks are block devices, so are disk partitions, and so are LUNs from a storage area network (SAN) device.

Why add raw block volumes to Kubernetes?

There are some specialized applications that require direct access to a block device because, for example, the file system layer introduces unneeded overhead. The most common case is databases, which prefer to organize their data directly on the underlying storage. Raw block devices are also commonly used by any software which itself implements some kind of storage service (software defined storage systems).

From a programmer’s perspective, a block device is a very large array of bytes, usually with some minimum granularity for reads and writes, often 512 bytes, but frequently 4K or larger.

As it becomes more common to run database software and storage infrastructure software inside of Kubernetes, the need for raw block device support in Kubernetes becomes more important.

Which volume plugins support raw blocks?

As of the publishing of this blog, the following in-tree volume types support raw blocks:

  • AWS EBS
  • Azure Disk
  • Cinder
  • Fibre Channel
  • GCE PD
  • iSCSI
  • Local volumes
  • RBD (Ceph)
  • vSphere

Out-of-tree CSI volume drivers may also support raw block volumes. Kubernetes CSI support for raw block volumes is currently alpha. See documentation here.

Kubernetes raw block volume API

Raw block volumes share a lot in common with ordinary volumes. Both are requested by creating PersistentVolumeClaim objects which bind to PersistentVolume objects, and are attached to Pods in Kubernetes by including them in the volumes array of the PodSpec.

There are two important differences, however. First, to request a raw block PersistentVolumeClaim, you must set volumeMode = "Block" in the PersistentVolumeClaimSpec. Leaving volumeMode blank is the same as specifying volumeMode = "Filesystem", which results in the traditional behavior. PersistentVolumes also have a volumeMode field in their PersistentVolumeSpec; "Block" type PVCs can only bind to "Block" type PVs, and "Filesystem" PVCs can only bind to "Filesystem" PVs.

Secondly, when using a raw block volume in your Pods, you must specify a VolumeDevice in the Container portion of the PodSpec rather than a VolumeMount. VolumeDevices have devicePaths instead of mountPaths, and inside the container, applications will see a device at that path instead of a mounted file system.

Applications open, read, and write to the device node inside the container just like they would interact with any block device on a system in a non-containerized or virtualized context.
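
Before looking at the PVC and Pod examples below, it can help to see the PersistentVolume side as well. The following is a minimal sketch of a pre-provisioned Block-mode PV using the local volume plugin; the names, device path, node name, capacity, and storage class are illustrative assumptions and are independent of the examples that follow.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-block-pv                # illustrative name
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  volumeMode: Block                # must match the PVC's volumeMode to bind
  storageClassName: local-block-sc # illustrative storage class
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /dev/sdb                 # the raw device on the node (assumption)
  nodeAffinity:                    # required for local volumes
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - my-node                # illustrative node name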

Creating a new raw block PVC

First, ensure that the provisioner associated with the storage class you choose is one that supports raw blocks. Then create the PVC.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteMany
  volumeMode: Block
  storageClassName: my-sc
  resources:
    requests:
      storage: 1Gi

Using a raw block PVC

When you use the PVC in a pod definition, you get to choose the device path for the block device rather than the mount path for the file system.

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: busybox
    command:
    - sleep
    - "3600"
    volumeDevices:
    - devicePath: /dev/block
      name: my-volume
    imagePullPolicy: IfNotPresent
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-pvc

As a storage vendor, how do I add support for raw block devices to my CSI plugin?

Raw block support for CSI plugins is still alpha, but support can be added today. The CSI specification details how to handle requests for volumes that have the BlockVolume capability instead of the MountVolume capability. CSI plugins can support both kinds of volumes, or one or the other. For more details see documentation here.

Issues/gotchas

Because block devices are actually devices, it’s possible to do low-level actions on them from inside containers that wouldn’t be possible with file system volumes. For example, block devices that are actually SCSI disks support sending SCSI commands to the device using Linux ioctls.

By default, though, Linux won’t allow containers to send SCSI commands to disks. To do so, you must grant the SYS_RAWIO capability in the container’s security context. See documentation here.
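
A minimal sketch of what that looks like in a Pod spec is below, reusing the illustrative names from the earlier example; the capability is added in the container’s securityContext.

apiVersion: v1
kind: Pod
metadata:
  name: my-scsi-pod              # illustrative name
spec:
  containers:
  - name: my-container
    image: busybox
    command:
    - sleep
    - "3600"
    securityContext:
      capabilities:
        add:
        - SYS_RAWIO              # allows sending SCSI ioctls to the raw device
    volumeDevices:
    - devicePath: /dev/block
      name: my-volume
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-pvc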

Also, while Kubernetes is guaranteed to deliver a block device to the container, there’s no guarantee that it’s actually a SCSI disk or any other kind of disk for that matter. Users must either ensure that the desired disk type is used with their pods, or only deploy applications that can handle a variety of block device types.

How can I learn more?

Check out additional documentation on the raw block volume feature here: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#raw-block-volume-support

How do I get involved?

Join the Kubernetes storage SIG and the CSI community and help us add more great features and improve existing ones like raw block storage!

https://github.com/kubernetes/community/tree/master/sig-storage
https://github.com/container-storage-interface/community/blob/master/README.md

Special thanks to all the contributors who helped add block volume support to Kubernetes.

Source

Using JFrog Artifactory as Docker Image Repository

This article is a continuation of Deploying JFrog Artifactory with Rancher. In this chapter we’ll demonstrate how to use JFrog Artifactory as a private repository for your own Docker images.

NOTE: This feature of JFrog Artifactory requires a license, but you can get a 30-day trial and use it to follow along.

Prepare GCP for the Deployment

If you plan to use Artifactory as a repository for Docker outside of your local network, you’ll need a public IP address. In the first part of this article we deployed our cluster into Google Cloud, and we’ll continue to use GCP resources now.

You can reserve a public IP by running the following command in the Google Cloud Shell or in your local environment via the gcloud command:

gcloud compute addresses create artifactory-demo --global

Use the name you chose (artifactory-demo in our case) to retrieve the address:

gcloud compute addresses describe artifactory-demo --global

Look for the address label in the output; in our case, the address is 35.190.61.62.

We’ll use the magical xip.io service from Basecamp to assign a fully-qualified domain name to our service, which in our case will be 35.190.61.62.xip.io.

Deploy Artifactory

You can follow the steps in the previous chapter to deploy Rancher and Artifactory, but when you reach the part about configuring the variables in the app deployment page, add or change the following variables:

ingress.enabled=true
ingress.hosts[0]=35.190.61.62.xip.io
artifactory.service.type=NodePort
nginx.enabled=false
ingress.annotations."kubernetes.io/ingress.global-static-ip-name"=artifactory-demo

(You can copy/paste that block of text into a field, and Rancher will convert it for you.)

When all is done, it should look like the image below:

Artifactory Docker Registry Configs

Click Launch to begin deploying the resources.

An Explanation of the Variables

While your new Artifactory instance spins up, let’s look at what we just configured.

ingress.enabled=true

This enables the creation of an ingress resource, which will serve as a proxy for Artifactory. In our case the Ingress will be a load balancer within GCP.

ingress.hosts[0]=35.190.61.62.xip.io

This sets the hostname for Artifactory. Part of the magic of xip.io is that we can create any subdomain and have it resolve back to the IP, so when we use docker-demo.35.190.61.62.xip.io later in this walkthrough, it will resolve to 35.190.61.62.

artifactory.service.type=NodePort

This exposes Artifactory’s service via a random port on the Kubernetes node. The Ingress resource will send traffic to this port.

nginx.enabled=false

Because we’re using the Ingress resource to talk to Artifactory via the Service resource, we want to disable the nginx proxy that Artifactory would otherwise start.

ingress.annotations…

This is the glue that ties Kubernetes to the static public IP address. We set this to the name of the address that you reserved so that the Ingress finds and uses the correct IP. We had to quote a large part of the key because the annotation name itself contains dots; without quoting it, those dots would be interpreted as separators for nested keys rather than as part of the annotation name.
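
Putting the pieces together, the Ingress that ends up in the cluster looks roughly like the sketch below. The annotation and host come from the values we set; the resource name, backend service name, and service port are assumptions for illustration and will differ in your deployment.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: artifactory-artifactory          # illustrative name
  namespace: artifactory
  annotations:
    kubernetes.io/ingress.global-static-ip-name: artifactory-demo
spec:
  rules:
  - host: 35.190.61.62.xip.io
    http:
      paths:
      - backend:
          serviceName: artifactory-artifactory   # assumed backend Service
          servicePort: 8081                      # assumed Artifactory service port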

Review the Deployment

Once the deployment completes, look at the Workloads tab. There you will see two workloads. One is the application (artifactory-artifactory), and the other is the PostgreSQL database that Artifactory uses (artifactory-postgresql).

Artifactory Workloads

Look at the Load Balancing tab next. There you will see the Ingress object with the hostname that we provided.

Load Balancing Ingress

If you select View/Edit YAML and scroll to the bottom, you will see the annotation that points to the address name in GCP (line 10 in the image):

Load Balancer View YAML

At the bottom of the Ingress definition you will also see that the hostname in spec.rules.host matches the IP address from status.loadBalancer.ingress.ip at the bottom.

Configure Artifactory

When you close the View/Edit YAML window, you’ll return to the Load Balancing tab. There you’ll find a link with the xip.io address. Click it to open Artifactory, or just enter the hostname into your browser.

Click through the wizard, first adding your license key and then setting an admin password. Click through the rest until the wizard completes.

In the menu on the left side, select Admin, and then under Repositories select Local.

Artifactory Admin Local Repo

There you will see the default repository created by the setup wizard. Select + New from the upper right corner to create a new repository. Choose Docker as the package type and enter a name for the repository. In our case we chose docker-demo. Leave everything else at the defaults and select Save & Finish to create the new repository.

Artifactory create docker registry

The name that you chose (docker-demo for us) becomes the subdomain for your xip.io domain. For our installation, we’ll be using docker-demo.35.190.61.62.xip.io. Yours will of course be different, but it will follow the same format.

Test the Registry

What fun is it to have a private Docker repository if you don’t use it?

For a production deployment you would secure the registry with an SSL certificate, and that would require a real hostname on a real domain. For this walkthrough, though, you can use the newly-created registry by telling Docker that it’s an insecure registry.

Create or edit daemon.json according to the documentation, adding your host like we do in the following example:

{
  "insecure-registries": ["docker-demo.35.190.61.62.xip.io:80"]
}

If you use Docker for Mac or Docker for Windows, set this in the preferences for the application:

Docker for Mac Insecure Registry

Restart Docker for it to pick up the changes, and after it restarts, you can use the registry:

docker login docker-demo.35.190.61.62.xip.io
Username: admin
Password:
Login Succeeded

To continue with the test, we’ll pull a public container image, re-tag it, and then push it to the new private registry.

Pull a Public Container Image

$ docker pull nginx
Using default tag: latest
latest: Pulling from library/nginx
f17d81b4b692: Pull complete
d5c237920c39: Pull complete
a381f92f36de: Pull complete
Digest: sha256:b73f527d86e3461fd652f62cf47e7b375196063bbbd503e853af5be16597cb2e
Status: Downloaded newer image for nginx:latest

Re-tag the Image

You can see the current image id and information on your system:

$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nginx latest dbfc48660aeb 2 days ago 109MB

Re-tag it so that Docker knows to push it to your new private registry:

$ docker tag nginx docker-demo.35.190.61.62.xip.io:80/nginx:latest

Push the Image to the Private Registry

With the image re-tagged, use docker push to send it to your private registry:

$ docker push docker-demo.35.190.61.62.xip.io:80/nginx:latest
The push refers to repository [docker-demo.35.190.61.62.xip.io:80/nginx]
86df2a1b653b: Pushed
bc5b41ec0cfa: Pushed
237472299760: Pushed
latest: digest: sha256:d98b66402922eccdbee49ef093edb2d2c5001637bd291ae0a8cd21bb4c36bebe size: 948

Verify the Push in Artifactory

Back in the Artifactory UI, select Artifacts from the menu.

Artifactory Artifacts

There you’ll see your nginx image and information about it.

Artifactory Pushed Image

Next Steps

If you have an Artifactory License and want to run a private registry, repeat this walkthrough using your own domain and an SSL certificate on the ingress. With those additional items complete, you’ll be able to use the private registry with any Docker or Kubernetes installation without having to tell the host that it has permission to talk to an insecure registry.

Cleanup

To clean up the resources that we used in this article, delete the Kubernetes cluster from Rancher and then delete the Rancher server from GCP:

gcloud compute --project=rancher-20 instances delete rancher-instance \
  --zone=europe-west2-c

You’ll also need to delete the public IP address reservation:

gcloud compute addresses delete artifactory-demo --global

Closing

JFrog Artifactory provides services that are at the core of a development lifecycle. You can store and retrieve almost any type of artifact that your development teams produce, and having these artifacts stored in a central, managed location makes Artifactory an important part of any IT infrastructure.

Rancher makes it easy to deploy Artifactory into a Kubernetes installation. In only a few minutes we had Artifactory up and running, and it actually took longer to configure Artifactory itself than it did to install it!

Rancher makes Kubernetes easy. Artifactory makes managing binary resources easy. Together they free you to focus on the things that matter for your business, and that freedom is what matters most.

Source

What is CI/CD?

The demands of modern software development combined with complexities of deploying to varied infrastructure can make creating applications a tedious process. As applications grow in size and scope, and development teams become more distributed and diverse, the overall process required to produce and release software quickly and consistently becomes more difficult.

To address these issues, teams began exploring new strategies to automate their build, test, and release processes to help deploy new changes to production faster. This led to the development of continuous integration and continuous delivery.

In this guide we will explain what CI/CD is and how it helps teams produce well-tested, reliable software at a faster pace. Before exploring CI/CD and its benefits in depth, we should discuss some prerequisite technologies and practices that these systems build off of.

Automated Build Processes

In software development, the build process converts code that developers produce into usable pieces of software that can be executed. For compiled languages like Go or C, this stage involves running the source code through a compiler to produce a standalone binary file. For interpreted languages like Python or PHP, there is no compilation step, but the code may still need to be frozen at a specific point in time, bundled with dependencies, and packaged for easier distribution. These processes result in an artifact that is often called a “build” or “release”.

While developers can create builds manually, this has a number of disadvantages. The shift from active development to creating a build introduces a context switch, forcing individuals to halt more productive work and focus on the build process. Furthermore, because each developer is producing artifacts on their own, inconsistencies are also likely to arise.

To address these concerns, many teams configure automated build pipelines. These systems monitor source code repositories and automatically kick off a preconfigured build process when changes are detected. This limits the amount of human involvement and ensures that a consistent process is followed on each build.

There are many build tools designed to help you automate these steps. For example, within the Java ecosystem, the following tools are popular:

  • Ant: Apache’s Ant is an open source Java library. Created in 2000, Ant is the original build tool in the Java space and is still frequently used today.
  • Maven: Apache’s Maven is a build automation tool written primarily with Java projects in mind. Unlike Apache Ant, Maven follows the philosophy of convention over configuration, requiring configuration only for the aspects of the build process that deviate from reasonable defaults.
  • Gradle: Reaching version 1.0 in 2012, Gradle tries to incorporate the strengths of both Ant and Maven by incorporating Maven’s modern features without losing the flexibility provided by Ant. Build instructions are written in a dynamic language called Groovy. Despite being a newer tool in this space, it’s seen widespread adoption.

Version Control

Most modern software development requires frequent collaboration within a shared codebase. Version control systems (VCS) are employed to help maintain project history, allow work on discrete features in parallel, and resolve conflicting changes. The VCS allows projects to easily adopt changes and to roll back in case of problems. Developers can work on projects on their local machines and use the VCS to manage the different branches of development.

Every change recorded in a VCS is called a commit. Each commit catalogs the changes to the codebase and includes metadata like a description that can be helpful when reviewing the commit history or merging updates.


Fig. 1: Diagram of distributed version control

While version control is a valuable tool to help manage many different changes within a single codebase, distributed development often introduces challenges. Developing in independent branches of the codebase without regularly merging into a shared integration branch can make it difficult to incorporate changes later on. To avoid this, developers started adopting a practice called continuous integration.

Continuous Integration (CI)

Continuous Integration (CI) is a process that allows developers to integrate work into a shared branch often, enhancing collaborative development. Frequent integration helps dissolve silos, reducing the size of each commit to lower the chance of merge conflicts.

A robust ecosystem of tools has been developed to encourage CI practices. These systems integrate with VCS repositories to automatically run build scripts and test suites when new changes are detected. Integration tests ensure that different components function together as a group, allowing teams to catch compatibility bugs early. Continuous integration produces builds that are thoroughly tested and reliable.


Fig. 2: Diagram of a continuous integration process

Continuous Delivery and Continuous Deployment (CD)

Continuous delivery and continuous deployment are two strategies that build off of the foundation that continuous integration provides. Continuous delivery extends the continuous integration process by deploying builds that pass the integration test suite to a pre-production environment. This makes it straightforward to evaluate each build in a production-like environment so that developers can easily validate bug fixes or test new features without additional work. Once deployed to the staging area, additional manual and automated testing is possible.

Continuous deployment takes this approach one step further. Once a build passes automated tests in a staging environment, a continuous deployment system can automatically deploy the build to production servers. In other words, every “green build” is live and available to customers for early feedback. This enables teams to release new features and bug fixes instantly, backed by the guarantees provided by their testing processes.
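
As a concrete illustration, a declarative pipeline for one of these systems might look like the sketch below. GitLab CI syntax is used here only as an example; the job names, scripts, and branch name are assumptions rather than a prescribed setup.

# .gitlab-ci.yml (illustrative sketch)
stages:
  - build
  - test
  - deploy

build_job:
  stage: build
  script:
    - ./gradlew assemble          # produce the build artifact

test_job:
  stage: test
  script:
    - ./gradlew test              # run the automated test suite on every commit

deploy_staging:
  stage: deploy
  script:
    - ./deploy.sh staging         # continuous delivery: every green build reaches staging
  environment: staging
  only:
    - master

deploy_production:
  stage: deploy
  script:
    - ./deploy.sh production
  environment: production
  when: manual                    # drop this manual gate for full continuous deployment
  only:
    - master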


Fig. 3: Diagram of a typical CI/CD development flow

Advantages of CI and CD

Continuous integration, delivery, and deployment provide some clear improvements to the software development process. Some of the primary benefits are outlined below.

Fast Feedback Loop

A fast feedback loop is essential to implementing a rapid development cycle. To receive timely feedback, it is essential that software reaches the end user quickly. When properly implemented, CI/CD provides a platform to achieve this goal by making it simple to update production deployments. By requiring each change to go through rigorous testing, CI helps reduce the risks associated with each build and consequently allows teams to release valuable features to customers quickly and easily.

Increased Visibility

CI/CD is usually implemented as a pipeline of sequential steps that is visible to the entire team. As a result, each team member can track the state of a build in the system and identify the build responsible for any test failures. By providing insight into the current state of the codebase, the pipeline makes it easier to plan the best course of action. This level of transparency offers a clear answer to the question, “did my commit break the build?”

Simplified Troubleshooting

Since the goal of CI is to integrate and test every change made to the codebase, it is safer to make small commits and merge them into the shared code repository early. As a result, when a bug is found, it is easier to identify the change that introduced the problem. Afterwards, depending on the magnitude of the issue, the team can choose to either roll back the change or write and commit a fix, decreasing the overall time to resolution in production.

Higher Quality Software

Automating the build and deployment processes not only shortens the development cycle. It also helps teams produce higher quality software. By ensuring that each change is well-tested and deployed to at least one pre-production environment, teams can push changes to production with confidence. This is possible only when there is good test coverage of all levels of the codebase, from unit tests to more complex system tests.

Fewer Integration Issues

Because the automated test suite runs on the builds automatically produced with every commit, it is possible to catch and fix most integration issues early. This gives developers early insight into other work currently being done that might affect their code. It tests that code written by different contributors works together from the earliest possible moment instead of later when there may be additional side effects.

More Time Focused on Development

CI/CD systems rely on automation to produce builds and move new changes through the pipeline. Because manual intervention is not required, building and testing no longer require dedicated time from the development team. Instead, developers can concentrate on making productive changes to the codebase, confident that the automated systems will notify them of any problems.

Continuous Integration and Delivery Best Practices

Now that we’ve seen some of the benefits of using CI/CD, we can discuss some guidelines to help you implement these processes successfully.

Take Responsibility for the CI/CD Pipeline

Developers are responsible for the commits they make until the changes are deployed to pre-production. This means that the developer must ensure that their code is integrated properly and can be deployed at all times. If a change is committed that breaks these requirements, it is that developer’s duty to commit a fix rapidly to avoid impacting other people’s work. Build failures should halt the pipeline and block commits not involved in fixing the failure, making it essential to address the build problems quickly.

Ensure Consistent Deployments

The deployment process should not be manual. Instead, a pipeline should automate the deployment process to ensure consistency and repeatability. This reduces the chances of pushing broken builds to production and helps avoid one-off, untested configurations that are difficult to reproduce.

Commit the Codebase to Version Control

It is important that every change is committed to version control. This helps the team audit all proposed changes and lets the team revert problematic commits easily. It can also help preserve the integrity of configuration, scripts, databases, and documentation. Without version control, it is easy to lose or mishandle configuration and code changes, especially when multiple people are contributing to the same codebase.

Make Small, Incremental Changes

A crucial point to keep in mind is that the changes should be small. Waiting to introduce changes in larger batches delays feedback from testing and makes it more difficult to identify the root cause of problems.

Good Test Coverage

Since the intent of CI/CD is to reduce manual testing, there should be good automated test coverage throughout the codebase to ensure that the software is functioning as intended. Additionally, it is important to regularly clean up redundant or out-of-date tests to avoid bogging down the pipeline.

The ratio of different types of tests in the test suite should reflect the “testing pyramid” model. The majority of the tests should be unit tests since they ensure basic functionality and are quick to execute. A smaller number of integration tests should follow to guarantee that components can operate together successfully. Finally, a small number of regression, UI, system, and end-to-end tests should be included towards the end of the testing cycle to ensure that the build meets all of the behavioral requirements of the project. Tools like JaCoCo for Java projects can determine how much of the codebase is covered by the testing suite.


Fig. 4: Diagram of test pyramid

What’s Next?

There are many different continuous integration and delivery tools available. Some examples include Jenkins, Travis CI, GoCD, CircleCI, Gitlab CI, Codeship, and TeamCity.

Source