What metrics matter: A guide for open source projects

“Without data, you’re just a person with an opinion.”

Those are the words of W. Edwards Deming, the champion of statistical process control, who was credited as one of the inspirations for what became known as the Japanese post-war economic miracle of 1950 to 1960. Ironically, Japanese manufacturers like Toyota were far more receptive to Deming’s ideas than General Motors and Ford were.

Community management is certainly an art. It’s about mentoring. It’s about having difficult conversations with people who are hurting the community. It’s about negotiation and compromise. It’s about interacting with other communities. It’s about making connections. In the words of Red Hat’s Diane Mueller, it’s about “nurturing conversations.”

However, it’s also about metrics and data.

Some metrics have much in common with those for software development projects more broadly. Others are more specific to the management of the community itself. I think of deciding what to measure, and how, as adhering to five principles.

1. Recognize that behaviors aren’t independent of the measurements you choose to highlight.

In 2008, Dan Ariely published Predictably Irrational, one of a number of books written around that time that introduced behavioral psychology and behavioral economics to the general public. One memorable quote from that book is the following: “Human beings adjust behavior based on the metrics they’re held against. Anything you measure will impel a person to optimize his score on that metric. What you measure is what you’ll get. Period.”

This shouldn’t be surprising. It’s a finding that’s been repeatedly confirmed by research. It should also be familiar to just about anyone with business experience. It’s certainly not news to anyone in sales management, for example. Base sales reps’ (or their managers’) bonuses solely on revenue, and they’ll try to discount whatever it takes to maximize revenue, even if it puts margin in the toilet. Conversely, want the sales force to push a new product line—which will probably take extra effort—but skip the spiffs? Probably not happening.

And lest you think I’m unfairly picking on sales, this behavior is pervasive, all the way up to the CEO, as Ariely describes in a 2010 Harvard Business Review article: “CEOs care about stock value because that’s how we measure them. If we want to change what they care about, we should change what we measure.”

Developers and other community members are not immune.

2. You need to choose relevant metrics.

There’s a lot of folk wisdom floating around about what’s relevant and important that’s not necessarily true. My colleague Dave Neary offers an example from baseball: “In the late ’90s, the key measurements that were used to measure batter skill were RBI (runs batted in) and batting average (how often a player got on base with a hit, divided by the number of at-bats). The Oakland A’s were the first major league team to recruit based on a different measurement of player performance: on-base percentage. This measures how often they get to first base, regardless of how it happens.”

Indeed, the whole revolution of sabermetrics in baseball and elsewhere, which was popularized in Michael Lewis’ Moneyball, often gets talked about in terms of introducing data in a field that historically was more about gut feel and personal experience. But it was also about taking a game that had actually always been fairly numbers-obsessed and coming up with new metrics based on mostly existing data to better measure player value. (The data revolution going on in sports today is more about collecting much more data through video and other means than was previously available.)

3. Quantity may not lead to quality.

As a corollary, collecting lots of tangential but easy-to-capture data isn’t better than just selecting a few measurements you’ve determined are genuinely useful. In a world where online behavior can be tracked with great granularity and displayed in colorful dashboards, it’s tempting to be distracted by sheer data volume, even when it doesn’t deliver any great insight into community health and trajectory.

This may seem like an obvious point: Why measure something that isn’t relevant? In practice, metrics often get chosen because they’re easy to measure, not because they’re particularly useful. They tend to be more about inputs than outputs: The number of developers. The number of forum posts. The number of commits. Collectively, measures like this often get called vanity metrics. They’re ubiquitous, but most people involved with community management don’t think much of them.

The number of downloads may be the worst of the bunch. It’s true that, at some level, downloads are an indication of interest in a project. That’s something. But they’re sufficiently distant from actively using the project, much less engaging with it deeply, that it’s hard to view downloads as a very useful number.

Is there any harm in these vanity metrics? Yes, to the degree that you start thinking that they’re something to base action on. Probably more seriously, stakeholders like company management or industry observers can come to see them as meaningful indicators of project health.

4. Understand what measurements really mean and how they relate to each other.

Neary makes this point to caution against myopia. “In one project I worked on,” he says, “some people were concerned about a recent spike in the number of bug reports coming in because it seemed like the project must have serious quality issues to resolve. However, when we looked at the numbers, it turned out that many of the bugs were coming in because a large company had recently started using the project. The increase in bug reports was actually a proxy for a big influx of new users, which was a good thing.”

In practice, you often have to measure through proxies. This isn’t an inherent problem, but the further you get between what you want to measure and what you’re actually measuring, the harder it is to connect the dots. It’s fine to track progress in closing bugs, writing code, and adding new features. However, those don’t necessarily correlate with how happy users are or whether the project is doing a good job of working towards its long-term objectives, whatever those may be.

5. Different measurements serve different purposes.

Some measurements may be non-obvious but useful for tracking the success of a project and community relative to internal goals. Others may be better suited for a press release or other external consumption. For example, as a community manager, you may really care about the number of meetups, mentoring sessions, and virtual briefings your community has held over the past three months. But it’s the number of contributions and contributors that are more likely to grab the headlines. You probably care about those too. But maybe not as much, depending upon your current priorities.

Still other measurements may relate to the goals of any sponsoring organizations. The measurements most relevant for projects tied to commercial products are likely to be different from those for pure community efforts.

Because communities differ and goals differ, it’s not possible to simply compile a metrics checklist, but here are some ideas to think about:

Consider qualitative metrics in addition to quantitative ones. Conducting surveys and other studies can be time-consuming, especially if they’re rigorous enough to yield better-than-anecdotal data. It also requires rigor to construct studies so that they can be used to track changes over time. In other words, it’s a lot easier to measure quantitative contributor activity than it is to suss out if the community members are happier about their participation today than they were a year ago. However, given the importance of culture to the health of a community, measuring it in a systematic way can be a worthwhile exercise.

Breadth of community, including how many contributors are unaffiliated with commercial entities, is important for many projects. The greater the breadth, the greater the potential leverage of the open source development process. It can also be instructive to see how companies and individuals are contributing. Projects can be explicitly designed to better accommodate casual contributors.

Are new contributors able to have an impact, or are they ignored? How long does it take for code contributions to get committed? How long does it take for a reported bug to be fixed or otherwise responded to? If they asked a question in a forum, did anyone answer them? In other words, are you letting contributors contribute?

Advancement within the project is also an important metric. Mikeal Rogers of the Node.js community explains: “The shift that we made was to create a support system and an education system to take a user and turn them into a contributor, first at a very low level, and educate them to bring them into the committer pool and eventually into the maintainer pool. The end result of this is that we have a wide range of skill sets. Rather than trying to attract phenomenal developers, we’re creating new phenomenal developers.”

Whatever metrics you choose, don’t forget why you made them metrics in the first place. I find a helpful question to ask is: “What am I going to do with this number?” If the answer is to just put it in a report or in a press release, that’s not a great answer. Metrics should be measurements that tell you either that you’re on the right path or that you need to take specific actions to course-correct.

For this reason, Stormy Peters, who handles community leads at Red Hat, argues for keeping it simple. She writes, “It’s much better to have one or two key metrics than to worry about all the possible metrics. You can capture all the possible metrics, but as a project, you should focus on moving one. It’s also better to have a simple metric that correlates directly to something in the real world than a metric that is a complicated formula or ratio between multiple things. As project members make decisions, you want them to be able to intuitively feel whether or not it will affect the project’s key metric in the right direction.”

Source

The developer of Smith and Winston made an interesting blog post about supporting multiple platforms

I recently talked about the Steam release of Smith and Winston, but I didn’t realise until late last night that the developer actually made a very interesting blog post about supporting multiple platforms.

Interesting enough that it warranted an extra post to talk a little about it. Why? Well, a bit of a situation started when game porter Ethan Lee made a certain Twitter post, a joke aimed at developers who see Linux as “Too niche” while practically falling over themselves to get their games on every other new platform that appears. This Twitter post was shared around (a lot), some developers (like this) ended up mentioning how Linux doesn’t sell a lot of games, and it continued spreading like wildfire.

There have been a lot of counter-arguments for Linux too, like this and this and this and this, and a nice one thrown our way too. Oh, and we even spoke to Tommy Refenes, who said the next SMB should come to Linux too, so that was awesome. Additionally, Ethan Lee also wrote up a post about packaging Linux games, worth a read if you’re new to packaging for Linux.

Where was I again? Right, the blog post from the developer of Smith and Winston about how they support Windows, Mac and Linux. They go over details about how they do so, from using SDL2 which they say “takes 90% of the pain away from platform support” to the cross-platform rendering library bgfx. It’s just a really interesting insight into how developing across multiple platforms doesn’t have to be overly difficult.

I especially liked these parts:

I’ve been writing games and engines for 30+ years so none of this is new, I have a lot of experience. But you only get the experience by doing it and not making excuses.

By forcing the game through different compilers (Visual C++, Clang and GCC) you find different code bugs (leave the warnings on max). By forcing the runtime to use different memory allocators, threading libraries and rendering technologies you find different runtime bugs. This makes your code way more stable on every platform. Even if you never deploy your code to a non-Windows platform, just running it on Linux or macOS will expose crashes instantly that rarely appear on Windows. So delivering a quality product on the dominant platform is easier if you support the minor platforms as well.
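
To make that point a little more concrete, here’s a minimal sketch of what pushing one translation unit through two compilers with warnings turned up might look like (the file name and flag choices are purely illustrative, not taken from their post):

g++ -Wall -Wextra -Wpedantic -c game.cpp -o /dev/null
clang++ -Wall -Wextra -Wpedantic -c game.cpp -o /dev/null

Each compiler tends to flag a slightly different set of issues, which is exactly the effect the developer describes.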

They also clearly mention that they might not even make their money back on the Linux port of Smith and Winston. However, they’re clear that the other reasons (code quality, easier porting to other platforms and so on) help make up for it. This is a similar point the people from Stardock also made on Reddit.

See the post here in full. If you wish to check out their game, Smith and Winston, it’s available on itch.io and Steam in Early Access.

Source

The Future of Linux – OSnews

Linux news is getting more and more exciting, and somehow managing to get less and less interesting. Why? Because the speed of development is getting so rapid that it’s hard to get excited for each upcoming release, keep your system completely up to date, and even remember what the current versions of your favorite distributions are. This breakneck pace of development means good and bad things, but I have a few ideas about how I expect it to turn out.

The opinions in this piece are those of the author and not necessarily those of osnews.com

There are literally hundreds, if not thousands of distributions out there. In fact, with Knoppix, almost anyone can make his own. Each season, it seems we watch some distributions fold and others form. It’s getting harder and harder to tell them apart. Think you’re an expert? Answer these questions quickly:

5) Name three source-based distros.

According to a recent post on Distrowatch.com, “It is time to face the facts: the number of Linux distributions is growing at an alarming rate. On average, around 3 – 4 new distributions are submitted to this site every week, a fact that makes maintaining the individual pages and monitoring new releases increasingly time consuming. The DistroWatch database now lists a total of 205 Linux distributions (of which 24 have been officially discontinued) with 75 more on the waiting list. It is no longer easy to keep up.” Distributions change often, as does the popularity of each. Keeping up is almost impossible. Many Linux users install new distributions every few days, weeks, or months. Sadly, many of these folks keep a Windows installation – not because they prefer Windows, but because it’s a “safe haven” for their data which can’t find a permanent home on any given Linux distribution. Can this pace continue? I say no.

Predicting the future is always risky for an author, especially one who contributes to internet sites, where your words are often instantly accessible to the curious. But I’m going to put my money on the table and take some guesses about the future of Linux. Here, in no particular order, are six theories that I believe are inevitabilities. Keep in mind that although I’ve been liberal in tone, nearly everything in this piece is speculation or opinion and is subject to debate. Not all of these theories are necessarily entirely original thought, but all arguments are.

1) Major Linux distributions will collapse into a small, powerful group.
“Major players” in the Linux market, until recently, included Red Hat, SuSE, Mandrake, Debian, and Slackware. Some would argue more or less, but now you have a number of popular distros making inroads into the community: Xandros, LindowsOS, and Gentoo, to name a few. Another fringe including Yoper, ELX, and TurboLinux are making plays for corporate desktops. I’m coining a new term for this era of Linux computing: distribution bloat. We have hundreds of groups offering us what are essentially minor tweaks and optimizations of a very similar base. This cannot continue at this pace. There will, from this point on, be a growing number of Linux installation packages as people become more skilled, but there will be fewer distributions on a mass scale as commercial Linux stabilizes.

I think we’ll see the commercial Linux market boil down to two or three players, and this has already begun. I expect it to be a Ximian-ized Novell/SUSE distribution, Red Hat, and some sort of Debian offshoot – whether it’s User Linux or not remains to be seen. Sun’s Linux offering, Java Desktop System, will be deployed in Solaris committed companies and not much more.

2) Neither KDE nor Gnome will “win;” a third desktop environment will emerge.
The KDE/Gnome debate is a troll’s dream come true. People are often passionate about their desktop environment. I believe they both have strengths and weaknesses. However, a third DE, with a clean and usable base, will emerge in time, its sole mission to unify the Linux GUI. Only when there is true consistency of the look and feel of the desktop, or close to it, will Linux become a viable home OS for an average user. Currently, we see this consistency forged by common Qt and GTK themes, and offerings like Ximian Desktop which attempts to mask the different nature of each application. This is not about lack of choice – it is, however, about not allowing choice to supersede usability of the whole product.

Features that a desktop must include are obvious by now: cut & paste must work the same way throughout the OS, menus must be the same in all file manager windows, the same shortcut keys must apply in all applications, and all applications must have the same window borders. Many seemingly basic tasks like these haven’t entirely matured, or in some cases, been accomplished at all yet.

In any event, the DE’s importance will lessen once greater platform neutrality exists. This will doubtlessly cause many to argue that I am wrong – admittedly, it’s a tall order especially with Gnome and KDE becoming established and accomplishing so much. I maintain that unless there is some sort of merging, not a set of standards like freedesktop.org, but rather, a common base for development, that there will be a fragmented feel to Linux that simply doesn’t exist in Windows today.

3) Distribution optimization will become more prevalent
Most distributions today can be used for anything – a desktop system, a web server, a file server, a firewall, DNS, etc. I am of the firm belief that Windows’ greatest downfall on the server is that it has been a glorified desktop for too long. The file extensions are still hidden by default, you’re forced to run a GUI, and you can still run all your applications on the system. I predict that we’ll start to see flavors within distributions tweaked at the source level for optimization. Systems made to run as a desktop will have many different default options from their server-optimized counterparts.

4) Integration will force the ultimate “killer app”
I predict an open, central authentication system will take the Linux world by storm. There still isn’t a Linux counterpart to NDS/eDirectory or Active Directory that makes user management across the network as simple as either of the two. While eDirectory will run on Linux, there is no open standard with a GUI management tool that automates this mass management. An authentication service whose job is only to watch resources including users, devices, applications, and files doesn’t exist and can’t be built without serious Linux know-how. This service, which I’ll casually refer to as LCAS (Linux Central Authentication System) for lack of a better term, will be as easy to establish as a new Microsoft domain’s Active Directory.

LCAS will operate using completely open standards (X.500/LDAP) and will be easily ported to the BSDs and to commercial Unixes. Unlike Active Directory, LCAS services will be portable, and stored in a variety of databases, including PostgreSQL, MySQL, and even Oracle and DB2. LCAS, like Linux, will be pluggable, so that as it matures, management of other objects, like routers and switches, your firewall, and even workstations and PDAs and eventually, general network and local policies, will be controllable from your network LCAS installation. Perhaps, in time, it will also manage objects on the internet and how they can act within your network. I envision the ability to block, say, a particularly annoying application’s HTTP traffic, the ability for certain users to use specified protocols, or installing internet printers via LCAS.

5) Releases will become less frequent, and updates more common
There is a competition for versioning in the Linux world, as though higher version numbers are somehow “better.” Version inflation is commonplace, with companies incrementing the major version for minor overall updates, and going from X.1 to (X+1) after a few application updates and a minor kernel increase. There is also a trend in which, eventually, the version number gets too high and is abandoned in favor of less harsh-sounding names. No one would upgrade Windows every six months, so why upgrade Linux every six months? Because the software gets better too quickly! And the best way to get new software that works is to upgrade the whole distro! This is backward. The software should be incidental to the distro, not the reason for its version stamp.

Gentoo Linux just changed their release engineering guide specs to include a year number with subsequent point releases. This, I think, is the right idea. I predict that we’ll start to see releases like DistroX 2004, DistroX 2005. As a counterpart, we’ll begin to see downloadable updates like service packs, like DistroX 2004 Update2. These updates will be easily installable and will update and patch not only the OS, but all components that came with the distro.

It is not unlikely that we’ll see a front end installer that launches, checks your system and the server, asks which pieces you want upgraded, and then processes it. There are systems like this in place today, however, they are constantly updated. Too often, people don’t patch or update, they just reinstall. We’re going to see only security updates for each distro, and approximately quarterly, we’ll see an official Update. Updates distributed in this fashion are much more likely to be applied by a common user than the slew of updates issued on an almost daily basis. Updates like this allow users to utilize a modern system much longer in between releases – for years in some cases. Unless OpenCarpet catches on, I see a service pack mentality prevailing for all commercial distributions.

6) Linux-approved hardware will become common
Part of the fight for usable Linux is with device drivers and hardware. Getting your video card to work properly, even with a binary driver available, is still way too hard. While this isn’t always the fault of the hardware, we will see, in time, Linux-approved hardware. The hardware will include Linux drivers on the accompanying disk. There will be a certification process that tests hardware against a certain set of standards. Soon, a Tux badge on a PC case will be as commonplace as the “Built for Windows XX” stickers on most cases today.

I don’t claim to be visionary by any means. I also don’t want to forcefully bring spirituality into the mix, but I believe all things exist in waves, with highs and lows. Linux started small, it’s gained an audience, and as it swells to a large point, we, the community, should anticipate the future refold of things. The eventual downswing shouldn’t be an implosion, but rather, an opportunity to organize and streamline the existence of free software. It doesn’t have to be a reduction in use, it can be a simple cooperation, reduction of market saturation, and convergence towards standards.

Within the next two years, we’ll likely see Linux kernel 2.8, Gnome 3, and KDE 4. We’ll see exciting new projects. We’ll see many new Linux distributions and many existing ones disappear. We’ll see the pre-Microsoft Longhorn media blitz. And I bet, not too much longer than that, we’ll see some of the above start to become a reality as well.

Adam Scheinberg is a regular contributor to osnews.

Source

Download TurnKey Django Live CD 15.1

TurnKey Django Live CD is a free and open source software appliance, a special Debian-based operating system that has been designed from the ground up to provide users with an easy-to-use solution for deploying dedicated Django servers with minimum effort.

Django is an open source high-level Python web framework that promotes rapid application development, as well as pragmatic, clean design. The appliance comes with a pre-configured Django example project, which is installed by default in /var/www/project.

This Django project is integrated with the Apache web server using the mod_wsgi module, as well as with the MySQL database server and the Postfix mail server. In addition, it includes an administration console that features embedded online documentation.

Among other interesting components of this TurnKey appliance, we can mention the IPython command shell for interactive computing, Webmin modules for configuring the MySQL and Apache servers, as well as SSL for secure connections.

When installing this appliance, users should keep in mind that the default username for the Webmin, SSH and MySQL components is root, and that the default Django admin console username is admin. After installation, users will be able to enter new passwords for these accounts.

In order to have a fully functional Django server, you’ll also have to add a valid email address for the Django ‘admin’ account. Optionally, you can initialize the TurnKey Hub services for securely storing your files, databases and package management information.

The appliance is distributed as Live CD ISO images, allowing users to try it without installing anything on their computers. However, their main purpose is to install the operating system on a local disk drive. In addition to the Live CDs, the appliance is also available for download as virtual machine images for Xen, OVF, OpenNode, OpenVZ and OpenStack virtualization technologies.

Source

Python Seaborn Tutorial – Linux Hint

In this lesson on the Python Seaborn library, we will look at various aspects of this data visualisation library, which we can use with Python to generate beautiful and intuitive graphs that present data in the form a business wants from a platform. To make this lesson complete, we will cover the following sections:

  • What is Python Seaborn?
  • Types of Plots we can construct with Seaborn
  • Working with Multiple plots
  • Some alternatives for Python Seaborn

This looks like a lot to cover. Let us get started now.

What is Python Seaborn library?

The Seaborn library is a Python package which allows us to make infographics based on statistical data. As it is built on top of matplotlib, it is inherently compatible with it. Additionally, it supports NumPy and Pandas data structures, so plotting can be done directly from those collections.

Visualising complex data is one of the most important things Seaborn takes care of. If we were to compare Matplotlib to Seaborn, Seaborn makes easy many things that are hard to achieve with Matplotlib. However, it is important to note that Seaborn is not an alternative to Matplotlib but a complement to it. Throughout this lesson, we will make use of Matplotlib functions in the code snippets as well. You might choose to work with Seaborn in the following use cases:

  • You have statistical time series data to be plotted with representation of uncertainty around the estimates
  • To visually establish the difference between two subsets of data
  • To visualise the univariate and bivariate distributions
  • Adding much more visual appeal to matplotlib plots with many built-in themes
  • To fit and visualise machine learning models through linear regression with independent and dependent variables

Just a note before starting: we use a virtual environment for this lesson, which we made with the following commands:

python -m virtualenv seaborn
source seaborn/bin/activate

Once the virtual environment is active, we can install Seaborn library within the virtual env so that examples we create next can be executed:

pip install seaborn

You can use Anaconda as well to run these examples, which is easier. If you want to install it on your machine, look at the lesson which describes “How to Install Anaconda Python on Ubuntu 18.04 LTS” and share your feedback. Now, let us move forward to the various types of plots which can be constructed with Python Seaborn.

To keep this lesson hands-on, we will use the Pokemon dataset, which can be downloaded from Kaggle. To import this dataset into our program, we will be using the Pandas library. Here are all the imports we perform in our program:

import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns

Now, we can import the dataset into our program and show some of the sample data with Pandas as:

df = pd.read_csv('Pokemon.csv', index_col=0)
df.head()

Note that to run the above code snippet, the CSV dataset should be present in the same directory as the program itself. Once we run the above code snippet, we will see the following output (in Anaconda Jupyter’s notebook):

Plotting Linear Regression curve

One of the best things about Seaborn is the intelligent plotting functions it provides, which not only visualise the dataset we provide but also construct regression models around it. For example, it is possible to construct a linear regression plot with a single line of code. Here is how to do this:

sns.lmplot(x='Attack', y='Defense', data=df)

Once we run the above code snippet, we will see the following output:

We notice a few important things in the above code snippet:

  • There is a dedicated plotting function available in Seaborn
  • We used Seaborn’s fitting and plotting function which provided us with a linear regression line which it modelled itself

Don’t be afraid if you thought we cannot have a plot without that regression line. We can! Let’s try a new code snippet now, similar to the last one:

sns.lmplot(x='Attack', y='Defense', data=df, fit_reg=False)

This time, we will not see the regression line in our plot:

Now this is much clearer (if we do not need the linear regression line). But we are not done yet. Seaborn allows us to vary this plot in different ways, and that is what we will be doing.

Constructing Box Plots

One of the greatest features of Seaborn is how readily it accepts the Pandas DataFrame structure to plot data. We can simply pass a DataFrame to the Seaborn library so that it can construct a boxplot out of it:

sns.boxplot(data=df)

Once we run the above code snippet, we will see the following output:

We can drop the Total column, as it looks a little awkward when we are actually plotting the individual stat columns here:

stats_df = df.drop(['Total'], axis=1)
# New boxplot using stats_df
sns.boxplot(data=stats_df)

Once we run the above code snippet, we will see the following output:

Swarm Plot with Seaborn

We can construct an intuitive swarm plot with Seaborn. We will again be using the DataFrame from Pandas which we loaded earlier, but this time we will be calling Matplotlib’s show function to display the plot we made. Here is the code snippet:

sns.set_context("paper")
sns.swarmplot(x="Attack", y="Defense", data=df)
plt.show()

Once we run the above code snippet, we will see the following output:

By using a Seaborn context, we allow Seaborn to add a personal touch and fluid design to the plot. It is possible to customise this plot even further, with a custom font size used for labels in the plot to make reading easier. To do this, we will be passing more parameters to the set_context function, which perform just as they sound. For example, to modify the font size of the labels, we will make use of the font.size parameter. Here is the code snippet to do the modification:

sns.set_context("paper", font_scale=3, rc={"font.size":8,"axes.labelsize":5})
sns.swarmplot(x="Attack", y="Defense", data=df)
plt.show()

Once we run the above code snippet, we will see the following output:

The font size for the labels was changed based on the parameters we provided and the value associated with the font.size parameter. One thing Seaborn excels at is making plots intuitive for practical usage, which means Seaborn is not just a Python package for practice but actually something we can use in our production deployments.

Adding a Title to plots

It is easy to add titles to our plots. We just need to follow the simple procedure of using the Axes-level functions, where we call the set_title() function as shown in the code snippet here:

sns.set_context("paper", font_scale=3, rc={"font.size":8,"axes.labelsize":5})
my_plot = sns.swarmplot(x="Attack", y="Defense", data=df)
my_plot.set_title("LH Swarm Plot")
plt.show()

Once we run the above code snippet, we will see the following output:

This way, we can add much more information to our plots.

Seaborn vs Matplotlib

As we looked at the examples in this lesson, we can see that Matplotlib and Seaborn cannot be directly compared, but they can be seen as complementing each other. One of the features which puts Seaborn one step ahead is the way Seaborn can visualise data statistically.

To make the best of Seaborn’s parameters, we highly recommend looking at the Seaborn documentation and finding out what parameters to use to make your plot as close to business needs as possible.

Conclusion

In this lesson, we looked at various aspects of this data visualisation library, which we can use with Python to generate beautiful and intuitive graphs that present data in the form a business wants. Seaborn is one of the most important visualisation libraries when it comes to data engineering and presenting data in visual form, and it is definitely a skill to have under our belt, as it even allows us to fit and visualise linear regression models.

Source

GCC vs. Clang Compiler Performance On NVIDIA Xavier’s Carmel ARMv8 Cores

Since receiving the powerful NVIDIA Jetson AGX Xavier with its ARMv8 Carmel cores on this Tegra194 SoC a while back, it’s been quite a fun developer board for benchmarking and various Linux tests. One of the areas I was curious about was whether GCC or Clang would generate faster code for this high performance ARM SoC, so here are some benchmarks.

This CPU compiler benchmarking was done with the NVIDIA Jetson AGX Xavier while running the Ubuntu 18.04 LTS default L4T file-system and comparing the default GCC 7.3.0 against LLVM Clang 6.0 compiler options as officially supported by Ubuntu LTS Bionic Beaver. These are also the compiler versions supported by NVIDIA with their Tegra software on this Linux 4 Tegra sample file-system. The NVIDIA Tegra Xavier (T194) SoC as a reminder has eight “Carmel” ARMv8 CPU cores that are custom designed by NVIDIA. Tests on other more common ARMv8 cores with these different compilers will be coming up in future Phoronix articles with Clang 8 and GCC 9 releasing later this quarter. Rounding out this powerful Jetson AGX Xavier is the Volta GPU with 512 CUDA cores, 16GB of LPDDR4 system memory, 32GB of eMMC storage, two NVDLA deep learning accelerators, and a 7-way vision processor, granted those aren’t the focus of today’s testing.

Via the Phoronix Test Suite, a wide range of C/C++ benchmarks were carried out on this platform to see how the GCC and Clang compilers compare on Ubuntu 18.04 LTS.
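
For anyone wanting to reproduce this kind of comparison, the general pattern with the Phoronix Test Suite is to point the usual CC/CXX environment variables at each compiler and run the same test profile under both. A rough sketch only, with the test profile chosen purely as an example:

CC=gcc CXX=g++ phoronix-test-suite benchmark pts/compress-7zip
CC=clang CXX=clang++ phoronix-test-suite benchmark pts/compress-7zip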

Source

Introduction to Linux Server Security Hardening

Securing your Linux server(s) is a difficult and time-consuming task for system administrators, but it’s necessary to harden the server’s security to keep it safe from attackers and black hat hackers. You can secure your server by configuring the system properly and installing as little software as possible. Here are some tips which can help you secure your server against network and privilege escalation attacks.

Upgrade your Kernel

An outdated kernel is always prone to several network and privilege escalation attacks, so keep your kernel updated using apt on Debian or yum/dnf on Fedora.

sudo apt-get update
sudo apt-get dist-upgrade
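
The commands above are for Debian/Ubuntu-based systems; on Fedora or RHEL/CentOS the equivalent would be along these lines:

sudo yum update -y     # RHEL/CentOS
sudo dnf upgrade -y    # newer Fedora releases

Reboot afterwards so the new kernel is actually loaded.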

Disabling Root Cron Jobs

Cron jobs run by root or another high-privilege account can be used by attackers as a way to gain elevated privileges. You can see scheduled cron jobs with:

ls /etc/cron*
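
Beyond those directories, it is also worth reviewing root’s own crontab and removing any job you don’t recognise; a minimal sketch:

sudo crontab -l -u root    # list root's cron jobs
sudo crontab -e -u root    # edit them and delete suspicious or unneeded entries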

Strict Firewall Rules

You should block any unnecessary inbound or outbound connections on uncommon ports. You can update your firewall rules using iptables. Iptables is a very flexible and easy-to-use utility for blocking or allowing incoming or outgoing traffic. To install it, write:

sudo apt-get install iptables

Here’s an example of blocking incoming traffic on the FTP port using iptables:

iptables -A INPUT -p tcp --dport ftp -j DROP
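
A common hardening pattern is to allow only the traffic you actually need and drop everything else by default. A minimal sketch (make sure the SSH rule is added before switching the default policy to DROP, otherwise you can lock yourself out of a remote server):

# keep established connections and SSH working
sudo iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT
# drop everything else by default
sudo iptables -P INPUT DROP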

Disable unnecessary Services

Stop any unwanted services and daemons running on your system. You can list running services using the following commands.

ubuntu@ubuntu:~$ service --status-all

[ + ]  acpid
[ - ]  alsa-utils
[ - ]  anacron
[ + ]  apache-htcacheclean
[ + ]  apache2
[ + ]  apparmor
[ + ]  apport
[ + ]  avahi-daemon
[ + ]  binfmt-support
[ + ]  bluetooth
[ - ]  cgroupfs-mount

…snip…

OR using the following command

chkconfig --list | grep '3:on'

To stop a service, type

sudo service [SERVICE_NAME] stop

OR

sudo systemctl stop [SERVICE_NAME]

Check for Backdoors and Rootkits

Utilities like rkhunter and chkrootkit can be used to detect known and unknown backdoors and rootkits. They verify installed packages and configurations to check the system’s security. To install rkhunter, write:

ubuntu@ubuntu:~$ sudo apt-get install rkhunter -y

To scan your system, type

ubuntu@ubuntu:~$ sudo rkhunter --check

[ Rootkit Hunter version 1.4.6 ]

Checking system commands…

Performing 'strings' command checks
Checking 'strings' command                           [ OK ]

Performing ‘shared libraries’ checks
Checking for preloading variables                    [ None found ]
Checking for preloaded libraries                     [ None found ]
Checking LD_LIBRARY_PATH variable                    [ Not found ]

Performing file properties checks
Checking for prerequisites                           [ OK ]
/usr/sbin/adduser                                    [ OK ]
/usr/sbin/chroot                                      [ OK ]

…snip…
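
chkrootkit, mentioned above, works in much the same way and is useful as a second opinion:

sudo apt-get install chkrootkit -y
sudo chkrootkit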

Check Listening Ports

You should check for listening ports that aren’t needed and disable the services behind them. To check for open ports, write:

azad@ubuntu:~$ sudo netstat -ulpnt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address      Foreign Address   State      PID/Program name

tcp        0    0 127.0.0.1:6379        0.0.0.0:*        LISTEN     2136/redis-server 1

tcp        0    0 0.0.0.0:111           0.0.0.0:*        LISTEN     1273/rpcbind

tcp        0    0 127.0.0.1:5939        0.0.0.0:*        LISTEN     2989/teamviewerd

tcp        0    0 127.0.0.53:53         0.0.0.0:*        LISTEN     1287/systemd-resolv

tcp        0    0 0.0.0.0:22            0.0.0.0:*        LISTEN     1939/sshd

tcp        0    0 127.0.0.1:631         0.0.0.0:*        LISTEN     20042/cupsd

tcp        0    0 127.0.0.1:5432        0.0.0.0:*        LISTEN     1887/postgres

tcp        0    0 0.0.0.0:25            0.0.0.0:*        LISTEN     31259/master

…snip…
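
If one of the listening services isn’t actually needed (for example, a print daemon on a headless server), stop and disable it so it no longer opens a port at boot. A sketch, assuming the cupsd entry above is unneeded:

sudo systemctl stop cups
sudo systemctl disable cups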

Use an IDS (Intrusion Detection System)

Use an IDS to check network logs and to prevent any malicious activities. The open source IDS Snort is available for Linux. You can install it with:

wget https://www.snort.org/downloads/snort/daq-2.0.6.tar.gz
wget https://www.snort.org/downloads/snort/snort-2.9.12.tar.gz
tar xvzf daq-2.0.6.tar.gz
cd daq-2.0.6
./configure && make && sudo make install
tar xvzf snort-2.9.12.tar.gz
cd snort-2.9.12
./configure --enable-sourcefire && make && sudo make install

To monitor network traffic, type

ubuntu@ubuntu:~$ sudo snort

Running in packet dump mode
--== Initializing Snort ==--

Initializing Output Plugins!
pcap DAQ configured to passive.

Acquiring network traffic from “tun0”.
Decoding Raw IP4

--== Initialization Complete ==--

…snip…

Disable Logging in as Root

Root is a user with full privileges; it has the power to do anything on the system. Instead of logging in as root, you should enforce using sudo to run administrative commands.
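
For SSH in particular, a minimal sketch of disabling direct root logins (back up the file first; on Debian/Ubuntu the service may be named ssh rather than sshd):

sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sudo systemctl restart sshd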

Remove Files with no Owner

Files owned by no user or group can be a security threat. You should search for these files and remove them or assign them a proper user and group. To search for such files, type:

find /dir -xdev \( -nouser -o -nogroup \) -print

Use SSH and sFTP

For file transfers and remote administration, use SSH and sFTP instead of telnet and other insecure, open and unencrypted protocols. To install the servers, type:

sudo apt-get install vsftpd -y
sudo apt-get install openssh-server -y

Monitor Logs

Install and set up a log analyzer utility to check system logs and event data regularly and catch any suspicious activity. Type:

sudo apt-get install -y loganalyzer

Uninstall unused Software

Install as little software as possible to maintain a small attack surface. The more software you have, the more opportunities for attack you have. So remove any unneeded software from your system. To see installed packages, write:

dpkg --list
dpkg --status [PACKAGE_NAME]
apt list --installed

To remove a package:

sudo apt-get remove [PACKAGE_NAME] -y
sudo apt-get clean

Conclusion

Linux server security hardening is very important for enterprises and businesses. It’s a difficult and tiresome task for system administrators. Some of the work can be automated by utilities like SELinux and other similar software. Also, keeping installed software to a minimum and disabling unused services and ports reduces the attack surface.

Source

OpenStack Deployment using Devstack on CentOS 7 and RHEL7

Devstack is a collection of scripts which deploy the latest version of an OpenStack environment on a virtual machine, personal laptop or desktop. As the name suggests, it is meant for development environments; it can be used for functional testing of OpenStack projects, and an environment deployed by devstack can also be used for demonstrations and basic PoCs.

In this article I will demonstrate how to install OpenStack on a CentOS 7 / RHEL 7 system using Devstack. Following are the minimum system requirements:

  • Dual Core Processor
  • Minimum 8 GB RAM
  • 60 GB Hard Disk
  • Internet Connection

Following are the details of my lab setup for OpenStack deployment using devstack:

  • Minimal Installed CentOS 7 / RHEL 7 (VM)
  • Hostname – devstack-linuxtechi
  • IP Address – 169.144.104.230
  • 10 vCPU
  • 14 GB RAM
  • 60 GB Hard disk

Let’s start with the deployment steps. Log in to your CentOS 7 or RHEL 7 system.

Step:1) Update Your System and Set Hostname

Run the following yum command to apply the latest updates to the system and then reboot. After the reboot, set the hostname:

~]# yum update -y && reboot
~]# hostnamectl set-hostname "devstack-linuxtechi"
~]# exec bash

Step:2) Create a Stack user and assign sudo rights to it

All the installation steps are to be carried out by a user named “stack”; refer to the commands below to create the user and assign it sudo rights.

[root@devstack-linuxtechi ~]# useradd -s /bin/bash -d /opt/stack -m stack
[root@devstack-linuxtechi ~]# echo "stack ALL=(ALL) NOPASSWD: ALL" | sudo tee /etc/sudoers.d/stack
stack ALL=(ALL) NOPASSWD: ALL
[root@devstack-linuxtechi ~]#

Step:3) Install git and download devstack

Switch to the stack user and install the git package using yum:

[root@devstack-linuxtechi ~]# su - stack
[stack@devstack-linuxtechi ~]$ sudo yum install git -y

Download devstack using the git command below:

[stack@devstack-linuxtechi ~]$ git clone https://git.openstack.org/openstack-dev/devstack
Cloning into 'devstack'...
remote: Counting objects: 42729, done.
remote: Compressing objects: 100% (21438/21438), done.
remote: Total 42729 (delta 30283), reused 32549 (delta 20625)
Receiving objects: 100% (42729/42729), 8.93 MiB | 3.77 MiB/s, done.
Resolving deltas: 100% (30283/30283), done.
[stack@devstack-linuxtechi ~]$

Step:4) Create local.conf file and start openstack installation

To start the OpenStack installation using the devstack script (stack.sh), we first need to prepare a local.conf file that suits our setup.

Change to the devstack folder and create a local.conf file with the contents below:

[stack@devstack-linuxtechi ~]$ cd devstack/
[stack@devstack-linuxtechi devstack]$ vi local.conf
[[local|localrc]]
#Specify the IP Address of your VM / Server in front of HOST_IP Parameter
HOST_IP=169.144.104.230

#Specify the name of interface of your Server or VM in front of FLAT_INTERFACE
FLAT_INTERFACE=eth0

#Specify the Tenants Private Network and its Size
FIXED_RANGE=10.4.128.0/20
FIXED_NETWORK_SIZE=4096

#Specify the range of external IPs that will be used in Openstack for floating IPs
FLOATING_RANGE=172.24.10.0/24

#Whether OpenStack will be deployed across multiple hosts
MULTI_HOST=1

#Installation Logs file
LOGFILE=/opt/stack/logs/stack.sh.log

#KeyStone Admin Password / Database / RabbitMQ / Service Password
ADMIN_PASSWORD=openstack
DATABASE_PASSWORD=db-secret
RABBIT_PASSWORD=rb-secret
SERVICE_PASSWORD=sr-secret

#Additionally installing Heat Service
enable_plugin heat https://git.openstack.org/openstack/heat master
enable_service h-eng h-api h-api-cfn h-api-cw

Save and exit the file.

Now start the deployment or installation by executing the script (stack.sh)

[stack@devstack-linuxtechi devstack]$ ./stack.sh

It will take between 30 and 45 minutes, depending upon your internet connection.

If you get the below errors while running the above command:

+functions-common:git_timed:607            timeout -s SIGINT 0 git clone git://git.openstack.org/openstack/requirements.git /opt/stack/requirements --branch master
fatal: unable to connect to git.openstack.org:
git.openstack.org[0: 104.130.246.85]: errno=Connection timed out
git.openstack.org[1: 2001:4800:7819:103:be76:4eff:fe04:77e6]: errno=Network is unreachable
Cloning into '/opt/stack/requirements'...
+functions-common:git_timed:610            [[ 128 -ne 124 ]]
+functions-common:git_timed:611            die 611 'git call failed: [git clone' git://git.openstack.org/openstack/requirements.git /opt/stack/requirements --branch 'master]'
+functions-common:die:195                  local exitcode=0
[Call Trace]
./stack.sh:758:git_clone
/opt/stack/devstack/functions-common:547:git_timed
/opt/stack/devstack/functions-common:611:die
[ERROR] /opt/stack/devstack/functions-common:611 git call failed: [git clone git://git.openstack.org/openstack/requirements.git /opt/stack/requirements --branch master]
Error on exit
/bin/sh: brctl: command not found
[stack@devstack-linuxtechi devstack]$

To resolve these errors, perform the following steps.

Install the bridge-utils package and change the parameter from “GIT_BASE=${GIT_BASE:-git://git.openstack.org}” to “GIT_BASE=${GIT_BASE:-https://www.github.com}” in the stackrc file:

[stack@devstack-linuxtechi devstack]$ sudo yum install bridge-utils -y
[stack@devstack-linuxtechi devstack]$ vi stackrc
……
#GIT_BASE=${GIT_BASE:-git://git.openstack.org}
GIT_BASE=${GIT_BASE:-https://www.github.com}
……

Now re-run the stack.sh script,

[stack@devstack-linuxtechi devstack]$ ./stack.sh

Once the script has executed successfully, we will get output something like the following:

Stack-Command-output-CentOS7

This confirms that openstack has been deployed successfully,

Step:5) Access OpenStack either via Openstack CLI or Horizon Dashboard

If you want to perform any task from the OpenStack CLI, then you first have to source the openrc file, which contains the admin credentials.

[stack@devstack-linuxtechi devstack]$ source openrc
WARNING: setting legacy OS_TENANT_NAME to support cli tools.
[stack@devstack-linuxtechi devstack]$ openstack network list
+--------------------------------------+---------+----------------------------------------------------------------------------+
| ID                                   | Name    | Subnets                                                                    |
+--------------------------------------+---------+----------------------------------------------------------------------------+
| 5ae5a9e3-01ac-4cd2-86e3-83d079753457 | private | 9caa54cc-f5a4-4763-a79e-6927999db1a1, a5028df6-4208-45f3-8044-a7476c6cf3e7 |
| f9354f80-4d38-42fc-a51e-d3e6386b0c4c | public  | 0202c2f3-f6fd-4eae-8aa6-9bd784f7b27d, 18050a8c-41e5-4bae-8ab8-b500bc694f0c |
+--------------------------------------+---------+----------------------------------------------------------------------------+
[stack@devstack-linuxtechi devstack]$ openstack image list
+--------------------------------------+--------------------------+--------+
| ID                                   | Name                     | Status |
+--------------------------------------+--------------------------+--------+
| 5197ed8e-39d2-4eca-b36a-d38381b57adc | cirros-0.3.6-x86_64-disk | active |
+--------------------------------------+--------------------------+--------+
[stack@devstack-linuxtechi devstack]$
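
As a quick smoke test from the CLI, you could try booting an instance from the cirros image listed above. A sketch only (the m1.tiny flavor name is the usual devstack default and may differ in your environment):

openstack flavor list
openstack server create --flavor m1.tiny --image cirros-0.3.6-x86_64-disk --network private test-vm
openstack server list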

Now try accessing the Horizon dashboard; the URL details and credentials are already there in the stack.sh command output.

http://{Your-Server-IP-Address}/dashboard

Login-OpenStack-Dashboard-DevStack-CentOS7

Horizon-Dashboard-DevStack-CentOS7

Remove/ Uninstall OpenStack using devstack scripts

If you are done with testing and demonstration and want to remove OpenStack from your system, then run the following scripts as the stack user:

[stack@devstack-linuxtechi ~]$ cd devstack
[stack@devstack-linuxtechi devstack]$ ./clean.sh
[stack@devstack-linuxtechi devstack]$ ./unstack.sh
[stack@devstack-linuxtechi devstack]$ rm -rf /opt/stack/
[stack@devstack-linuxtechi ~]$ sudo rm -rf devstack
[stack@devstack-linuxtechi ~]$ sudo rm -rf /usr/local/bin/

That’s all from this tutorial. If you like the steps, please do share your valuable feedback and comments.

Source

C++ in the Linux kernel? – OSnews

OOP doesn’t imply using function pointers.

The essence of OOP is polymorphism, which you can achieve in C through function pointers.

What about just having a plain structure and associating functions with it? This is OO. In C, it forces you to adopt conventions like prefixing all your function names with the class name,…

I assume you’re referring to encapsulation, and I don’t see that having to adopt a convention of prefixing your function with an object name (you’re not forced to) is a big deal unless you hunt and peck.

…and explicitly dereference all the member variables whenever you access them.

I assume you’re referring to having to pass in a pointer to structure and then using -> to dereference within a function. I don’t see how that’s a problem.

What about access rights? There’s no way in C to forbid access to certain members of a structure. You have to document them in some way, by adding a comment next to the members you consider private. And it’s not enforced by the compiler, thus errors can creep in.

Programmers should be using the public API. Having public, protected, and private is no silver bullet.

Yes, programming using interfaces is useful, but there are lots of cases where objects with zero runtime overhead, but that implicitly do various housekeeping stuff, are very useful. Or even classes that are there only to keep your code from getting full of long and messy functions, but amount to nothing at runtime.

Interfaces with virtual functions have a runtime cost, and these functions can’t be inlined. They’re not a silver bullet.

You don’t pay for what you don’t use in C also…unlike Java where you’re relying on the runtime to be smart about all methods being virtual.

And what about error management anyway? You have to think of freeing resources everywhere, and C has no mechanism whatsoever to help you do it. I don’t think it’s much better than an exception system, even if exceptions have pitfalls that are not so hard to avoid.

Exceptions are useful, even though it has been pointed out that C++ exceptions have some major warts.

I understand your point, but I don’t think it’s really C++’s fault. It’s more that since it allows you to do these things more easily, it’s tempting to over-complicate stuff. I myself prefer straightforward code where I know what each class actually does, and not to have a vast number of layers of abstraction if I don’t need them.

Yes, you can do everything in C that you do in C++, like object programming and stuff.

The purpose of C++ isn’t to let you do things that can’t be done in C, it’s to simplify your life. Indeed, it can complicate it instead, but not if used properly.

Pretty much agree on these points. Shallow hierarchies and favoring composition are the way to go.

I’ve written lots of C++ code in my lifetime (more than any other language). I think I somewhat grok proper OO. I’ve got Stroustrup’s 3rd edition, I’ve got the GoF Design Patterns book, and I think patterns are useful.

I guess my biggest problem with C++ is the syntax and the overcomplication of the language that seems to stem from its legacy of needing to be C-backward compatible; I can’t stand C++ streams, and I think much of the standard library API is bad.

Source

Linux Today – 5 Useful Ways to Do Arithmetic in Linux Terminal

In this article, we will show you various useful ways of doing arithmetic in the Linux terminal. By the end of this article, you will have learned several practical ways of doing mathematical calculations on the command line.

Let’s get started!

1. Using Bash Shell

The first and easiest way to do basic math on the Linux CLI is using double parentheses. Here are some examples where we use values stored in variables:

$ ADD=$(( 1 + 2 ))
$ echo $ADD
$ MUL=$(( $ADD * 5 ))
$ echo $MUL
$ SUB=$(( $MUL - 5 ))
$ echo $SUB
$ DIV=$(( $SUB / 2 ))
$ echo $DIV
$ MOD=$(( $DIV % 2 ))
$ echo $MOD
Arithmetic in Linux Bash Shell
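
For reference, here is the arithmetic being done above, so you can check the values the echo commands print:

ADD = 1 + 2  = 3
MUL = 3 * 5  = 15
SUB = 15 - 5 = 10
DIV = 10 / 2 = 5
MOD = 5 % 2  = 1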

2. Using expr Command

The expr command evaluates expressions and prints the value of the provided expression to standard output. We will look at different ways of using expr for doing simple math, making comparisons, incrementing the value of a variable and finding the length of a string.

The following are some examples of doing simple calculations using the expr command. Note that many operators need to be escaped or quoted for shells, for instance the * operator (we will look at more under comparison of expressions).

$ expr 3 + 5
$ expr 15 % 3
$ expr 5 \* 3
$ expr 5 - 3
$ expr 20 / 4
Basic Arithmetic Using expr Command in Linux

Next, we will cover how to make comparisons. When an expression evaluates to false, expr will print a value of 0, otherwise it prints 1.

Let’s look at some examples:

$ expr 5 = 3
$ expr 5 = 5
$ expr 8 != 5
$ expr 8 \> 5
$ expr 8 \< 5
$ expr 8 \<= 5
Comparing Arithmetic Expressions in Linux

You can also use the expr command to increment the value of a variable. Take a look at the following example (in the same way, you can also decrease the value of a variable).

$ NUM=$(( 1 + 2))
$ echo $NUM
$ NUM=$(expr $NUM + 2)
$ echo $NUM
Increment Value of a Variable

Let’s also look at how to find the length of a string using:

$ expr length "This is Tecmint.com"
Find Length of a String

For more information especially on the meaning of the above operators, see the expr man page:

$ man expr

3. Using bc Command

bc (Basic Calculator) is a command-line utility that provides all features you expect from a simple scientific or financial calculator. It is specifically useful for doing floating point math.

If the bc command is not installed, you can install it using:

$ sudo apt install bc   #Debian/Ubuntu
$ sudo yum install bc   #RHEL/CentOS
$ sudo dnf install bc   #Fedora 22+

Once installed, you can run it in interactive mode or non-interactively by passing arguments to it – we will look at both cases. To run it interactively, type the command bc at the command prompt and start doing some math, as shown.

$ bc 
Start bc in Interactive Mode
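
Since the screenshot isn’t reproduced here, an interactive session might look something like this: after the version banner, you type one expression per line and bc prints the result (scale sets the number of decimal digits, and quit exits):

$ bc
scale=2
15 / 2
7.50
quit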

The following examples show how to use bc non-interactively on the command-line.

$ echo '3+5' | bc
$ echo '15 % 2' | bc
$ echo '15 / 2' | bc
$ echo '(6 * 2) - 5' | bc
Do Math Using bc in Linux

The -l flag is used to set the default scale (digits after the decimal point) to 20, for example:

$ echo '12/5' | bc
$ echo '12/5' | bc -l
Do Math with Floating Numbers

4. Using Awk Command

Awk is one of the most prominent text-processing programs in GNU/Linux. It supports the addition, subtraction, multiplication, division, and modulus arithmetic operators. It is also useful for doing floating point math.

You can use it to do basic math as shown.

$ awk 'BEGIN { a = 6; b = 2; print "(a + b) = ", (a + b) }'
$ awk 'BEGIN { a = 6; b = 2; print "(a - b) = ", (a - b) }'
$ awk 'BEGIN { a = 6; b = 2; print "(a *  b) = ", (a * b) }'
$ awk 'BEGIN { a = 6; b = 2; print "(a / b) = ", (a / b) }'
$ awk 'BEGIN { a = 6; b = 2; print "(a % b) = ", (a % b) }'
Do Basic Math Using Awk Command

If you are new to Awk, we have a complete series of guides to get you started with learning it: Learn Awk Text Processing Tool.

5. Using factor Command

The factor command is used to decompose an integer into prime factors. For example:

$ factor 10
$ factor 127
$ factor 222
$ factor 110  
Factor a Number in Linux
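
For reference, the prime factorizations these commands print are:

10: 2 5
127: 127
222: 2 3 37
110: 2 5 11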

That’s all! In this article, we have explained various useful ways of doing arithmetic in the Linux terminal.

Source
