AWS Systems Manager Now Supports Multi-Account and Multi-Region Inventory View
Posted On: Nov 15, 2018
AWS Systems Manager, which provides information about your instances and the software installed on them, now supports a multi-account, multi-Region view. With this enhancement, you can simplify your workflow by centrally viewing, storing, and exporting inventory data across your accounts from a single console.
From the Systems Manager console, you can further customize the data displayed by using a pre-defined set of queries. From the same screen, you can easily download all of your data as a CSV file to generate reports on fleet patch compliance, installed applications, or network configuration. Additionally, integration with AWS Glue and Amazon Athena allows you to consolidate inventory data from multiple accounts and regions.
Shotcut Video Editor Adds VA-API Encoding Support For Linux, Other Improvements
Shotcut, a free and open source video editor, was updated to version 18.11.13 yesterday. The new release includes VA-API encoding support on Linux, as well as a new option to use a hardware encoder on the export screen, among other improvements.
Shotcut is a free video editor for Linux, macOS and Windows. It includes a wide range of functions, from editing features like trimming, cutting, copying and pasting, to video effects or audio features like peak meter, loudness, waveform, volume control, audio filters, and so on.
There’s much more that Shotcut can do, including editing 4K videos, capturing audio, and network streaming. See its features page for an in-depth list.
The application, which uses Qt5 and makes use of the MLT Multimedia Framework, supports a wide range of formats thanks to FFmpeg, and it features an intuitive interface with multiple dockable panels.
The latest Shotcut 18.11.13 adds VA-API encoding support for Linux (H.264/AVC and H.265/HEVC codecs). To enable this, you can use a newly added Use hardware encoder checkbox from the Export Video panel, then click Configure and select h264_vaapi or hevc_vaapi:
Another change in this version of Shotcut is the addition of a New Project / Recent Projects screen that’s displayed when creating a new project (File > New).
The update also brings a simple / advanced export mode. When you export a video (File > Export Video), you’ll now see a simplified panel that lets you enable and configure the hardware encoder, along with a message explaining that the default settings are suitable for most users and purposes, as well as the presets. A new Advanced button at the bottom lets you specify video settings like resolution, frame rate, codecs, and so on.
Other changes worth mentioning in Shotcut 18.11.13 include:
- Added 10 and 20 Pixel Grid options to the player grid button menu
- Added View > Scopes > Video Waveform
- Added Settings > Video Mode > Non-Broadcast > Square 1080p 30 fps and 60 fps
- Added Ut Video export presets
- Added Spot Remover video filter
- Increased Scale maximum to 500% for Rotate and Scale filter
- Made GPU Effects hidden and discouraged
- macOS: added VideoToolbox encoding, a signed app bundle, and fixed support for macOS 10.10 and 10.11
- Fixed issues like hanging on exit, crash when undoing split and transition on Timeline, etc.
You can see a complete list of changes here.
Download Shotcut video editor
On the Shotcut download page you’ll find macOS, Linux and Windows binaries. For Linux there are official AppImage and portable tar binaries, as well as links to the Shotcut
Flathub and Snapcraft pages (from where you can install the app as a Flatpak or Snap package). The Flatpak package has not yet been updated to the latest Shotcut 18.11.13 though.
Getting Started with Scilab | Linux Journal
Introducing one of the larger scientific lab packages for Linux.
Scilab
is meant to be an overall package for numerical science, along the
lines of Maple, Matlab or Mathematica. Although a lot of built-in
functionality exists for all sorts of scientific computations, Scilab
also includes its own programming language, which allows you to use that functionality
to its utmost. If you prefer, you instead can use this language to extend
Scilab’s functionality into completely new areas of research. Some of
the functionality includes 2D and 3D visualization and optimization tools,
as well as statistical functions. Also included in Scilab is Xcos, an
editor for
designing dynamical systems models.
Several options exist for installing Scilab on your system. Most package
management systems should have one or more packages available for
Scilab, which also will install several support packages. Or, you
simply can download and install a tarball that contains
everything you need to be able to run Scilab on your system.
Once
it’s installed, start the GUI version of Scilab with
the scilab command. If you installed Scilab via tarball, this command will
be located in the bin subdirectory where you unpacked the tarball.
When
it first starts, you should see a full workspace created for your
project.
Figure 1. When you first start Scilab, you’ll see an empty
workspace ready for you to start a new project.
On the left-hand side is a file browser where you can see data
files and Scilab scripts. The right-hand side has several
panes. The top pane is a variable browser, where you can see what
currently exists within the workspace. The middle pane contains a
list of commands within that workspace, and the bottom pane has
a news feed of Scilab-related news. The center of the workspace is the
actual Scilab console where you can interact with the execution engine.
Let’s start with some basic mathematics—for example,
division:
--> 23/7
ans =
3.2857143
As you can see, the command prompt is -->, where you enter the
next command to the execution engine. In the variable browser, you
can see a new variable named ans that contains the results of the
calculation.
Along with basic arithmetic, there are also a number of built-in functions. One thing to be aware of is that these function names are
case-sensitive. For example, the statement sqrt(9) gives the answer
of 3, whereas the statement SQRT(9) returns an error.
There
also are built-in constants for numbers like e or pi. You can use them
in statements, like this command to find the sine of pi/2:
--> sin(%pi / 2)
ans =
1.
If you don’t remember exactly what a function name is, but you remember how
it starts, you can use the tab-completion functionality in the Scilab
console. For example, you can see what functions start with “fa” by
typing those two letters and then pressing the tab key.
Figure 2. Use tab-completion to avoid typos while typing
commands in the Scilab console.
You can assign variables with the “=” symbol. For example,
assign your age to the age variable with:
--> age = 47
age =
47.
You then can access this variable directly:
--> age
age =
47.
The variable also will be visible in the variable browser pane. Accessing
a variable this way essentially executes it as a statement, which is why you
get the extra output. If you want to see only the value, use
the disp() function, which provides output like the following:
--> disp(age)
47.
Before moving onto more complex ideas, you’ll need to move out of the
console. The advantage of the console is that statements are executed
immediately. But, that’s also its disadvantage. To write
larger pieces of code, you’ll want to use the included editor. Click
the Applications→SciNotes menu item to open a new window where
you can enter larger programs.
Figure 3. The SciNotes application lets you write larger programs
and then run them within Scilab as a single unit.
Once you’ve finished writing your code, you can run it either by clicking
the run icon on the toolbar or selecting one of the options under the
Execute menu item. When you do this, SciNotes will ask you to save
your code to a file, with the file ending “.sce”, before running. Then,
it gets the console to run this file with the following command:
exec('/home/jbernard/temp/scilab-6.0.1/bin/test1.sce', -1)
If you create or receive a Scilab file outside of Scilab, you can run it
yourself using a similar command.
To build more complex calculations, you also need a way to
make comparisons and loop over several calculations. Comparisons
can be done with either:
if …. then
stmts
end
or:
if …. then
stmts
else
stmts
end
or:
if …. then
stmts
elseif …. then
stmts
else
stmts
end
As you can see, the if and elseif lines need to
end with then. You can
have as many elseif sections as you need for your particular case. Also,
note that the entire comparison block needs to end with the
end statement.
There
also are two types of looping commands: for loops and
while loops. As
an example, you could use the following to find the square roots
of the first 100 numbers:
for i=1:100
a = sqrt(i)
disp(a)
end
The for loop takes a sequence of numbers, defined by
start:end,
and each value is iteratively assigned to the dummy variable i. Then
you have your code block within the for loop and close it with the
statement end.
The while loop is similar, except it uses a comparison
statement to decide when to exit the loop.
The last quick item I want to cover is the graphing functionality
available within Scilab. You can create both 2D and 3D graphs,
and you can plot data files or the results of
functions. For example, the following plots the sine function
from 0 to pi*4:
t = linspace(0, 4 * %pi, 100)
plot(t, sin(t))
Figure 4. Calling the plot function opens a new viewing
window where you can see the generated graphs.
You can use the linspace command to generate the list of values over
which the function will be executed. The plot function opens a new
window to display the resultant graph. Use the commands under
the Edit menu item to change the plot’s details before saving the
results to an image file.
You can do 3D graphs just as simply. The
following plots a parametric curve over 0 to 4*pi:
t=linspace(0,4*%pi,100); param3d(cos(t),sin(t),t)
This also opens a new plotting window to display the results. If
the default view isn’t appropriate, click
Tools→2D/3D Rotation, and with this selected, right-click
on the graph and rotate it around for a better view of
the result.
Scilab is a very powerful tool for many types of
computational science. Since it’s available on Linux, macOS and
Windows, it’s a great option if you’re collaborating with other
people across multiple operating systems. It might also prove to be an
effective tool to use in teaching environments, giving students
access to a powerful computational platform for no cost, no matter
what type of computer they are using. I hope this short article has
provided some ideas of how it might be useful to you. I’ve
barely covered the many capabilities available
with Scilab, so be sure to visit the main
website for a number of good tutorials.
AWS Elemental MediaPackage Extends Live Channel Archive Window to 14 Days
Posted On: Nov 15, 2018
The archive of content available for an AWS Elemental MediaPackage live channel is increasing from three to 14 days. This means you can access up to 14 days of live stream content for start-over, catch-up, and other DVR-like features. This larger archive is included in the same pay-as-you-go pricing, so there is no additional cost or price increase.
With MediaPackage, you can reduce workflow complexity, increase origin resiliency, and better protect multiscreen content without the risk of under or over-provisioning infrastructure. To learn more, please visit aws.amazon.com/mediapackage/.
AWS Elemental MediaPackage is available in the US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Seoul), Asia Pacific (Tokyo), EU (Ireland), EU (Frankfurt), EU (Paris), and South America (São Paulo) regions.
MediaPackage functions independently or as part of AWS Elemental Media Services, a family of services that form the foundation of cloud-based video workflows and offer the capabilities you need to create, package, and deliver video.
Storing BASH command output in a variable
Different types of bash commands need to be run from the terminal based on the user’s requirements. When the user runs a command from the terminal, it shows the output if no error exists; otherwise, it shows an error message. Sometimes, the output of a command needs to be stored in a variable for later use. The shell command substitution feature of bash can be used for this purpose. This tutorial shows how you can store different types of shell command output in a variable by using this feature.
variable=$(command)
variable=$(command [option…] argument1 argument2 …)
variable=$(/path/to/command)
OR
variable=`command`
variable=`command [option…] argument1 argument2 …`
variable=`/path/to/command`
***Note: Don’t use any space before or after the equal sign when using the above commands.
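As a quick illustration, both forms below capture the same output (a minimal sketch; the variable name is only illustrative):

$ kernel_name=$(uname -s)
$ kernel_name=`uname -s`
$ echo "The kernel name is $kernel_name"

The $(command) form is generally easier to read and can be nested without escaping, which is why it is preferred in most modern scripts.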
Single command output to a variable
Bash commands can be used without any options and arguments if those parts are optional for the command. The following two examples show the uses of simple command substitution.
Example#1:
The bash `date` command is used to show the current date and time. The following script will store the output of the `date` command in the $current_date variable by using command substitution.
$ current_date=$(date)
$ echo "Today is $current_date"
Output:
Example#2:
The `pwd` command shows the path of the current working directory. The following script stores the output of the `pwd` command in the variable $current_dir, and the value of this variable is printed by using the `echo` command.
$ current_dir=`pwd`
$ echo "The current directory is : $current_dir"
Output:
Command with option and argument
Options and arguments are mandatory for some bash commands. The following examples show how you can store the output of a command with options and arguments in a variable.
Example#3:
The bash `wc` command is used to count the total number of lines, words, and characters of any file. This command uses -c, -w, and -l as options and a filename as the argument to generate the output. Create a text file named fruits.txt with the following data to test the next script.
fruits.txt
Mango
Orange
Banana
Grape
Guava
Apple
Run the following commands to count the total number of words in the fruits.txt file, store it in the variable $count_words, and print the value by using the `echo` command.
$ count_words=`wc -w fruits.txt`
$ echo "Total words in fruits.txt is $count_words"
Output:
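Note that `wc -w fruits.txt` prints the file name after the count, so $count_words will contain both. If you want only the number, a small variation (a sketch, not part of the original example) is to read the file from standard input instead:

$ count_words=`wc -w < fruits.txt`
$ echo "Total words in fruits.txt is $count_words"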
Example#4:
`cut` is another bash command that uses options and arguments to generate output. Create a text file named weekday.txt with the seven weekday names to run the next script.
weekday.txt
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday
Create a bash file named cmdsub1.sh with the following script. In this script, a while loop is used to read the weekday.txt file line by line, and the `cut` command reads the first three characters of each line. The extracted string is stored in the variable $day. Next, an if statement is used to check whether the value of $day is ‘Sun’ or not. The output will print ‘Sunday is the holiday‘ when the condition is true; otherwise, it will print the value of $day.
cmdsub1.sh
#!/bin/bash
filename='weekday.txt'
while read line; do
    day=`echo $line | cut -c 1-3`
    if [ $day == "Sun" ]
    then
        echo "Sunday is the holiday"
    else
        echo $day
    fi
done < $filename
Run the script.
$ cat weekday.txt
$ bash cmdsub1.sh
Output:
Using command substitution in loop
You can store the output of command substitution in a loop variable, as shown in the next example.
Example#5:
Create a file named cmdsub2.sh with the following code. Here, the `ls -d */` command is used to retrieve the list of all directories in the current directory. A for loop is used here to read each directory name from the output and store it in the variable $dirname, which is printed later.
cmdsub2.sh
#!/bin/bash
for dirname in $(ls -d */)
do
    echo "$dirname"
done
Run the script.
$ bash cmdsub2.sh
Output:
Using nested commands
The previous example showed how you can combine multiple commands by using a pipe (|). But you can also use nested commands in command substitution, where the inner command runs first and its output is used by the outer command, which works in the opposite direction of a pipe (|).
Nested command syntax:
var=`command1 \`command2\``
Example#6:
Two commands, `echo` and `who`, are used in this example as nested commands. Here, the `who` command executes first and prints the information of the currently logged-in user. Then the output of the `who` command is used by the `echo` command, and the output of `echo` is stored in the variable $var. So, the output of the `echo` command depends on the output of the `who` command.
$ var=`echo \`who\``
$ echo $var
Output:
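Because inner backticks have to be escaped, the $() form is usually easier to read for nesting. An equivalent sketch of the example above:

$ var=$(echo $(who))
$ echo $var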
Using Command path
If you know the path of the command then you can run the command by specifying the command path when using command substitution. The following example shows the use of command path.
Example#7:
The `whoami` command shows the username of the currently logged-in user. By default, this command is stored in the /usr/bin/ folder. Run the following commands to run `whoami` by using its full path, store the result in the variable $output, and print the value of $output.
$ output=$(/usr/bin/whoami)
$ echo $output
Output:
Using Command Line argument
You can use a command line argument as the argument of a command inside command substitution.
Example#8:
Create a bash file named cmdsub3.sh with the following script. The `basename` command is used here to retrieve the file name from the command-line argument $1 and store it in the variable $filename. (The name of the executing script itself is denoted by $0.)
#!/bin/bash
filename=`basename $1`
echo "The name of the file is $filename."
Run the script with the following argument value.
$ bash cmdsub3.sh Desktop/temp/hello.txt
Here, the basename of the path, Desktop/temp/hello.txt is ‘hello.txt’. So, the value of the $filename will be hello.txt.
Output:
Conclusion:
Various uses of command substitution are shown in this tutorial. If you need to work with multiple or dependent commands and store the result temporarily for use in other tasks later, then you can use this feature in your script to get the output.
Mark Shuttleworth is not selling Canonical or Ubuntu — yet
At OpenStack Summit in Berlin, Mark Shuttleworth, founder of Canonical and Ubuntu, said in his keynote the question he gets asked the most is “What does he make of IBM buying Red Hat?” His reply is that IBM had spent too much, but with the growth of the cloud it would probably work out for them.
Actually, the question most of us wanted him to answer is: “After IBM paid a cool $34-billion, would he consider selling Canonical?” After all, Canonical is also a top Linux company, with arguably a much stronger cloud and container presence than Red Hat. By The Cloud Market’s latest count of Amazon Web Services (AWS) instances, Ubuntu dominates with 307,217 instances to Red Hat’s 20,311. Even so, in a show floor conversation, Shuttleworth said, “No, I value my independence.”
That’s not to say he’s not willing to listen to proposals. But he has his own vision for Canonical and Ubuntu Linux. If someone were to make him an offer, which would leave him in charge of both and help him further his plans, then he might go for it. Maybe.
It would have to be a heck of an offer though, even by post-Red Hat acquisition terms. Shuttleworth doesn’t need the money. What he wants is to make his mark in technology history.
Of course, that requires money. But he told me that Canonical has been slowly but surely winning over former Red Hat customers. In his keynote, Shuttleworth said the company has been winning many telecom customers and that five of the top twenty-five banks are now using Ubuntu. Specifically, he mentioned AT&T, CenturyLink, Deutsche Telekom, NTT Docomo, SoftBank, and Walmart as Canonical customers.
Clearly, Canonical isn’t hurting for cash. In any case, Shuttleworth still plans on a Canonical Initial Public Offering (IPO) in 2019.
So, for now Canonical, under Shuttleworth’s firm hand, will continue to go its own way.
Download PHP Linux 7.2.12 / 7.3.0 RC5
PHP is an open source software project, the most popular general-purpose scripting language crafted especially for web development. In theory, PHP is a hypertext preprocessor, but it’s actually a fast, pragmatic and flexible server-side programming language that helps you create powerful websites.
Can be embedded into HTML
While a skilled web developer can easily embed PHP into HTML, it can also be used as a standalone executable. Its syntax draws upon C, Java, and Perl, and it is easy to learn if you have previously interacted with any of the aforementioned programming languages.
Supports XML, IMAP, Java and LDAP
Being designed from the outset to be a universal web programming language, PHP offers support for XML, IMAP, Java, LDAP, several major databases, various Internet protocols, and general data manipulation.
Integrates into a web server
It’s called a server-side programming language because it integrates into a web server, such as Apache or Microsoft IIS. To add support for PHP to a web server, you can install the native web server module or a CGI executable.
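For example, on a Debian- or Ubuntu-based server running Apache, installing the PHP module might look like the following sketch (package names vary by distribution and PHP version):

$ sudo apt install php libapache2-mod-php
$ sudo systemctl restart apache2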
It can access database and FTP servers
PHP is an Internet-aware system that can access database servers, including MySQL, PostgreSQL, SQLite, LDAP and Microsoft SQL Server, as well as FTP (File Transfer Protocol) servers.
It is highly extensible via its powerful APIs
PHP is actively developed in multiple stable and development branches, each one supporting various features and components. It is highly extensible via its powerful APIs (Application Programming Interfaces).
Supported operating systems and platforms
PHP is implemented in the C programming language, which means that it’s a cross-platform software supporting GNU/Linux, BSD, Solaris, Mac OS X or Microsoft Windows operating systems. It runs successfully on both 32-bit and 64-bit hardware platforms. It is freely available for download on any of the aforementioned OSes, distributed under the PHP license.
How to Do Deep Machine Learning Tasks Inside KVM Guests with a Passed-through NVIDIA GPU
This article shows how to run deep machine learning tasks in a SUSE Linux Enterprise Server 15 KVM guest. In a first step, you will learn how to do the train/test tasks using CPU and GPU separately. After that, we can compare the performance differences.
Preparation
But first of all, we need to do some preparation work before building both the Caffe and the TensorFlow frameworks with GPU support.
1- Enable VT-d in the host BIOS and ensure the kernel parameter ‘intel_iommu=on’ is enabled.
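For example, with GRUB2 (the paths shown are the usual defaults on SUSE systems; adjust for your own bootloader setup):

# add intel_iommu=on to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then:
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
$ sudo reboot
$ grep -o intel_iommu=on /proc/cmdline    # verify after the reboot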
2- Pass the nv970GTX on to the SUSE Linux Enterprise Server 15 KVM guest through libvirt.
Note:
* If there are multiple devices in the same IOMMU group, you need to pass all of them on to the guest.
* What is passed through is the 970GTX physical function, not a vGPU instance, because the 970GTX is not vGPU capable.
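A typical way to do this (a sketch; the PCI address 01:00.0 and the guest name sles15-guest are placeholders for your own values) is to locate the card with lspci, detach it from the host, and add it as a hostdev device to the guest definition:

$ lspci -nn | grep -i nvidia                 # find the GPU's PCI address, e.g. 01:00.0
$ virsh nodedev-detach pci_0000_01_00_0      # detach it from the host (repeat for other devices in its IOMMU group)
$ virsh edit sles15-guest                    # add a <hostdev> entry referencing bus 0x01, slot 0x00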
3- Disable the visibility of KVM to the guest by hiding the KVM signature. Otherwise, the newer public NVIDIA drivers and tools refuse to work (Please refer to qemu commit#f522d2a for the details).
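With libvirt this can be done by adding the kvm ‘hidden’ feature to the guest definition, roughly as follows (a sketch of the relevant XML fragment):

$ virsh edit sles15-guest
# inside the <features> element, add:
#   <kvm>
#     <hidden state='on'/>
#   </kvm>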
4- Install the official NVIDIA display driver in the guest:
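For example, with the .run installer downloaded from nvidia.com (the file name depends on the driver version you choose; a compiler and the kernel development packages must be installed first):

$ sudo zypper in gcc make kernel-default-devel
$ sudo sh ./NVIDIA-Linux-x86_64-*.run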
5- Install Cuda 10, cuDNN 7.3.1 and NCCL 2.3.5 in the guest:
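The toolkit and libraries come from NVIDIA's developer site; the installation is roughly as follows (file names are placeholders for whatever versions you download):

$ sudo sh cuda_10.0.*_linux*                          # CUDA 10 toolkit runfile
$ tar xf cudnn-10.0-linux-x64-v7.3.1.*.tgz            # cuDNN 7.3.1
$ sudo cp -r cuda/include/* /usr/local/cuda/include/
$ sudo cp -r cuda/lib64/* /usr/local/cuda/lib64/
$ echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
$ echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc

NCCL 2.3.5 unpacks and installs the same way as cuDNN: copy its headers and libraries into /usr/local/cuda or another directory on the library path.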
Build the Frameworks
Now it’s time to build the TensorFlow framework with GPU support and the Caffe framework.
As the existing whl package of TensorFlow 1.11 doesn’t support Cuda 10 yet, I built TensorFlow 1.12 from the official Git source.
As the next step, build a whl package and install it.
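The usual TensorFlow 1.x source build looks roughly like this (a sketch; it assumes Bazel is installed and that ./configure is answered with CUDA support enabled, pointing at the CUDA and cuDNN paths from above):

$ git clone -b r1.12 https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ ./configure
$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ pip install /tmp/tensorflow_pkg/tensorflow-1.12.*.whl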
Now let’s create a simple example to test the TensorFlow GPU in the guest:
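One minimal way to check this (a sketch, not necessarily the exact example used here) is to run a small matrix multiplication pinned to GPU0 with device placement logging enabled:

$ python3 - <<'EOF'
import tensorflow as tf
# Place a small matmul explicitly on GPU 0 and log which device each op runs on.
with tf.device('/gpu:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
    c = tf.matmul(a, b)
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(c))
EOF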
Through the nvidia-smi command, you can see the process information on GPU0 while the example code is running.
Next, let’s build the Caffe framework from the source, and the Caffe python wrapper.
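A typical build from the BVLC sources looks like this (a sketch; Makefile.config needs the CUDA, cuDNN, and Python paths adjusted first):

$ git clone https://github.com/BVLC/caffe.git && cd caffe
$ cp Makefile.config.example Makefile.config
# enable USE_CUDNN := 1 and point CUDA_DIR at /usr/local/cuda in Makefile.config, then:
$ make all -j$(nproc)
$ make pycaffe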
The setup is done!
Examples
Now let’s try to execute some deep learning tasks.
Example 1.1: This is a Caffe built-in example. Please refer to http://caffe.berkeleyvision.org/gathered/examples/mnist.html to learn more.
Let’s use GPU0 in a guest to train this LeNET model.
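Following the linked MNIST example, the data preparation and training steps look roughly like this (run from the Caffe source tree; the solver file has solver_mode: GPU set):

$ ./data/mnist/get_mnist.sh                 # download the MNIST dataset
$ ./examples/mnist/create_mnist.sh          # convert it into LMDB format
$ ./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt --gpu 0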
During the training process, we should see the loss value trending downward continuously as the iterations progress. But as the output is too long, I will not show it here.
We got four files in the given folder after the training is done. This is because I set up the solver to save the model and the training state every 5,000 iterations. This means we get two files after 5,000 iterations and two more after 10,000 iterations.
Now we have a trained model. Let's test it with the 10,000 test images to see how good the accuracy is.
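The test run uses the caffe test subcommand with the snapshot produced above; 100 iterations with the default test batch size of 100 cover the full 10,000-image test set:

$ ./build/tools/caffe test --model=examples/mnist/lenet_train_test.prototxt \
    --weights=examples/mnist/lenet_iter_10000.caffemodel --gpu 0 --iterations 100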
See? The accuracy is 0.9844. It is an acceptable result.
Example 1.2: Now let’s re-train a LeNET model using CPU instead of GPU – and let’s see what happens.
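Switching to CPU training only requires changing the solver mode and running the same command again (a sketch):

$ sed -i 's/^solver_mode: GPU/solver_mode: CPU/' examples/mnist/lenet_solver.prototxt
$ ./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt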
When we compare the GPU and the CPU, we can see that there are huge performance differences while we train/test LeNET with the MNIST dataset.
We know that the traditional LeNET convolutional neural network (CNN) contains seven layers, not counting the input layer, and that the MNIST database contains 60,000 training images and 10,000 testing images. The deeper the neural network layers become, the bigger the performance difference between training on the CPU and training on the GPU gets.
Example 2.1: This example is a TensorFlow built-in example. Let’s do a very simple mnist classifier using the same mnist dataset.
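A sketch of this classifier, along the lines of the old TensorFlow "MNIST for beginners" tutorial (a single softmax layer, no convolutions; not necessarily the exact code used here), would be:

$ python3 - <<'EOF'
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load MNIST with one-hot labels (downloads the dataset on first run).
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# One fully connected layer with a softmax output: no convolutions at all.
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b
y_ = tf.placeholder(tf.float32, [None, 10])

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    correct = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    print("accuracy:", sess.run(accuracy,
          feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
EOF

With around a thousand training steps, this typically reaches an accuracy close to the 0.92 mentioned above.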
Here we go: As no convolutional layers are involved, the time consumed is quite short. It is only 8.5 seconds. But the accuracy is 0.92, which is not good enough.
If you want, you can check all details through the TensorBoard.
Example 2.2: Now we create a five-layer CNN which is similar to LeNET. Let's re-train it on GPU0 based on the TensorFlow framework.
You can see now that the accuracy is 0.99 – it got much better, and the time consumed is only 2m 16s.
Example 2.3: Finally, let’s redo example 2.2 with CPU instead of GPU0, to check the performance differences.
With 0.99, the accuracy is really good now. But the time consumed is 19m 53s, which is way longer than the time consumed in example 2.2.
Summary
Finally, let’s summarize our test results:
- The training/testing performance differences between CPU and GPU are huge. They can reach hundreds of times if the network model is complex.
- SUSE Linux Enterprise Server 15 is a highly reliable platform for whatever machine learning tasks you want to run on it, whether for research or production purposes.
AI in the Real World
Hilary Mason, general manager for machine learning at Cloudera, discussed AI in the real world in her keynote at the recent Open FinTech Forum.
We are living in the future – it is just unevenly distributed with “an outstanding amount of hype and this anthropomorphization of what [AI] technology can actually provide for us,” observed Hilary Mason, general manager for machine learning at Cloudera, who led a keynote on “AI in the Real World: Today and Tomorrow,” at the recent Open FinTech Forum.
AI has existed as an academic field of research since the mid-1950s, and if the forum had been held 10 years ago, we would have been talking about big data, she said. But, today, we have machine learning and feedback loops that allow systems to continue to improve with the introduction of more data.
Machine learning provides a set of techniques that fall under the broad umbrella of data science. AI has returned, from a terminology perspective, Mason said, because of the rise of deep learning, a subset of machine learning techniques based around neural networks that has provided not just more efficient capabilities but the ability to do things we couldn’t do at all five years ago.
Imagine the future
All of this “creates a technical foundation on which we can start to imagine the future,’’ she said. Her favorite machine learning application is Google Maps. Google is getting real-time data from people’s smartphones, then it is integrating that data with public data sets, so the app can make predictions based on historical data, she noted.
Getting this right, however, is really hard. Mason shared an anecdote about how her name is a “machine learning-edge case.” She shares her name with a British actress who passed away around 2005 after a very successful career.
Late in her career, the actress played the role of an ugly witch, and a search engine from 2009 combined photos with text results. At the time, Mason was working as a professor, and her bio was paired with the actress’s picture in that role. “Here she is, the ugly hag… and the implication here is obvious,” Mason said. “This named entity disambiguation problem is still a problem for us in machine learning in every domain.”
This example illustrates that “this technology has a tremendous amount of potential to make our lives more efficient, to build new products. But it also has limitations, and when we have conferences like this, we tend to talk about the potential, but not about the limitations, and not about where things tend to go a bit wrong.”
Machine learning in FinTech
Large companies operating complex businesses have a huge amount of human and technical expertise on where the ROI in machine learning would be, she said. That’s because they also have huge amounts of data, generally created as a result of operating those businesses for some time. Mason’s rule of thumb when she works with companies is to find some clear ROI on a cost savings or process improvement using machine learning.
“Lots of people, in FinTech especially, want to start in security, anti-money laundering, and fraud detection. These are really fruitful areas because a small percentage improvement is very high impact.”
Other areas where machine learning can be useful are understanding your customers, churn analysis, and marketing techniques, all of which are pretty easy to get started in, she said.
“But if you only think about the ROI in the terms of cost reduction, you put a boundary on the amount of potential your use of AI will have. Think also about new revenue opportunities, new growth opportunities that can come out of the same technologies. That’s where the real potential is.”
Getting started
The first thing to do, she said, is to “drink coffee, have ideas.” Mason said she visits lots of companies, and when she sees their list of projects, they’re always good ideas. “I get very worried, because you are missing out on a huge amount of opportunity that would likely look like bad ideas on the surface.”
It’s important to “validate against robust criteria” and create a broad sweep of ideas. Then, go through and validate capabilities. Some of the questions to ask include: is there research activity relevant to what you’re doing? Is there work in one domain you can transfer to another domain? Has somebody done something in another industry that you can use or in an academic context that you can use?
Organizations also need to figure out whether systems are becoming commoditized in open source; meaning “you have a robust software and infrastructure you can build on without having to own and create it yourself.” Then, the organization must figure out if data is available — either within the company or available to purchase.
Then it’s time to “progressively explore the risky capabilities. That means have a phased investment plan,’’ Mason explained. In machine learning, this is done in three phases, starting with validation and exploration: Does the data exist? Can you build a very simple model in a week?
“At each [phase], you have a cost gate to make sure you’re not investing in things that aren’t ready and to make sure that your people are happy, making progress, and not going down little rabbit holes that are technically interesting, but ultimately not tied to the application.”
That said, Mason noted that predicting the future is, of course, very hard, so people write reports on different technologies that are designed to be six months to two years ahead of what they would put in production.
Looking ahead
As progress is made in the development of AI, machine learning and deep learning, there are still things we need to keep in mind, Mason said. “One of the biggest topics in our field right now is how we incorporate ethics, how we comply with expectations of privacy in the practice of data science.”
She gave a plug to a short, free ebook called “Data Driven: Creating a Data Culture,” that she co-authored with DJ Patil, who worked as chief data scientist for President Barack Obama. Their goal, she said, is “to try and get folks who are practicing out in the world of machine learning and data science to think about their tools [and] for them to practice ethics in the context of their work.”
Mason ended her presentation on an optimistic note, observing that “AI will find its way into many fundamental processes of the businesses that we all run. So when I say, ‘Let’s make it boring,’ I actually think that’s what makes it more exciting.’”