{"id":1345,"date":"2019-02-18T12:38:08","date_gmt":"2019-02-18T12:38:08","guid":{"rendered":"https:\/\/www.appservgrid.com\/paw93\/?p=1345"},"modified":"2019-03-07T20:05:39","modified_gmt":"2019-03-07T20:05:39","slug":"docker-monitoring-continued-prometheus-and-sysdig","status":"publish","type":"post","link":"https:\/\/www.appservgrid.com\/paw93\/index.php\/2019\/02\/18\/docker-monitoring-continued-prometheus-and-sysdig\/","title":{"rendered":"Docker Monitoring Continued: Prometheus and Sysdig"},"content":{"rendered":"<p>I <a href=\"http:\/\/rancher.com\/comparing-monitoring-options-for-docker-deployments\/\">recently compared\u00a0<\/a>several docker monitoring tools and services. Since the article went live we have gotten feedback about additional tools that should be included in our survey. I would like to highlight two such tools:<br \/>\nPrometheus and Sysdig Cloud. Prometheus is a capable self-hosted<br \/>\nsolution which is easier to manage than Sensu. Sysdig Cloud, on the other<br \/>\nhand, provides us with another hosted service much like Scout and<br \/>\nDatadog. Collectively they add more choices to their respective<br \/>\ncategories. As before, I will be using the following six criteria to<br \/>\nevaluate Prometheus and Sysdig Cloud: 1) ease of deployment, 2) level of<br \/>\ndetail of information presented, 3) level of aggregation of information<br \/>\nfrom the entire deployment, 4) ability to raise alerts from the data, 5)<br \/>\nability to monitor non-docker resources and 6) cost.<\/p>\n<h2>Prometheus<\/h2>\n<p>First let\u2019s take a look at Prometheus; it is a self-hosted set of tools<br \/>\nwhich collectively provide metrics storage, aggregation, visualization<br \/>\nand alerting. Most of the tools and services we have looked at so far<br \/>\nhave been push based, i.e. 
agents on the monitored servers talk to a<br \/>\ncentral server (or set of servers) and send out their metrics.<br \/>\nPrometheus, on the other hand, is a pull-based server which expects<br \/>\nmonitored servers to provide a web interface from which it can scrape<br \/>\ndata. There are several <a href=\"http:\/\/prometheus.io\/docs\/instrumenting\/exporters\/\">exporters<br \/>\navailable<\/a> for<br \/>\nPrometheus which will capture metrics and then expose them over http for<br \/>\nPrometheus to scrape. In addition there are<br \/>\n<a href=\"http:\/\/prometheus.io\/docs\/instrumenting\/clientlibs\/\">libraries<\/a> which<br \/>\ncan be used to create custom exporters. As we are concerned with<br \/>\nmonitoring docker containers we will use the<br \/>\n<a href=\"https:\/\/github.com\/docker-infra\/container_exporter\">container_exporter<\/a> to<br \/>\ncapture metrics. Use the command shown below to bring up the<br \/>\ncontainer-exporter docker container and browse to<br \/>\n<em><a href=\"http:\/\/MONITORED_SERVER_IP:9104\/metrics\">http:\/\/MONITORED_SERVER_IP:9104\/metrics<\/a><\/em> to see the metrics it has<br \/>\ncollected for you. You should launch exporters on all servers in your<br \/>\ndeployment. Keep track of the respective <em>MONITORED_SERVER_IP<\/em>s as we<br \/>\nwill be using them later in the configuration for Prometheus.<\/p>\n<p>docker run -p 9104:9104 -v \/sys\/fs\/cgroup:\/cgroup -v \/var\/run\/docker.sock:\/var\/run\/docker.sock prom\/container-exporter<\/p>\n<p>Once we have all our exporters running we can launch the Prometheus<br \/>\nserver. 
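<p>For reference, an exporter only needs to serve plain text in Prometheus\u2019s exposition format over HTTP. The sketch below is illustrative only (the metric sample and container labels are made up, and this is not part of the container_exporter); it shows the text format Prometheus scrapes, using nothing but the Python standard library:<\/p>

```python
# Toy renderer for the Prometheus text exposition format.
# Metric name, labels and values below are illustrative assumptions.

def render_metrics(name, help_text, samples):
    """Render gauge samples as Prometheus exposition text.

    samples: list of (labels_dict, value) pairs.
    """
    lines = ["# HELP %s %s" % (name, help_text),
             "# TYPE %s gauge" % name]
    for labels, value in samples:
        label_str = ",".join('%s="%s"' % (k, v) for k, v in sorted(labels.items()))
        lines.append("%s{%s} %s" % (name, label_str, value))
    return "\n".join(lines) + "\n"

text = render_metrics(
    "container_memory_usage_bytes",
    "Memory usage of the container in bytes.",
    [({"image": "ubuntu:14.04", "name": "web-1"}, 104857600)],
)
print(text)
```

<p>Serving this string from any HTTP endpoint (for example with Python\u2019s http.server module) is enough for Prometheus to scrape it.<\/p>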
However, before we do, we need to create a configuration file for<br \/>\nPrometheus that tells the server where to scrape the metrics from.<br \/>\nCreate a file called <em>prometheus.conf<\/em> and then add the following text<br \/>\ninside it.<\/p>\n<p>global:<br \/>\nscrape_interval: 15s<br \/>\nevaluation_interval: 15s<br \/>\nlabels:<br \/>\nmonitor: exporter-metrics<\/p>\n<p>rule_files:<\/p>\n<p>scrape_configs:<br \/>\n- job_name: prometheus<br \/>\nscrape_interval: 5s<\/p>\n<p>target_groups:<br \/>\n# These endpoints are scraped via HTTP.<br \/>\n- targets: ['localhost:9090','MONITORED_SERVER_IP:9104']<\/p>\n<p>In this file there are two sections, global and job(s). In the global<br \/>\nsection we set defaults for configuration properties such as the data<br \/>\ncollection interval (scrape_interval). We can also add labels which<br \/>\nwill be appended to all metrics. In the jobs section we can define one<br \/>\nor more jobs that each have a name, an optional override scraping<br \/>\ninterval as well as one or more targets from which to scrape metrics. We<br \/>\nare adding two targets: one is the Prometheus server itself and the<br \/>\nsecond is the container-exporter we set up earlier. If you set up more<br \/>\nthan one exporter you can add additional targets to pull metrics from<br \/>\nall of them. Note that the job name is available as a label on the<br \/>\nmetric, hence you may want to set up separate jobs for your various types<br \/>\nof servers. 
Now that we have a configuration file we can start a<br \/>\nPrometheus server using the<br \/>\n<a href=\"https:\/\/registry.hub.docker.com\/u\/prom\/prometheus\/\">prom\/prometheus<\/a><br \/>\ndocker image.<\/p>\n<p>docker run -d --name prometheus-server -p 9090:9090 -v $PWD\/prometheus.conf:\/prometheus.conf prom\/prometheus -config.file=\/prometheus.conf<\/p>\n<p>After launching the container, the Prometheus server should be available in<br \/>\nyour browser on port 9090 in a few moments. Select <em>Graph<\/em> from the<br \/>\ntop menu and select a metric from the drop down box to view its latest<br \/>\nvalue. You can also write queries in the expression box which can find<br \/>\nmatching metrics. Queries take the form<br \/>\nMETRIC_NAME{LABEL_NAME=\"LABEL_VALUE\"}. You can find more<br \/>\ndetails of the query syntax <a href=\"http:\/\/prometheus.io\/docs\/querying\/\">here<\/a>.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/i.imgur.com\/k0n8b9k.png\" alt=\"\" \/><\/p>\n<p>We are able to drill down into the data using queries to filter out data<br \/>\nfrom specific server types (jobs) and containers. All metrics from<br \/>\ncontainers are labeled with the image name, container name and the host<br \/>\non which the container is running. Since metric names do not encompass<br \/>\ncontainer or server name we are able to easily aggregate data across<br \/>\nour deployment. For example we can filter for<br \/>\ncontainer_memory_usage_bytes{image=\"ubuntu:14.04\"} to get<br \/>\ninformation about the memory usage of all ubuntu containers in our<br \/>\ndeployment. Using the built-in functions we can also aggregate the<br \/>\nresulting set of metrics. For example<br \/>\navg_over_time(container_memory_usage_bytes{image=\"ubuntu:14.04\"}[5m])<br \/>\nwill show the memory used by ubuntu<br \/>\ncontainers, averaged over the last five minutes. 
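<p>To make the aggregation semantics concrete: avg_over_time averages each matching series\u2019 samples that fall inside the time window. The toy Python model below (made-up sample data, not Prometheus code) mimics that behaviour for a single series:<\/p>

```python
# Toy model of a PromQL "avg over time" window; the sample data is made up.

def avg_over_time(samples, now, window_seconds):
    """samples: list of (unix_timestamp, value) pairs for one series."""
    in_window = [value for ts, value in samples
                 if now - window_seconds <= ts <= now]
    return sum(in_window) / len(in_window) if in_window else None

now = 1000000
memory_samples = [
    (now - 360, 9.0e8),  # older than 5 minutes, ignored
    (now - 240, 1.0e9),
    (now - 120, 1.2e9),
    (now, 1.1e9),
]
average = avg_over_time(memory_samples, now, 300)
print(average)  # 1100000000.0
```

<p>Prometheus does the same computation per matching series, so the query returns one averaged value for every ubuntu container.<\/p>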
Once you are happy<br \/>\nwith a query you can click over to the Graph tab and see the variation<br \/>\nof the metric over time.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/i.imgur.com\/DgrLYLl.png\" alt=\"\" \/><\/p>\n<p>Temporary graphs are great for ad-hoc investigations but you also need<br \/>\nto have persistent graphs for dashboards. For this you can use the<br \/>\n<a href=\"https:\/\/registry.hub.docker.com\/u\/prom\/promdash\/\">Prometheus Dashboard<br \/>\nBuilder<\/a>. To launch<br \/>\nthe Prometheus Dashboard Builder you need access to a MySQL database which<br \/>\nyou can create using the official MySQL <a href=\"https:\/\/registry.hub.docker.com\/_\/mysql\/\">Docker<br \/>\nimage<\/a>. The command to launch<br \/>\nthe MySQL container is shown below; note that you may select any value<br \/>\nfor the database name, user name, user password and root password, however<br \/>\nkeep track of these values as they will be needed later.<\/p>\n<p>docker run -p 3306:3306 --name promdash-mysql<br \/>\n-e MYSQL_DATABASE=&lt;database-name&gt;<br \/>\n-e MYSQL_USER=&lt;database-user&gt;<br \/>\n-e MYSQL_PASSWORD=&lt;user-password&gt;<br \/>\n-e MYSQL_ROOT_PASSWORD=&lt;root-password&gt;<br \/>\n-d mysql<\/p>\n<p>Once you have the database set up, use the rake task inside the<br \/>\npromdash container to initialize the database. You can then run the<br \/>\nDashboard Builder by running the same container. 
The commands to<br \/>\ninitialize the database and bring up the Prometheus Dashboard Builder<br \/>\nare shown below.<\/p>\n<p># Initialize Database<br \/>\ndocker run --rm -it --link promdash-mysql:db<br \/>\n-e DATABASE_URL=mysql2:\/\/&lt;database-user&gt;:&lt;user-password&gt;@db:3306\/&lt;database-name&gt; prom\/promdash .\/bin\/rake db:migrate<\/p>\n<p># Run Dashboard<br \/>\ndocker run -d --link promdash-mysql:db -p 3000:3000 --name prometheus-dash<br \/>\n-e DATABASE_URL=mysql2:\/\/&lt;database-user&gt;:&lt;user-password&gt;@db:3306\/&lt;database-name&gt; prom\/promdash<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/i.imgur.com\/JCwRhwx.png\" alt=\"\" \/><\/p>\n<p>Once your container is running you can browse to port 3000 and load up<br \/>\nthe dashboard builder UI. In the UI you need to click <em>Servers<\/em> in the<br \/>\ntop menu and <em>New Server<\/em> to add your Prometheus server as a datasource<br \/>\nfor the dashboard builder. Add <em><a href=\"http:\/\/PROMETHEUS_SERVER_IP:9090\">http:\/\/PROMETHEUS_SERVER_IP:9090<\/a><\/em> to<br \/>\nthe list of servers and hit <em>Create Server<\/em>.<\/p>\n<p>Now click <em>Dashboards<\/em> in the top menu; here you can create<br \/>\n<em>Directories<\/em> (groups of dashboards) and <em>Dashboards<\/em>. For example we<br \/>\ncreated a directory for Web Nodes and one for Database Nodes, and in each<br \/>\nwe created a dashboard as shown below.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/i.imgur.com\/ntOQORp.png\" alt=\"\" \/><\/p>\n<p>Once you have created a dashboard you can add metrics by mousing over<br \/>\nthe title bar of a graph and selecting the data sources icon (three<br \/>\nhorizontal lines followed by a plus sign). You can then<br \/>\nselect the server which you added earlier, and a query expression which<br \/>\nyou tested in the Prometheus Server UI. 
You can add multiple data<br \/>\nsources to the same graph in order to see a comparative view.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/i.imgur.com\/BQM3rkG.png\" alt=\"\" \/><\/p>\n<p>You can add multiple graphs (each with possibly multiple data sources)<br \/>\nby clicking the Add Graph button. In addition you may select the<br \/>\ntime range over which your dashboard displays data as well as a refresh<br \/>\ninterval for auto-loading data. The dashboard is not as polished as the<br \/>\nones from Scout and DataDog; for example there is no easy way to explore<br \/>\nmetrics or build a query in the dashboard view. Since the dashboard runs<br \/>\nindependently of the Prometheus server we can\u2019t \u2018pin\u2019 graphs<br \/>\ngenerated in the Prometheus server into a dashboard. Furthermore, several<br \/>\ntimes we noticed that the UI would not update based on selected data<br \/>\nuntil we refreshed the page. However, despite its issues the dashboard<br \/>\nis feature-competitive with DataDog and, because Prometheus is under<br \/>\nheavy development, we expect the bugs to be resolved over time. In<br \/>\ncomparison to other self-hosted solutions Prometheus is a lot more user<br \/>\nfriendly than Sensu and allows you to present metric data as graphs without<br \/>\nusing third-party visualizations. It is also able to provide much better<br \/>\nanalytical capabilities than CAdvisor.<\/p>\n<p>Prometheus also has the ability to apply alerting rules over the input<br \/>\ndata and display them in the UI. However, to be able to do something<br \/>\nuseful with alerts, such as sending emails or notifying<br \/>\n<a href=\"http:\/\/www.pagerduty.com\/\">PagerDuty<\/a>, we need to run the <a href=\"https:\/\/registry.hub.docker.com\/u\/prom\/alertmanager\/\">Alert<br \/>\nManager<\/a>. To run<br \/>\nthe Alert Manager you first need to create a configuration file. 
Create<br \/>\na file called <em>alertmanager.conf<\/em> and add the following text into it:<\/p>\n<p>notification_config {<br \/>\nname: \"ubuntu_notification\"<br \/>\npagerduty_config {<br \/>\nservice_key: \"&lt;PAGER_DUTY_API_KEY&gt;\"<br \/>\n}<br \/>\nemail_config {<br \/>\nemail: \"&lt;TARGET_EMAIL_ADDRESS&gt;\"<br \/>\n}<br \/>\nhipchat_config {<br \/>\nauth_token: \"&lt;HIPCHAT_AUTH_TOKEN&gt;\"<br \/>\nroom_id: 123456<br \/>\n}<br \/>\n}<br \/>\naggregation_rule {<br \/>\nfilter {<br \/>\nname_re: \"image\"<br \/>\nvalue_re: \"ubuntu:14.04\"<br \/>\n}<br \/>\nrepeat_rate_seconds: 300<br \/>\nnotification_config_name: \"ubuntu_notification\"<br \/>\n}<\/p>\n<p>In this configuration we are creating a notification configuration<br \/>\ncalled <em>ubuntu_notification<\/em>, which specifies that alerts must go to<br \/>\nPagerDuty, email and HipChat. We need to specify the relevant API<br \/>\nkeys and\/or access tokens for the HipChat and PagerDuty notifications to<br \/>\nwork. We are also specifying that the alert configuration should only<br \/>\napply to alerts on metrics where the label image has the value<br \/>\nubuntu:14.04. We specify that a triggered alert should not retrigger<br \/>\nfor at least 300 seconds after the first alert is raised. We can bring<br \/>\nup the Alert Manager using the docker image by volume-mounting our<br \/>\nconfiguration file into the container using the command shown below.<\/p>\n<p>docker run -d -p 9093:9093 -v $PWD:\/alertmanager prom\/alertmanager -logtostderr -config.file=\/alertmanager\/alertmanager.conf<\/p>\n<p>Once the container is running you should be able to point your browser<br \/>\nto port 9093 and load up the Alert Manager UI. You will be able to see<br \/>\nall the alerts raised here; you can \u2018silence\u2019 them or delete them once<br \/>\nthe issue is resolved. 
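<p>The repeat_rate_seconds setting above means a matching alert is re-notified at most once every 300 seconds while it stays firing. A toy sketch of that throttling logic (the class and method names are hypothetical, not Alert Manager internals):<\/p>

```python
# Toy illustration of repeat_rate_seconds-style notification throttling.
# Names here are hypothetical; this is not Alert Manager code.

class AlertThrottle:
    def __init__(self, repeat_rate_seconds):
        self.repeat_rate_seconds = repeat_rate_seconds
        self.last_sent = {}  # alert name -> timestamp of last notification

    def should_notify(self, alert_name, now):
        last = self.last_sent.get(alert_name)
        if last is not None and now - last < self.repeat_rate_seconds:
            return False  # within the repeat window: suppress the repeat
        self.last_sent[alert_name] = now
        return True

throttle = AlertThrottle(repeat_rate_seconds=300)
print(throttle.should_notify("HighMemoryAlert", now=0))    # True
print(throttle.should_notify("HighMemoryAlert", now=120))  # False
print(throttle.should_notify("HighMemoryAlert", now=301))  # True
```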
In addition to setting up the Alert Manager we<br \/>\nalso need to create a few alerts. Add rule_file:<br \/>\n\"\/prometheus.rules\" on a new line in the global section of the<br \/>\n<em>prometheus.conf<\/em> file you created earlier. This line tells Prometheus<br \/>\nto look for alerting rules in the <em>prometheus.rules<\/em> file. We now need<br \/>\nto create the rules file and load it into our server container. To do so,<br \/>\ncreate a file called <em>prometheus.rules<\/em> in the same directory where you<br \/>\ncreated <em>prometheus.conf<\/em> and add the following text to it:<\/p>\n<p>ALERT HighMemoryAlert<br \/>\nIF container_memory_usage_bytes{image=\"ubuntu:14.04\"} &gt; 1000000000<br \/>\nFOR 1m<br \/>\nWITH {}<br \/>\nSUMMARY \"High Memory usage for Ubuntu container\"<br \/>\nDESCRIPTION \"High Memory usage for Ubuntu container on {{$labels.instance}} for container {{$labels.name}} (current value: {{$value}})\"<\/p>\n<p>In this configuration we are telling Prometheus to raise an alert called<br \/>\nHighMemoryAlert if the container_memory_usage_bytes metric<br \/>\nfor containers using the ubuntu:14.04 image goes above 1 GB for 1<br \/>\nminute. The summary and the description of the alerts are also specified<br \/>\nin the rules file. Both of these fields can contain placeholders for<br \/>\nlabel values which are replaced by Prometheus. For example our<br \/>\ndescription will specify the server instance (IP) and the container name<br \/>\nfor the metric raising the alert. After launching the Alert Manager and<br \/>\ndefining your alert rules, you will need to re-run your Prometheus<br \/>\nserver with new parameters. 
The commands to do so are below:<\/p>\n<p># stop and remove current container<br \/>\ndocker stop prometheus-server &amp;&amp; docker rm prometheus-server<\/p>\n<p># start new container<br \/>\ndocker run -d --name prometheus-server -p 9090:9090<br \/>\n-v $PWD\/prometheus.conf:\/prometheus.conf<br \/>\n-v $PWD\/prometheus.rules:\/prometheus.rules<br \/>\nprom\/prometheus<br \/>\n-config.file=\/prometheus.conf<br \/>\n-alertmanager.url=http:\/\/ALERT_MANAGER_IP:9093<\/p>\n<p>Once the Prometheus server is up again you can click Alerts in the top<br \/>\nmenu of the Prometheus Server UI to bring up a list of alerts and their<br \/>\nstatuses. If and when an alert is fired you will also be able to see it<br \/>\nin the Alert Manager UI and any external service defined in the<br \/>\n<em>alertmanager.conf<\/em> file.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/i.imgur.com\/uC483G5.png\" alt=\"\" \/><\/p>\n<p>Collectively the Prometheus tool-set\u2019s feature set is on par with<br \/>\nDataDog, which has been our best-rated monitoring tool so far. Prometheus<br \/>\nuses a very simple format for input data and can ingest from any web<br \/>\nendpoint which presents the data. Therefore we can monitor more or less<br \/>\nany resource with Prometheus, and there are already several libraries<br \/>\navailable to monitor common resources. Where Prometheus is lacking is in its<br \/>\nlevel of polish and ease of deployment. The fact that all components are<br \/>\ndockerized is a major plus; however, we had to launch four different<br \/>\ncontainers, each with its own configuration file, to support the<br \/>\nPrometheus server. The project is also lacking detailed, comprehensive<br \/>\ndocumentation for these various components. However, in comparison to<br \/>\nself-hosted services such as CAdvisor and Sensu, Prometheus is a much<br \/>\nbetter toolset. 
It is significantly easier to set up than Sensu and has the<br \/>\nability to provide visualization of metrics without third-party tools.<br \/>\nIt provides much more detailed metrics than CAdvisor and is also able<br \/>\nto monitor non-docker resources. The choice of using pull-based metric<br \/>\naggregation rather than push is less than ideal, as you would have to<br \/>\nrestart your server when adding new data sources. This could get<br \/>\ncumbersome in a dynamic environment such as cloud based deployments.<br \/>\nPrometheus does offer the <a href=\"https:\/\/github.com\/prometheus\/pushgateway\">Push<br \/>\nGateway<\/a> to bridge the<br \/>\ndisconnect. However, running yet another service will add to the<br \/>\ncomplexity of the setup. For these reasons I still think DataDog is<br \/>\nprobably easier for most users; however, with some polish and better<br \/>\npackaging Prometheus could be a very compelling alternative, and out of the<br \/>\nself-hosted solutions Prometheus is my pick.<\/p>\n<p>Score Card:<\/p>\n<ol>\n<li>Ease of deployment: **<\/li>\n<li>Level of detail: *****<\/li>\n<li>Level of aggregation: *****<\/li>\n<li>Ability to raise alerts: ****<\/li>\n<li>Ability to monitor non-docker resources: Supported<\/li>\n<li>Cost: Free<\/li>\n<\/ol>\n<h2>Sysdig Cloud<\/h2>\n<p>Sysdig Cloud is a hosted service that provides metrics storage,<br \/>\naggregation, visualization and alerting. To get started with Sysdig, sign<br \/>\nup for a trial account at <a href=\"https:\/\/app.sysdigcloud.com\">https:\/\/app.sysdigcloud.com<\/a> and complete<br \/>\nthe registration form. Once you have registered and logged<br \/>\nin to the account, you will be asked to <em>Setup your Environment<\/em> and be<br \/>\ngiven a curl command similar to the one shown below. Your command will have<br \/>\nyour own secret key after the -s switch. You can run this command on each<br \/>\nhost running docker which you need to monitor. 
Note that you should<br \/>\nreplace the [TAGS] placeholder with tags to group your metrics. The<br \/>\ntags are in the format TAG_NAME:VALUE, so you may want to add a tag<br \/>\nrole:web or deployment:production. You may also use the containerized<br \/>\nsysdig agent.<\/p>\n<p># Host install of sysdig agent<br \/>\ncurl -s https:\/\/s3.amazonaws.com\/download.draios.com\/stable\/install-agent | sudo bash -s 12345678-1234-1234-1234-123456789abc [TAGS]<\/p>\n<p># Docker based sysdig agent<br \/>\ndocker run --name sysdig-agent --privileged --net host<br \/>\n-e ACCESS_KEY=12345678-1234-1234-1234-123456789abc<br \/>\n-e TAGS=os:rancher<br \/>\n-v \/var\/run\/docker.sock:\/host\/var\/run\/docker.sock<br \/>\n-v \/dev:\/host\/dev -v \/proc:\/host\/proc:ro<br \/>\n-v \/boot:\/host\/boot:ro<br \/>\n-v \/lib\/modules:\/host\/lib\/modules:ro<br \/>\n-v \/usr:\/host\/usr:ro sysdig\/agent<\/p>\n<p>Even if you use docker you will still need to install kernel headers in<br \/>\nthe host OS. This goes against Docker\u2019s philosophy of isolated micro<br \/>\nservices. However, installing kernel headers is fairly benign.<br \/>\nInstalling the headers and getting sysdig running is trivial if you are<br \/>\nusing a mainstream distribution such as CentOS, Ubuntu or Debian. Even<br \/>\nAmazon\u2019s custom kernels are supported; however, RancherOS\u2019s custom<br \/>\nkernel presented problems for sysdig, as did the TinyCore kernel. So be<br \/>\nwarned: if you would like to use Sysdig Cloud on non-mainstream kernels,<br \/>\nyou may have to get your hands dirty with some system hacking.<\/p>\n<p>After you run the agent you should see the host in the Sysdig Cloud<br \/>\nconsole in the Explore tab. Once you launch docker containers on the<br \/>\nhost, those will also be shown. You can see basic stats about CPU<br \/>\nusage, memory consumption and network usage. 
The metrics are aggregated for<br \/>\nthe host as well as broken down per container.<\/p>\n<p><a href=\"http:\/\/cdn.rancher.com\/wp-content\/uploads\/2015\/04\/20181622\/Screen-Shot-2015-04-14-at-12.06.36-PM.png\"><img decoding=\"async\" src=\"http:\/\/cdn.rancher.com\/wp-content\/uploads\/2015\/04\/20181622\/Screen-Shot-2015-04-14-at-12.06.36-PM.png\" alt=\"Screen Shot 2015-04-14 at 12.06.36\nPM\" \/><\/a>By<br \/>\nselecting one of the hosts or containers you can get a whole host of<br \/>\nother metrics, including everything provided by the docker stats API. Out<br \/>\nof all the systems we have seen so far, Sysdig certainly has the most<br \/>\ncomprehensive set of metrics out of the box. You can also select from<br \/>\nseveral pre-configured dashboards which present a graphical or tabular<br \/>\nrepresentation of your deployment.<\/p>\n<p><a href=\"http:\/\/cdn.rancher.com\/wp-content\/uploads\/2015\/04\/20181622\/Screen-Shot-2015-04-16-at-11.26.53-AM.png\"><img decoding=\"async\" src=\"http:\/\/cdn.rancher.com\/wp-content\/uploads\/2015\/04\/20181622\/Screen-Shot-2015-04-16-at-11.26.53-AM.png\" alt=\"Screen Shot 2015-04-16 at 11.26.53\nAM\" \/><\/a><\/p>\n<p>You can see live metrics by selecting Real-time Mode (target icon)<br \/>\nor select a window of time over which to average values. Furthermore,<br \/>\nyou can also set up comparisons which will highlight the delta of current<br \/>\nvalues and values at a point in the past. For example the table below<br \/>\nshows values compared with those from ten minutes ago. If the CPU usage<br \/>\nis significantly higher than 10 minutes ago, you may be experiencing load<br \/>\nspikes and need to scale out. 
The UI is on par with, if not better than,<br \/>\nDataDog for identifying and exploring trends in the data.<a href=\"http:\/\/cdn.rancher.com\/wp-content\/uploads\/2015\/04\/20181622\/Screen-Shot-2015-04-19-at-4.59.09-PM.png\"><img decoding=\"async\" src=\"http:\/\/cdn.rancher.com\/wp-content\/uploads\/2015\/04\/20181622\/Screen-Shot-2015-04-19-at-4.59.09-PM.png\" alt=\"Screen Shot\n2015-04-19 at 4.59.09\nPM\" \/><\/a><\/p>\n<p>In addition to exploring data on an ad-hoc basis you can also create<br \/>\npersistent dashboards. Simply click the pin icon on any graph in the<br \/>\nexplore view and save it to a named dashboard. You can view all the<br \/>\ndashboards and their associated graphs by clicking the Dashboards<br \/>\ntab. You can also select the bell icon on any graph and create an<br \/>\nalert from the data. Sysdig Cloud supports detailed alerting<br \/>\ncriteria and is again one of the best we have seen. The example below<br \/>\nshows an alert which triggers if the count of containers labeled <em>web<\/em><br \/>\nfalls below three on average for the last ten minutes. We are also<br \/>\nsegmenting the data by the <em>region<\/em> tag, so there will be a separate<br \/>\ncheck for web nodes in North America and Europe. Lastly, we also specify<br \/>\na name, description and severity for the alerts. You can control where<br \/>\nalerts go by going to Settings (gear icon) &gt; Notifications and adding<br \/>\nemail addresses or SNS topics to send alerts to. 
Note that all alerts go to<br \/>\nall notification endpoints, which may be problematic if you want to wake<br \/>\nup different people for different alerts.<a href=\"http:\/\/cdn.rancher.com\/wp-content\/uploads\/2015\/04\/20181622\/Screen-Shot-2015-04-19-at-4.55.35-PM.png\"><img decoding=\"async\" src=\"http:\/\/cdn.rancher.com\/wp-content\/uploads\/2015\/04\/20181622\/Screen-Shot-2015-04-19-at-4.55.35-PM.png\" alt=\"Screen Shot 2015-04-19 at\n4.55.35\nPM\" \/><\/a><\/p>\n<p>I am very impressed with Sysdig Cloud as it was trivially easy to set up and<br \/>\nprovides detailed metrics with great visualization tools for real-time<br \/>\nand historical data. The requirement to install kernel headers on the<br \/>\nhost OS is troublesome, though, and the lack of documentation and support for<br \/>\nnon-standard kernels could be problematic in some scenarios. The<br \/>\nalerting system in Sysdig Cloud is among the best we have seen so<br \/>\nfar; however, the inability to target different email addresses for<br \/>\ndifferent alerts is problematic. In a larger team, for example, you would<br \/>\nwant to alert a different team for database issues vs web server issues.<br \/>\nLastly, since it is in beta, the pricing for Sysdig Cloud is not easily<br \/>\navailable. I have reached out to their sales team and will update this<br \/>\narticle if and when they get back to me. 
If Sysdig is price-competitive,<br \/>\nthen Datadog has serious competition in the hosted service category.<\/p>\n<p>Score Card:<\/p>\n<ol>\n<li>Ease of deployment: ***<\/li>\n<li>Level of detail: *****<\/li>\n<li>Level of aggregation: *****<\/li>\n<li>Ability to raise alerts: ****<\/li>\n<li>Ability to monitor non-docker resources: Supported<\/li>\n<li>Cost: Must Contact Support<\/li>\n<\/ol>\n<p><a href=\"https:\/\/rancher.com\/docker-monitoring-continued-prometheus-and-sysdig\/\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently compared\u00a0several docker monitoring tools and services. Since the article went live we have gotten feedback about additional tools that should be included in our survey. I would like to highlight two such tools; Prometheus and Sysdig cloud. Prometheus is a capable self-hosted solution which is easier to manage than sensu. Sysdig cloud on &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.appservgrid.com\/paw93\/index.php\/2019\/02\/18\/docker-monitoring-continued-prometheus-and-sysdig\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Docker Monitoring Continued: Prometheus and 
Sysdig&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-1345","post","type-post","status-publish","format-standard","hentry","category-kubernetes"],"_links":{"self":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/1345","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/comments?post=1345"}],"version-history":[{"count":1,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/1345\/revisions"}],"predecessor-version":[{"id":1420,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/1345\/revisions\/1420"}],"wp:attachment":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/media?parent=1345"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/categories?post=1345"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/tags?post=1345"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}