{"id":291,"date":"2018-10-16T07:10:08","date_gmt":"2018-10-16T07:10:08","guid":{"rendered":"https:\/\/www.appservgrid.com\/paw93\/?p=291"},"modified":"2018-10-16T10:05:51","modified_gmt":"2018-10-16T10:05:51","slug":"docker-features-for-handling-containers-death-and-resurrection","status":"publish","type":"post","link":"https:\/\/www.appservgrid.com\/paw93\/index.php\/2018\/10\/16\/docker-features-for-handling-containers-death-and-resurrection\/","title":{"rendered":"Docker features for handling Container\u2019s death and resurrection"},"content":{"rendered":"<p>Docker containers provide an isolated sandbox for a containerized program to execute. One-shot containers accomplish a particular task and stop. Long-running containers run for an indefinite period until they are either stopped by the user or until the root process inside the container crashes. It is necessary to handle a container\u2019s death gracefully and to make sure that the job running as a container is not impacted in an unexpected manner. When containers are run with Swarm orchestration, Swarm monitors the containers\u2019 health, exit status and the entire lifecycle, including upgrade and rollback. This is a pretty long blog post. I did not want to split it, since it makes sense to look at this topic holistically. You can jump to specific sections by clicking on the links below if needed. 
In this blog, I will cover the following topics with examples:<\/p>\n<ul>\n<li><a href=\"https:\/\/sreeninet.wordpress.com\/2017\/08\/15\/docker-features-for-handling-containers-death-and-resurrection\/#signalexit\">Handling Signals and exit codes<\/a><\/li>\n<li><a href=\"https:\/\/sreeninet.wordpress.com\/2017\/08\/15\/docker-features-for-handling-containers-death-and-resurrection\/#containerrestart\">Container restart policy<\/a><\/li>\n<li><a href=\"https:\/\/sreeninet.wordpress.com\/2017\/08\/15\/docker-features-for-handling-containers-death-and-resurrection\/#containerhealthcheck\">Container health check<\/a><\/li>\n<li><a href=\"https:\/\/sreeninet.wordpress.com\/2017\/08\/15\/docker-features-for-handling-containers-death-and-resurrection\/#servicerestart\">Service restart with Swarm<\/a><\/li>\n<li><a href=\"https:\/\/sreeninet.wordpress.com\/2017\/08\/15\/docker-features-for-handling-containers-death-and-resurrection\/#servicehealth\">Service health check<\/a><\/li>\n<li><a href=\"https:\/\/sreeninet.wordpress.com\/2017\/08\/15\/docker-features-for-handling-containers-death-and-resurrection\/#serviceupgrade\">Service rolling upgrade and rollback<\/a><\/li>\n<\/ul>\n<h3>Handling Signals and exit codes<\/h3>\n<p>When we pass a signal to a container using the Docker CLI, Docker passes the signal to the main process running inside the container (PID 1). This <a href=\"http:\/\/www.comptechdoc.org\/os\/linux\/programming\/linux_pgsignals.html\">link<\/a> has the list of all Linux signals. <a href=\"https:\/\/docs.docker.com\/engine\/reference\/run\/#exit-status\">Docker exit codes<\/a> follow the <a href=\"http:\/\/tldp.org\/LDP\/abs\/html\/exitcodes.html\">chroot exit standard<\/a> for Docker-defined exit codes. Other standard exit codes can come from the program running inside the container. The container exit code can be seen in container events coming from the Docker daemon when the container exits. 
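Signal-driven exit codes follow the usual Unix 128+n convention, and this can be observed outside Docker too. The following short Python 3 sketch (my own illustration, not from the original post) terminates a child process with SIGTERM and derives the 143 exit code Docker would report:

```python
import signal
import subprocess
import time

# Start a long-running child process (plain Unix, no Docker needed).
child = subprocess.Popen(["sleep", "30"])
time.sleep(0.2)

# Terminate it the way "docker stop" begins: with SIGTERM.
child.send_signal(signal.SIGTERM)
child.wait()

# Python reports signal deaths as -signum; a shell (and Docker) reports 128 + signum.
print(child.returncode)       # -15
print(128 + signal.SIGTERM)   # 143, the exit code Docker shows for a SIGTERM death
```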
For containers that have not been cleaned up, the exit code can be found from \u201cdocker ps -a\u201d.<br \/>\nFollowing is a sample \u201cdocker ps -a\u201d output where the nginx container exited with exit code 0. Here, I used \u201cdocker stop\u201d to stop the container.<\/p>\n<p>$ docker ps -a<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\n32d675260384 nginx &#8220;nginx -g &#8216;daemon &#8230;&#8221; 18 seconds ago Exited (0) 7 seconds ago web<\/p>\n<p>Following is a sample \u201cdocker ps -a\u201d output where the nginx container exited with exit code 137. Here, I used \u201cdocker kill\u201d to stop the container.<\/p>\n<p>$ docker ps -a<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\n9b5d8348cb89 nginx &#8220;nginx -g &#8216;daemon &#8230;&#8221; 11 seconds ago Exited (137) 2 seconds ago web<\/p>\n<p>Following is the list of standard and Docker-defined exit codes:<\/p>\n<p>0: Success<br \/>\n125: Docker run itself fails<br \/>\n126: Container command cannot be invoked<br \/>\n127: Container command cannot be found<br \/>\n128 + n: Fatal error signal n<br \/>\n130: (128+2) Container terminated by Control-C<br \/>\n137: (128+9) Container received a SIGKILL<br \/>\n143: (128+15) Container received a SIGTERM<br \/>\n255: Exit status out of range (-1)<\/p>\n<p>Following is a simple Python program that handles signals. 
This program will be run as a Docker container to illustrate Docker signals and exit codes.<\/p>\n<p>#!\/usr\/bin\/python<\/p>\n<p>import sys<br \/>\nimport signal<br \/>\nimport time<\/p>\n<p>def signal_handler_int(sigid, frame):<br \/>\nprint \"signal\", sigid, \",\", \"Handling Ctrl+C\/SIGINT!\"<br \/>\nsys.exit(signal.SIGINT)<\/p>\n<p>def signal_handler_term(sigid, frame):<br \/>\nprint \"signal\", sigid, \",\", \"Handling SIGTERM!\"<br \/>\nsys.exit(signal.SIGTERM)<\/p>\n<p>def signal_handler_usr(sigid, frame):<br \/>\nprint \"signal\", sigid, \",\", \"Handling SIGUSR1!\"<br \/>\nsys.exit(0)<\/p>\n<p>def main():<br \/>\n# Register signal handlers<br \/>\nsignal.signal(signal.SIGINT, signal_handler_int)<br \/>\nsignal.signal(signal.SIGTERM, signal_handler_term)<br \/>\nsignal.signal(signal.SIGUSR1, signal_handler_usr)<\/p>\n<p>while True:<br \/>\nprint \"I am alive\"<br \/>\nsys.stdout.flush()<br \/>\ntime.sleep(1)<\/p>\n<p># This is the standard boilerplate that calls the main() function.<br \/>\nif __name__ == '__main__':<br \/>\nmain()<\/p>\n<p>Following is the Dockerfile to convert this to a container:<\/p>\n<p>FROM python:2.7<br \/>\nCOPY .\/signalexample.py .\/signalexample.py<br \/>\nENTRYPOINT [\"python\", \"signalexample.py\"]<\/p>\n<p>Let\u2019s build the container:<\/p>\n<p>docker build --no-cache -t smakam\/signaltest:v1 .<\/p>\n<p>Let\u2019s start the container:<\/p>\n<p>docker run -d --name signaltest smakam\/signaltest:v1<\/p>\n<p>We can watch the logs from the container using docker logs:<\/p>\n<p>docker logs -f signaltest<\/p>\n<p>The Python program above handles SIGINT, SIGTERM and SIGUSR1. 
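The program above targets Python 2 (matching the python:2.7 base image). For reference, a Python 3 equivalent of the same handler logic could look like the sketch below; the helper names (make_handler, install_handlers) are mine, not from the original repo:

```python
#!/usr/bin/env python3
import signal
import sys
import time

def make_handler(label, exit_code):
    """Return a handler that logs the signal and exits with the given code."""
    def handler(signum, frame):
        print("signal", signum, ",", label)
        sys.exit(exit_code)
    return handler

def install_handlers():
    signal.signal(signal.SIGINT, make_handler("Handling Ctrl+C/SIGINT!", int(signal.SIGINT)))
    signal.signal(signal.SIGTERM, make_handler("Handling SIGTERM!", int(signal.SIGTERM)))
    signal.signal(signal.SIGUSR1, make_handler("Handling SIGUSR1!", 0))

def main():
    install_handlers()
    while True:
        print("I am alive", flush=True)
        time.sleep(1)

# main() would be the container entrypoint; it loops until a signal arrives.
```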
We can pass these signals to the container using the Docker CLI.<br \/>\nFollowing command sends SIGINT to the container:<\/p>\n<p>docker kill --signal=SIGINT signaltest<\/p>\n<p>In the Docker logs, we can see the following to show that this signal is handled:<\/p>\n<p>signal 2 , Handling Ctrl+C\/SIGINT!<\/p>\n<p>Following output shows the container exit status:<\/p>\n<p>$ docker ps -a<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\nc06266e79a43 smakam\/signaltest:v1 &#8220;python signalexam&#8230;&#8221; 36 seconds ago Exited (2) 3 seconds ago signaltest<\/p>\n<p>Following command sends SIGTERM to the container:<\/p>\n<p>docker kill --signal=SIGTERM signaltest<\/p>\n<p>In the Docker logs, we can see the following to show that this signal is handled:<\/p>\n<p>signal 15 , Handling SIGTERM!<\/p>\n<p>Following output shows the container exit status:<\/p>\n<p>$ docker ps -a<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\n0149708f42b2 smakam\/signaltest:v1 &#8220;python signalexam&#8230;&#8221; 10 seconds ago Exited (15) 2 seconds ago signaltest<\/p>\n<p>Following command sends SIGUSR1 to the container:<\/p>\n<p>docker kill --signal=SIGUSR1 signaltest<\/p>\n<p>In the Docker logs, we can see the following to show that this signal is handled:<\/p>\n<p>signal 10 , Handling SIGUSR1!<\/p>\n<p>Following output shows the container exit status:<\/p>\n<p>$ docker ps -a<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\nc92f7b4dd45b smakam\/signaltest:v1 &#8220;python signalexam&#8230;&#8221; 12 seconds ago Exited (0) 2 seconds ago signaltest<\/p>\n<p>When we execute \u201cdocker stop\u201d, Docker first sends a SIGTERM signal to the container, waits for some time and then sends SIGKILL. 
This is done so that the program executing inside the container can use the SIGTERM signal to do a graceful shutdown.<\/p>\n<h4>Common mistake in Docker signal handling<\/h4>\n<p>In the above example, the Python program runs as PID 1 inside the container since we used the exec form of ENTRYPOINT in the Dockerfile. If we use the shell form of ENTRYPOINT, a shell process runs as PID 1 and the Python program runs as a child process. Following is a sample Dockerfile that uses the shell form.<\/p>\n<p>FROM python:2.7<br \/>\nCOPY .\/signalexample.py .\/signalexample.py<br \/>\nENTRYPOINT python signalexample.py<\/p>\n<p>In this example, Docker passes the signal to the shell process instead of to the Python program, so the Python program never sees the signal sent to the container. If there are multiple processes running inside the container and we need to pass the signal, one possible approach is to run the ENTRYPOINT as a script, handle the signal in the script and pass it to the correct process. One example using this approach is mentioned <a href=\"https:\/\/medium.com\/@gchudnov\/trapping-signals-in-docker-containers-7a57fdda7d86\">here<\/a>.<\/p>\n<h4>Difference between \u201cdocker stop\u201d, \u201cdocker rm\u201d and \u201cdocker kill\u201d<\/h4>\n<p>\u201cdocker stop\u201d \u2013 Sends SIGTERM to the container, waits some time for the process to handle it and then sends SIGKILL. The container filesystem remains intact.<br \/>\n\u201cdocker kill\u201d \u2013 Sends SIGKILL directly. The container filesystem remains intact.<br \/>\n\u201cdocker rm\u201d \u2013 Removes the container filesystem. 
\u201cdocker rm -f\u201d will send SIGKILL and then remove the container filesystem.<br \/>\nUsing \u201cdocker run\u201d with the \u201c--rm\u201d option will automatically remove the container, including the container filesystem, when the container exits.<\/p>\n<p>When a container exits without the container filesystem getting removed, we can still restart the container.<\/p>\n<h3>Container restart policy<\/h3>\n<p>The container restart policy controls the restart actions when a container exits. Following are the supported restart options:<\/p>\n<ul>\n<li>no \u2013 This is the default. Containers do not get restarted when they exit.<\/li>\n<li>on-failure \u2013 Containers restart only when there is a failure exit code. Any exit code other than 0 is treated as a failure.<\/li>\n<li>unless-stopped \u2013 Containers restart as long as they were not manually stopped by the user.<\/li>\n<li>always \u2013 Always restart the container irrespective of exit status.<\/li>\n<\/ul>\n<p>Following is an example of starting the \u201csignaltest\u201d container with a restart policy of \u201con-failure\u201d and a retry count of 3. Retry count 3 is the number of restarts that will be done by Docker before giving up.<\/p>\n<p>docker run -d --name=signaltest --restart=on-failure:3 smakam\/signaltest:v1<\/p>\n<p>To show the restart happening, we can manually send signals to the container. In the \u201csignaltest\u201d example, the signals SIGTERM, SIGINT and SIGKILL cause a non-zero exit code and SIGUSR1 causes a zero exit code. One thing to remember is that the restart does not happen if we stop the container or send signals using \u201cdocker kill\u201d; Docker appears to add an explicit check to prevent restarts in these cases, since the action is triggered by the user.<br \/>\nLet\u2019s send SIGINT to the container by passing the signal to the process. 
We can find the process id by doing \u201cps -eaf | grep signalexample\u201d on the host machine.<\/p>\n<p>kill -s SIGINT &lt;pid&gt;<\/p>\n<p>Let\u2019s check the \u201cdocker ps\u201d output. We can see that the \u201ccreated\u201d time is 50 seconds. Uptime is less than a second because the container restarted.<\/p>\n<p>$ docker ps<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\nb867543b110c smakam\/signaltest:v1 &#8220;python signalexam&#8230;&#8221; 50 seconds ago Up Less than a second<\/p>\n<p>Following command shows the restart policy and the restart count for the running container. In this example, the container restarted once.<\/p>\n<p>$ docker inspect signaltest | grep -i -A 2 -B 2 restart<br \/>\n\"Name\": \"\/signaltest\",<br \/>\n\"RestartCount\": 1,<br \/>\n\"RestartPolicy\": {<br \/>\n\"Name\": \"on-failure\",<br \/>\n\"MaximumRetryCount\": 3<\/p>\n<p>To illustrate that restart does not happen on exit code 0, let\u2019s send SIGUSR1 to the container, which will cause exit code 0.<\/p>\n<p>sudo kill -s SIGUSR1 &lt;pid&gt;<\/p>\n<p>In this case, the container exits, but it does not get restarted.<\/p>\n<p>Container restart does not work with the \u201c--rm\u201d option. This is because \u201c--rm\u201d causes the container to be removed as soon as the container exits.<\/p>\n<h3>Container health check<\/h3>\n<p>It is possible that a container does not exit but is not performing as required. Health check probes can be used to identify misbehaving containers and take action rather than waiting until the container dies. For a container like a webserver, the health check probe can be as simple as sending a curl request to the webserver port. 
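A health-cmd is just a program whose exit status is 0 for healthy and 1 for unhealthy. As a sketch (assuming a hypothetical \/health endpoint like the one used below), the curl probe could equally be a few lines of Python:

```python
import urllib.request

def probe(url, timeout=3):
    """Return 0 if the endpoint answers 200, else 1 -- the same contract
    Docker expects from a --health-cmd (0 = healthy, 1 = unhealthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if resp.status == 200 else 1
    except Exception:
        return 1

# A health-check script would end with: sys.exit(probe("http://localhost:8080/health"))
```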
Based on the container\u2019s health, we can restart the container if the health check fails.<br \/>\nTo illustrate the health check feature, I have used the container described <a href=\"https:\/\/hub.docker.com\/r\/effectivetrainings\/docker-health\/\">here<\/a>.<br \/>\nFollowing command starts the webserver container with health check enabled.<\/p>\n<p>docker run -p 8080:8080 -d --rm --name health-check --health-interval=1s --health-timeout=3s --health-retries=3 --health-cmd \"curl -f http:\/\/localhost:8080\/health || exit 1\" effectivetrainings\/docker-health<\/p>\n<p>Following are all the parameters related to health check:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/sreeninet.files.wordpress.com\/2017\/08\/container_healthcheck.png?w=604\" alt=\"container_healthcheck\" \/><\/p>\n<p>Following \u201cdocker ps\u201d output shows the container health status:<\/p>\n<p>$ docker ps<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\n947dad1c1412 effectivetrainings\/docker-health &#8220;java -jar \/app.jar&#8221; 28 seconds ago Up 26 seconds (healthy) 0.0.0.0:8080-&gt;8080\/tcp health-check<\/p>\n<p>This container has a backdoor approach to mark the container health as unhealthy. Let\u2019s use it to mark the container unhealthy:<\/p>\n<p>curl \"http:\/\/localhost:8080\/environment\/health?status=false\"<\/p>\n<p>Now, let\u2019s check the \u201cdocker ps\u201d output. The container\u2019s health has become unhealthy.<\/p>\n<p>$ docker ps<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\n947dad1c1412 effectivetrainings\/docker-health &#8220;java -jar \/app.jar&#8221; 3 minutes ago Up 3 minutes (unhealthy) 0.0.0.0:8080-&gt;8080\/tcp health-check<\/p>\n<h3>Service restart with Swarm<\/h3>\n<p>Docker Swarm mode introduces a higher level of abstraction called a service, and containers are part of the service. 
When we create a service, we specify the number of containers that need to be part of the service using the \u201creplicas\u201d parameter. Docker Swarm will monitor the number of replicas, and if any container dies, Swarm will create a new container to keep the replica count as requested by the user.<br \/>\nThe below command can be used to create the signal service with 2 container replicas:<\/p>\n<p>docker service create --name signaltest --replicas=2 smakam\/signaltest:v1<\/p>\n<p>Following command output shows the 2 containers that are part of the \u201csignaltest\u201d service:<\/p>\n<p>$ docker service ps signaltest<br \/>\nID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS<br \/>\nvsgtopkkxi55 signaltest.1 smakam\/signaltest:v1 ubuntu Running Running 36 seconds ago<br \/>\ndbbm05w91wv7 signaltest.2 smakam\/signaltest:v1 ubuntu Running Running 36 seconds ago<\/p>\n<p>Following parameters control the container restart policy in a service:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sreeninet.files.wordpress.com\/2017\/08\/service_restart.png?w=637&amp;h=192\" alt=\"service_restart\" width=\"637\" height=\"192\" \/><br \/>\nLet\u2019s start the \u201csignaltest\u201d service with a restart-condition of \u201con-failure\u201d:<\/p>\n<p>docker service create --name signaltest --replicas=2 --restart-condition=on-failure --restart-delay=3s smakam\/signaltest:v1<\/p>\n<p>Remember that sending the signals \u201cSIGTERM\u201d, \u201cSIGINT\u201d and \u201cSIGKILL\u201d causes non-zero container exit codes and sending \u201cSIGUSR1\u201d causes a zero container exit code.<br \/>\nLet\u2019s first send SIGTERM to one of the 2 containers:<\/p>\n<p>docker kill --signal=SIGTERM &lt;container&gt;<\/p>\n<p>Following is the \u201csignaltest\u201d service output that shows the 3 containers, including the one that has exited with non-zero status:<\/p>\n<p>$ docker service ps signaltest<br \/>\nID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS<br \/>\n35ndmu3jbpdb 
signaltest.1 smakam\/signaltest:v1 ubuntu Running Running 4 seconds ago<br \/>\nullnsqio5151 _ signaltest.1 smakam\/signaltest:v1 ubuntu Shutdown Failed 11 seconds ago &#8220;task: non-zero exit (15)&#8221;<br \/>\n2rfwgq0388mt signaltest.2 smakam\/signaltest:v1 ubuntu Running Running 49 seconds ago<\/p>\n<p>Following command sends the SIGUSR1 signal to one of the containers, which causes the container to exit with status 0.<\/p>\n<p>docker kill --signal=SIGUSR1 &lt;container&gt;<\/p>\n<p>Following output shows that the container did not restart since the container exit code is 0.<\/p>\n<p>$ docker service ps signaltest<br \/>\nID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS<br \/>\n35ndmu3jbpdb signaltest.1 smakam\/signaltest:v1 ubuntu Running Running 52 seconds ago<br \/>\nullnsqio5151 _ signaltest.1 smakam\/signaltest:v1 ubuntu Shutdown Failed 59 seconds ago &#8220;task: non-zero exit (15)&#8221;<br \/>\n2rfwgq0388mt signaltest.2 smakam\/signaltest:v1 ubuntu Shutdown Complete 3 seconds ago<\/p>\n<p>$ docker service ls<br \/>\nID NAME MODE REPLICAS IMAGE PORTS<br \/>\nxs8lzbqlr69n signaltest replicated 1\/2 smakam\/signaltest:v1<\/p>\n<p>I don\u2019t see a real need to change the default Swarm service restart policy from \u201cany\u201d.<\/p>\n<h3>Service health check<\/h3>\n<p>In the previous sections, we saw how to use the container health check with the \u201ceffectivetrainings\/docker-health\u201d container. Even though we could detect that the container was unhealthy, we could not restart the container automatically. For standalone containers, Docker does not have native integration to restart a container on health check failure, though we can achieve the same using Docker events and a script. Health check is better integrated with Swarm. 
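The events-and-script approach for standalone containers, mentioned above, can be sketched as below. This is my own illustration (the helper names are mine, and it assumes the Docker CLI is on the PATH): it streams health_status events and restarts any container that turns unhealthy.

```python
import json
import subprocess

def went_unhealthy(event_line):
    """Inspect one line of `docker events --format '{{json .}}'` output;
    health transitions arrive with a status like 'health_status: unhealthy'."""
    event = json.loads(event_line)
    return "unhealthy" in event.get("status", "")

def watch_and_restart():
    # Stream only health_status events and restart unhealthy containers.
    proc = subprocess.Popen(
        ["docker", "events", "--format", "{{json .}}",
         "--filter", "event=health_status"],
        stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        if went_unhealthy(line):
            subprocess.run(["docker", "restart", json.loads(line)["id"]],
                           check=False)
```

Not production-grade (no backoff, no restart limit), but it shows the shape of the script.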
With health check integrated into Swarm, when a container in a service becomes unhealthy, Swarm automatically shuts down the unhealthy container and starts a new container to maintain the replica count specified for the service.<\/p>\n<p>The \u201cdocker service\u201d command provides the following options for health check and the associated behavior.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sreeninet.files.wordpress.com\/2017\/08\/service_health_check.png?w=637&amp;h=194\" alt=\"service_health_check\" width=\"637\" height=\"194\" \/><\/p>\n<p>Let\u2019s create the \u201cswarmhealth\u201d service with 2 replicas of \u201cdocker-health\u201d containers.<\/p>\n<p>docker service create --name swarmhealth --replicas 2 -p 8080:8080 --health-interval=2s --health-timeout=10s --health-retries=10 --health-cmd \"curl -f http:\/\/localhost:8080\/health || exit 1\" effectivetrainings\/docker-health<\/p>\n<p>Following output shows the \u201cswarmhealth\u201d service output and the 2 healthy containers:<\/p>\n<p>$ docker service ps swarmhealth<br \/>\nID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS<br \/>\njg8d78inw97n swarmhealth.1 effectivetrainings\/docker-health:latest ubuntu Running Running 21 seconds ago<br \/>\nl3fdz5awv4u0 swarmhealth.2 effectivetrainings\/docker-health:latest ubuntu Running Running 19 seconds ago<\/p>\n<p>$ docker ps<br \/>\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br \/>\nd9b1f1b0a9b0 effectivetrainings\/docker-health:latest &#8220;java -jar \/app.jar&#8221; About a minute ago Up About a minute (healthy) swarmhealth.1.jg8d78inw97nmmbdtjzrscg1q<br \/>\nbb15bfc6e588 effectivetrainings\/docker-health:latest &#8220;java -jar \/app.jar&#8221; About a minute ago Up About a minute (healthy) swarmhealth.2.l3fdz5awv4u045g2xiyrbpe2u<\/p>\n<p>Let\u2019s mark one of the containers unhealthy using the backdoor command:<\/p>\n<p>curl 
\"http:\/\/&lt;node&gt;:8080\/environment\/health?status=false\"<\/p>\n<p>Following output shows one shut-down container, which is the unhealthy one, and 2 running replicas. One of the replicas got started after the other container became unhealthy.<\/p>\n<p>$ docker service ps swarmhealth<br \/>\nID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS<br \/>\nixxvzyuyqmcq swarmhealth.1 effectivetrainings\/docker-health:latest ubuntu Running Running 4 seconds ago<br \/>\njg8d78inw97n _ swarmhealth.1 effectivetrainings\/docker-health:latest ubuntu Shutdown Failed 23 seconds ago &#8220;task: non-zero exit (143): do\u2026&#8221;<br \/>\nl3fdz5awv4u0 swarmhealth.2 effectivetrainings\/docker-health:latest ubuntu Running Running 5 minutes ago<\/p>\n<h3>Service upgrade and rollback<\/h3>\n<p>When a new version of a service needs to be rolled out without taking service downtime, Docker provides many controls for the upgrade and rollback. For example, we can control parameters like the number of tasks to upgrade at a time, the action on upgrade failure, the delay between task upgrades, etc. 
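As a back-of-the-envelope aid (my own sketch, not from the post): with N replicas, \u201cupdate-parallelism\u201d tasks updated per round and \u201cupdate-delay\u201d seconds between rounds, the rollout proceeds in ceil(N \/ parallelism) rounds, which bounds how long a rolling update takes:

```python
import math

def update_rounds(replicas, parallelism=1):
    """Number of update rounds: tasks are updated `parallelism` at a time
    (the --update-parallelism setting)."""
    return math.ceil(replicas / parallelism)

def min_rollout_seconds(replicas, parallelism, update_delay_s):
    """Lower bound from --update-delay alone; ignores per-task start/stop time."""
    return (update_rounds(replicas, parallelism) - 1) * update_delay_s

print(update_rounds(2, 1))           # 2
print(min_rollout_seconds(2, 1, 3))  # 3
```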
This helps us achieve release patterns like blue-green and canary deployments.<\/p>\n<p>Following options are provided by Docker in the \u201cdocker service\u201d command to control rolling upgrade and rollback.<\/p>\n<p>Rolling upgrade:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sreeninet.files.wordpress.com\/2017\/08\/service_upgrade.png?w=474&amp;h=180\" alt=\"service_upgrade\" width=\"474\" height=\"180\" \/><\/p>\n<p>Rollback:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/sreeninet.files.wordpress.com\/2017\/08\/service_rollback.png?w=604\" alt=\"service_rollback\" \/><\/p>\n<p>To illustrate service upgrade, I have a simple Python webserver program running as a container.<br \/>\nFollowing is the Python program:<\/p>\n<p>#!\/usr\/bin\/python<\/p>\n<p>import sys<br \/>\nfrom BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer<br \/>\nimport urlparse<br \/>\nimport json<\/p>\n<p>class GetHandler(BaseHTTPRequestHandler):<\/p>\n<p>def do_GET(self):<br \/>\nmessage = \"You are using version 1\\n\"<br \/>\nself.send_response(200)<br \/>\nself.end_headers()<br \/>\nself.wfile.write(message)<br \/>\nreturn<\/p>\n<p>def main():<br \/>\nserver = HTTPServer(('', 8000), GetHandler)<br \/>\nprint 'Starting server at http:\/\/localhost:8000'<br \/>\nserver.serve_forever()<\/p>\n<p># This is the standard boilerplate that calls the main() function.<br \/>\nif __name__ == '__main__':<br \/>\nmain()<\/p>\n<p>This is the Dockerfile to create the container:<\/p>\n<p>FROM python:2.7<br \/>\nCOPY .\/webserver.py .\/webserver.py<br \/>\nENTRYPOINT [\"python\", \"webserver.py\"]<\/p>\n<p>I have 2 versions of the container, smakam\/webserver:v1 and smakam\/webserver:v2. 
The only difference is the message output, which shows either \u201cYou are using version 1\u201d or \u201cYou are using version 2\u201d.<\/p>\n<p>Let\u2019s create version 1 of the service with 2 replicas:<\/p>\n<p>docker service create --name webserver --replicas=2 -p 8000:8000 smakam\/webserver:v1<\/p>\n<p>We can access the service using a script. The service requests will get load balanced between the 2 replicas.<\/p>\n<p>while true; do curl -s \"localhost:8000\"; sleep 1; done<\/p>\n<p>Following is the service request output that shows we are using version 1 of the service:<\/p>\n<p>You are using version 1<br \/>\nYou are using version 1<br \/>\nYou are using version 1<\/p>\n<p>Let\u2019s upgrade to version 2 of the web service. Since we have specified an update-delay of 3 seconds, there will be a 3-second gap between upgrades of the 2 replicas. Since the \u201cupdate-parallelism\u201d default is 1, only 1 task will be upgraded at a time.<\/p>\n<p>docker service update --update-delay=3s --image=smakam\/webserver:v2 webserver<\/p>\n<p>Following is the service request output that shows the requests slowly getting migrated to version 2 as the upgrade happens 1 replica at a time.<\/p>\n<p>You are using version 1<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 2<br \/>\nYou are using version 2<\/p>\n<p>Now, let\u2019s roll back to version 1 of the webserver:<\/p>\n<p>docker service update --rollback webserver<\/p>\n<p>Following is the service request output that 
shows the requests slowly getting downgraded from version 2 to version 1.<\/p>\n<p>You are using version 2<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 2<br \/>\nYou are using version 1<br \/>\nYou are using version 1<\/p>\n<p>Please let me know your feedback and whether you want to see more details on any specific topic related to this. I have put the code associated with this blog <a href=\"https:\/\/github.com\/smakam\/docker\/tree\/master\/container_lifecycle\">here<\/a>. The containers used in this blog (smakam\/signaltest, smakam\/webserver) are on Docker Hub.<\/p>\n<h3>References<\/h3>\n<ul>\n<li><a href=\"https:\/\/medium.com\/@lherrera\/life-and-death-of-a-container-146dfc62f808\">Blog \u2013 Life and Death of a Container<\/a><\/li>\n<li><a href=\"https:\/\/medium.com\/@gchudnov\/trapping-signals-in-docker-containers-7a57fdda7d86\">Blog \u2013 Trapping signals in Docker containers<\/a><\/li>\n<li><a href=\"https:\/\/www.ctl.io\/developers\/blog\/post\/gracefully-stopping-docker-containers\/\">Blog \u2013 Gracefully stopping Docker containers<\/a><\/li>\n<li><a href=\"https:\/\/docs.docker.com\/engine\/reference\/run\/#restart-policies-restart\">Docker container restart policies<\/a><\/li>\n<li><a href=\"https:\/\/docs.docker.com\/engine\/reference\/run\/#healthcheck\">Docker container healthcheck<\/a><\/li>\n<li><a href=\"https:\/\/docs.docker.com\/engine\/swarm\/swarm-tutorial\/rolling-update\/\">Docker service rolling update<\/a><\/li>\n<\/ul>\n<p><a href=\"https:\/\/sreeninet.wordpress.com\/2017\/08\/15\/docker-features-for-handling-containers-death-and-resurrection\/\" target=\"_blank\" 
rel=\"noopener\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Docker containers provides an isolated sandbox for the containerized program to execute. One-shot containers accomplishes a particular task and stops. Long running containers runs for an indefinite period till it either gets stopped by the user or when the root process inside container crashes. It is necessary to gracefully handle container\u2019s death and to make &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.appservgrid.com\/paw93\/index.php\/2018\/10\/16\/docker-features-for-handling-containers-death-and-resurrection\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Docker features for handling Container\u2019s death and resurrection&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-291","post","type-post","status-publish","format-standard","hentry","category-docker"],"_links":{"self":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/291","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/comments?post=291"}],"version-history":[{"count":1,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/291\/revisions"}],"predecessor-version":[{"id":341,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/posts\/291\/revisions\/341"}],"wp:attachment":[{"href":"https:\/\/www.appservgrid.
com\/paw93\/index.php\/wp-json\/wp\/v2\/media?parent=291"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/categories?post=291"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw93\/index.php\/wp-json\/wp\/v2\/tags?post=291"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}