9

I am running 200+ named Docker containers on my VM host. I have a manager script that is supposed to manage the containers from the host. Is there an event-based mechanism to get notified when a container stops or fails, so that I can restart the stopped container?

One solution I could think of is doing a periodic docker inspect and looking at State.Pid or State.Running to confirm the status.

But instead of periodic polling, it would be better if the manager were notified with the pid/name when a container fails, so that only that particular container is restarted.

On a general note, are there ways to programmatically monitor the status of a process from a different process that is not its parent?

Bryan
Nataraj

3 Answers

8

Look at docker events - there is an event for container 'die'.
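
As a rough illustration of the 'die' event approach, the manager could read the output of docker events line by line. This is a minimal sketch, assuming a reasonably recent Docker CLI whose events carry the Type/Action/Actor fields when printed with --format '{{json .}}'; parse_die_event and watch_die_events are hypothetical helper names, not part of Docker.

```python
import json
import subprocess


def parse_die_event(line):
    """Return (container_name, exit_code) for a container 'die' event, else None."""
    event = json.loads(line)
    if event.get("Type") != "container" or event.get("Action") != "die":
        return None
    attrs = event.get("Actor", {}).get("Attributes", {})
    # Docker reports exitCode as a string; -1 is our own "unknown" placeholder.
    return attrs.get("name"), int(attrs.get("exitCode", -1))


def watch_die_events(on_die):
    """Block forever, calling on_die(name, exit_code) for each container that dies."""
    cmd = [
        "docker", "events",
        "--filter", "type=container",
        "--filter", "event=die",
        "--format", "{{json .}}",
    ]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            parsed = parse_die_event(line)
            if parsed is not None:
                on_die(*parsed)
```

Calling watch_die_events with a callback that decides what to do (restart, alert, ignore) keeps that decision in the manager rather than in Docker.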

There is also an HTTP interface to get the same information programmatically - see here
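
For the HTTP route, something along these lines may work using only the standard library. It is a sketch under assumptions: the daemon listens on the default Unix socket /var/run/docker.sock, the /events endpoint accepts a JSON-encoded filters query parameter, and events arrive as one JSON object per line. UnixHTTPConnection, events_path and stream_events are hypothetical names introduced here.

```python
import http.client
import json
import socket
import urllib.parse


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock


def events_path(event="die"):
    """Build the request path for container events of one kind."""
    filters = json.dumps({"type": ["container"], "event": [event]})
    return "/events?filters=" + urllib.parse.quote(filters)


def stream_events(socket_path="/var/run/docker.sock", event="die"):
    """Yield decoded event dicts as the daemon emits them (blocks between events)."""
    conn = UnixHTTPConnection(socket_path)
    conn.request("GET", events_path(event))
    response = conn.getresponse()
    for raw in response:
        if raw.strip():
            yield json.loads(raw)
```

Iterating over stream_events() in the manager gives one dict per dead container without any polling of docker inspect.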

You may want to do a web search for 'docker orchestration' - many projects are springing up to manage multiple containers in the way you describe.

Bryan
3

If you just want to restart the containers why don't you use a restart policy?

docker run --restart=always IMAGE
Javier Castellanos
    Thanks for the suggestion. 'Restart' is one of the use cases. Ideally I should be able to make more decisions than just restart from the manager program. – Nataraj Nov 13 '14 at 09:53
0

psutil seems to do what you want: http://pypi.python.org/pypi/psutil

From Python:

>>> import psutil
>>> psutil.pids()
[1, 2, 3, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 51, 52, 53, 54, 56, 57, 58, 59, 61, 62, 63, 64, 65, 66, 67, 69, 70, 71, 72, 73, 74, 76, 77, 78, 79, 80, 81, 82, 94, 97, 98, 117, 118, 137, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 200, 201, 210, 211, 213, 214, 229, 230, 411, 416, 419, 526, 527, 542, 543, 544, 545, 555, 569, 625, 709, 714, 756, 781, 782, 796, 862, 863, 864, 869, 914, 944, 945, 948, 954, 996, 1052, 1061, 1064, 1067, 1170, 1174, 1179, 1180, 1183, 1234, 1240, 1241, 1245, 1323, 1328, 1340, 1351, 1354, 1390, 1408, 1457, 1507, 1531, 1631, 1662, 1933, 1972, 1981, 1987, 1989, 1993, 2346, 2348, 2413, 2422, 2429, 2442, 2445, 2449, 2451, 2457, 2461, 2471, 2489, 2490, 2491, 2493, 2497, 2501, 2505, 2509, 2513, 2524, 2546, 2549, 2551, 2554, 2563, 2567, 2572, 2573, 2576, 2578, 2586, 2595, 2598, 2624, 2644, 2655, 2665, 2667, 2687, 2689, 2693, 2699, 2744, 2752, 2785, 2789, 2794, 2798, 2804, 2817, 2820, 2830, 2838, 2856, 2862, 2864, 2886, 2903, 2935, 2972, 2985, 2986, 3138, 3164, 3211, 3368, 3371, 3557, 4125, 4352, 4443, 4444, 4743, 4818, 4819, 4840, 4841, 4844, 4845, 4866, 4876, 6142, 6363, 6366, 6372, 6378, 6385, 6391, 6452, 6518, 6524, 6531, 6555, 6558, 6601]
>>> p = psutil.Process(2862)
>>> p.status()
'sleeping'
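
If installing psutil is not an option, a rough liveness check for a process that is not your child can be sketched on POSIX systems with signal 0, which performs the existence and permission checks without delivering a signal. This is a swapped-in standard-library technique, not psutil's API, and is_alive is a hypothetical helper name. It still polls rather than notifies, a zombie counts as alive, and a PID can be reused after the original process exits.

```python
import os


def is_alive(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```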

user2915097