When using shared_memory in a subprocess the resource_tracker needs to
be inherited from the parent process. If it is not then each
subprocess erroneously gets its own resource_tracker.
This statement is quite flawed given the current implementations of both ResourceTracker and SharedMemory. The former is implemented as a separate Python process that communicates with the process that started it (i.e. the process that created the shared memory object(s)) via a pipe. The resource tracker holds the read end of the pipe, while the process creating the shared memory objects holds the write end. So, any time the starting process creates a SharedMemory object, it sends a message through the pipe telling the resource tracker to register the created resource. Similarly, if a resource needs to be removed, the starting process uses the pipe again to send an unregister message. As a result, the only way a child process could truly inherit the resource tracker of its parent is if it sent messages directly to that resource tracker using the write end of the pipe (which it should have access to). However, since the current implementation of SharedMemory creates a resource tracker even when a process is only consuming an already created shared memory object, your child processes would have to communicate with two separate resource trackers: the one started by their parent (via the same pipe), and the one that gets started when they instantiate a SharedMemory object for the first time.
With that out of the way, let's tackle your questions:
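Before getting to them, here is a toy model of the register/unregister flow described above, in case it helps to see it in code. This is not CPython's implementation: the real tracker runs in a separate process, and the names send and run_tracker as well as the simplified "COMMAND:name" message format are made up for illustration. Everything runs in one process here so the example stays self-contained.

```python
import os


def send(write_fd, command, name):
    """Write one newline-terminated message to the tracker's pipe."""
    os.write(write_fd, f"{command}:{name}\n".encode())


def run_tracker(read_fd):
    """Read messages until the pipe closes; return names still registered."""
    tracked = set()
    with os.fdopen(read_fd, "r") as pipe:
        for line in pipe:
            command, name = line.strip().split(":", 1)
            if command == "REGISTER":
                tracked.add(name)
            elif command == "UNREGISTER":
                tracked.discard(name)
    return tracked


read_fd, write_fd = os.pipe()

# The "starting process" side: it only ever touches the write end.
send(write_fd, "REGISTER", "segment_a")
send(write_fd, "REGISTER", "segment_b")
send(write_fd, "UNREGISTER", "segment_a")
os.close(write_fd)  # closing the write end is what ends the tracker's loop

# The "tracker" side: whatever is still registered is treated as leaked.
leaked = run_tracker(read_fd)
print(leaked)
```

The point of the sketch is that the tracker only knows what it is told over its own pipe, so a second tracker started by a consuming process has no idea that another process still owns the segment.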
I don't instantiate a resource_tracker anywhere in my code. What does it mean for a resource_tracker to be inherited?
First, you do not instantiate a resource tracker; one is instantiated for you when you instantiate a SharedMemory object for the first time. Currently, it does not matter whether you are producing or consuming a shared memory object: a resource tracker is always created for the process that instantiates it.
Second, inheriting a resource tracker is really not a thing in the current implementation. Again, consuming processes shouldn't have to worry about the life cycle of shared memory objects. All they have to do is make sure that the object actually exists, which they can do by handling a FileNotFoundError or OSError exception. If the current implementation of SharedMemory were not buggy, all a consuming process would need to do when it is done with a resource is call SharedMemory.close and move on to something else.
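A minimal sketch of that consumer pattern follows. The helper name read_first_byte and the 16-byte size are made up for illustration, and the producer lives in the same process only to keep the example self-contained.

```python
from multiprocessing import shared_memory


def read_first_byte(name):
    """Attach to an existing segment, read one byte, then detach."""
    try:
        # create defaults to False, so this only attaches.
        shm = shared_memory.SharedMemory(name=name)
    except FileNotFoundError:
        # The producer has not created the segment, or has already unlinked it.
        return None
    try:
        return shm.buf[0]
    finally:
        shm.close()  # detach only; the segment itself is left alone


# Producer side, included here only so the sketch can run on its own.
producer = shared_memory.SharedMemory(create=True, size=16)
producer.buf[0] = 42

value = read_first_byte(producer.name)    # segment exists
producer.close()
producer.unlink()                         # only the producer destroys it
missing = read_first_byte(producer.name)  # segment is gone now

print(value, missing)
```

Note that the consumer never calls unlink; it only calls close.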
How do I instantiate the resource_tracker in the main process prior to creating the new subprocesses so that the resource_tracker gets inherited by the subprocesses?
I think the issue here is that your design is flipped. You should have your main process create the shared memory object and let the child processes consume it. The idea behind shared memory objects is that multiple separate processes can use the same chunks of memory, which should in turn limit the amount of resources used by your parallel program. But the code in the linked SO post is doing the reverse. Since shared memory objects are kernel-persistent resources, it makes sense to have as few of them as possible. So, if you employ a "one producer, multiple consumers" design, you can have your main process create the shared memory object along with its associated resource tracker, and then let the child processes consume the memory. In this scenario, you could get some work done in the child processes without having to worry about the resource trackers associated with them. Just make sure that the child processes don't unlink the shared memory object before the parent process gets around to doing it. Better yet, if the fix in the bug report gets implemented, making it unnecessary for consuming processes to spawn resource trackers, you can be confident that your main process will be the only entity unlinking the shared memory object.
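Here is a rough sketch of the "one producer, multiple consumers" layout using independent Python subprocesses. The CONSUMER script, the 4-byte size and the attach logic are my own illustration, not code from the linked post. Two assumptions to flag: Python 3.13 added a track parameter to SharedMemory, and passing track=False keeps a consumer from registering the segment with a resource tracker; on older versions the sketch falls back to a commonly used workaround that calls resource_tracker.unregister with the private _name attribute, which assumes a POSIX system and may stop working in future releases.

```python
import subprocess
import sys
from multiprocessing import shared_memory

# Script run by each consumer: attach, write one byte, detach. Never unlink.
CONSUMER = """
import sys
from multiprocessing import resource_tracker, shared_memory

name, index = sys.argv[1], int(sys.argv[2])
if sys.version_info >= (3, 13):
    # Assumption: the 3.13+ track parameter keeps this process from tracking.
    shm = shared_memory.SharedMemory(name=name, track=False)
else:
    # Workaround (private attribute, POSIX only): tell this process's own
    # tracker to forget the segment so it is not unlinked when we exit.
    shm = shared_memory.SharedMemory(name=name)
    resource_tracker.unregister(shm._name, "shared_memory")
shm.buf[index] = index + 1
shm.close()
"""

# Producer: the main process creates the segment and owns its cleanup.
shm = shared_memory.SharedMemory(create=True, size=4)
try:
    for i in range(3):
        subprocess.run(
            [sys.executable, "-c", CONSUMER, shm.name, str(i)],
            check=True,
        )
    result = bytes(shm.buf[:3])  # copy out what the consumers wrote
finally:
    shm.close()
    shm.unlink()  # the only unlink in the whole program

print(result)
```

The consumers only attach and close; the main process is the single place where the segment is created and unlinked.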
In sum, your child processes are not going to inherit their parent's resource tracker, as far as the current implementation goes. If those child processes end up actually creating shared memory objects, they will get their own resource trackers. But if efficiency is the goal, you would want your main process to create the shared memory object(s) that your child processes will then consume. In such a scenario, your main process, via its associated resource tracker, will be in charge of the cleanup step. And, if the fix is implemented, you can always be safe in assuming that only the main process will be unlinking the resources.