Hi everyone.
This question has been discussed several times on this forum, but it doesn’t seem to be completely resolved, so I’d like to ask it again.
I understand that when I create a geometry object in my Dynamo Python node, I need to call Dispose to release it.
However, when running multithreaded, the objects are not completely destroyed even when I Dispose them.
Check the script below.
This code repeatedly creates a Cuboid and then Disposes it.
If I run it in a thread pool, it keeps consuming virtual memory even though I dispose every object.
Dispose does have an effect, because without it even more memory is consumed, but the effect is insufficient.
In my production scripts, large numbers of Curves, Surfaces, etc. are processed in addition to Cuboids.
As a result, memory usage can exceed 50 GB without being released as expected, and Revit pops up a warning.
Is there any workaround?
This behavior occurs in both Dynamo for Revit and Dynamo Sandbox, and in both IronPython and CPython.
Also, the higher the degree of parallelism, the larger the memory leak.
# Load the Python Standard and DesignScript Libraries
import sys
import clr
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *

from multiprocessing.pool import ThreadPool as Pool
import multiprocessing

def test():
    for i in range(1000):
        a = Cuboid.ByLengths(5, 5, 5)
        a.Dispose()

def mapRun():
    cpuCount = multiprocessing.cpu_count()
    pool = Pool(cpuCount)
    for i in range(64):
        pool.apply_async(test)
    pool.close()
    pool.join()

def singleRun():
    for i in range(64):
        test()

for i in range(0, 100):
    mapRun()
#singleRun()

OUT = 0
This phenomenon does not occur when running as a single task on the main thread.
If I use singleRun() instead of mapRun() in the test program above, memory is released as expected.
It happens both in Dynamo for Revit and in Dynamo Sandbox.
I tried Revit 2021, 2022, and 2023.
Hi @at_yan, this took a while to figure out, but I believe the issue is actually the way you create multiple pools instead of a single large pool.
I’ve made two changes here that reduce the leak from about 50 MB to about 0.1 MB.
I’ve used map instead of apply_async; if order is unimportant you can try imap_unordered. This means we don’t need to retrieve multiple result objects, and there is no looping over the pool.
I’ve created only one pool, not 100 of them - this is the more important change. It’s unclear why multiple pools lead to a leak, but there are several questions about it on Stack Overflow; I can only blame something in the CPython implementation, because when I isolate libG use in parallel in C# alone, I see much smaller leaks than with this multiprocessing module.
My final advice is not to use ThreadPool from multiprocessing (which is really designed for multiprocessing, not multithreading) - it’s quirky and seems poorly tested across CPython versions. I would switch to concurrent.futures (“concurrent.futures — Launching parallel tasks”, Python 3.11.1 documentation), as it’s more modern and idiomatic.
# Load the Python Standard and DesignScript Libraries
import sys
import clr
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *

from multiprocessing.pool import ThreadPool as Pool
import multiprocessing

def test(a):
    for i in range(1000):
        a = Cuboid.ByLengths(5, 5, 5)
        a.Dispose()

def mapRun():
    cpuCount = multiprocessing.cpu_count()
    pool = Pool(cpuCount)
    pool.map(test, range(6400))
    pool.close()
    pool.join()

mapRun()
OUT = 0
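The suggested switch to concurrent.futures could look roughly like this. This is a minimal sketch, not code from the thread: a plain computation stands in for the geometry work so it runs outside Dynamo (in the actual node, work() would create and Dispose Cuboids as in the script above).

```python
# Sketch: one ThreadPoolExecutor for the whole job, created once and
# joined by the context manager, instead of many short-lived pools.
from concurrent.futures import ThreadPoolExecutor
import os

def work(i):
    # Stand-in for the Dynamo loop that would build and Dispose Cuboids.
    total = 0
    for _ in range(1000):
        total += i
    return total

def map_run():
    # map() preserves input order; imap-style alternatives exist if
    # ordering is unimportant.
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
        return list(pool.map(work, range(64)))

results = map_run()
OUT = len(results)
```

The `with` block guarantees the executor is shut down and its worker threads are joined, so no pool objects accumulate across runs.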
Thank you for investigating.
Indeed, I have confirmed that creating a single large pool greatly reduces memory consumption.
It is a strange phenomenon, and hard to understand why.
However, the structure of the original test script mimics the structure of my real production script.
In other words, it is difficult to consolidate everything into one large pool.
But you have given me a hint, so I will continue verifying a little further.
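The "one large pool" fix above can be approximated without merging the whole script: create the ThreadPool once at module level and reuse it from each processing step, closing it only when all work is done. A sketch with illustrative names (SHARED_POOL, step_a, step_b are not from the thread; the toy lambdas stand in for geometry work):

```python
# Sketch: a single module-level ThreadPool shared by several entry
# points, so no per-call pools are created and leaked.
from multiprocessing.pool import ThreadPool
import multiprocessing

SHARED_POOL = ThreadPool(multiprocessing.cpu_count())  # created once

def step_a(items):
    # First processing stage, reusing the shared pool.
    return SHARED_POOL.map(lambda x: x * 2, items)

def step_b(items):
    # Second processing stage, same pool.
    return SHARED_POOL.map(lambda x: x + 1, items)

a = step_a(range(4))   # [0, 2, 4, 6]
b = step_b(a)          # [1, 3, 5, 7]

SHARED_POOL.close()    # only once all stages are finished
SHARED_POOL.join()
```

Because the stages run in threads of one process, the mapped functions need no pickling, so lambdas and closures are fine here (unlike with a process pool).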
Well, another solution is to move to parallel structures like Parallel.ForEach in C# directly. I saw MUCH better performance there, with full CPU utilization instead of the roughly 40% utilization in Python.