Recently I’ve been busy making a lot of my graphs more efficient using Python. I’ve been using one script to put items at a specific position in a table, given 3 inputs: the x index (called obj_ind), the y index (called verd_ind) and the value (obj_tel). I generate a list of zeroes first (tel_list), with a size based on the biggest index values. Then a for loop puts specific items at specific positions:
obj_ind=IN[0]
verd_ind=IN[1]
obj_tel=IN[2]
obj_list=range(max(obj_ind)+1)
verd_list=range(max(verd_ind)+1)
tel_list=[[0 for o in verd_list] for v in obj_list]
for x,y,i in zip(obj_ind,verd_ind,obj_tel):
    tel_list[x][y]=i
OUT=tel_list
This has been working fine for me.
Recently I thought that the way I’d been making the list of zeroes was rather inefficient, so I decided to change it:
obj_ind=IN[0]
verd_ind=IN[1]
obj_tel=IN[2]
tel_list=[[0] * (max(verd_ind)+1)] * (max(obj_ind)+1)
for x,y,i in zip(obj_ind,verd_ind,obj_tel):
    tel_list[x][y]=i
OUT=tel_list
This should generate the same list of zeroes, just by a different method; apart from that, nothing in the Python code has changed.
Yet I’m still getting different results:
I can not figure out what’s causing the difference here. I hope someone can help me understand.
Thanks in advance!
Edit: after playing around with this method, it turns out that when a list is created like this, changing an element of one sublist changes it in every sublist. I still don’t know why Python treats this list differently, but I understand what’s happening.
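The behaviour described in the edit can be checked directly. Below is a minimal sketch (the variable names are only for illustration) that uses `is` and `id()` to show that the outer multiplication repeats a reference to one inner list instead of making separate lists:

```python
number = 3
rows = [[0] * number] * 4  # outer * repeats a reference to ONE inner list

# Every "row" is the very same list object.
same_object = all(row is rows[0] for row in rows)
distinct_ids = len({id(row) for row in rows})

# Writing through one row is therefore visible through all of them.
rows[0][1] = 7

print(same_object)   # True
print(distinct_ids)  # 1
print(rows)          # [[0, 7, 0], [0, 7, 0], [0, 7, 0], [0, 7, 0]]
```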
# Load the Python Standard and DesignScript Libraries
import sys
import clr
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *
number = 3
l = [[0]*number]*4
#n is the number of items within the sublists
#a is the index within the flatten list
#b is the item to be replaced
def repl(l, n = 1, a = 1, b = 0):
    flat = [i for sub in l for i in sub]
    flat[a] = b
    return [flat[i : i + n] for i in range(0, len(flat), n)]
OUT = repl(l, number, 0, 1)
I do agree.
I’m currently trying to figure out Python in general, hence I’m interested in finding out why this specific method doesn’t work. It seems like a fairly obvious solution.
I am seeing some interesting results on the timing aspect as data sets expand in Python. The results literally flip-flop at a certain point, which is weird… it might just be my bad test code, but the timeit module is definitely a good one to know about. Thanks for making me question it!
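For anyone who wants to repeat that kind of comparison, here is a rough sketch using the standard `timeit` module. The function names and the sizes are my own assumptions, not the test code mentioned above, and the numbers will vary by machine and Python engine, so no expected timings are given:

```python
import timeit

def nested_comprehension(rows, cols):
    # One comprehension per row and per cell.
    return [[0 for _ in range(cols)] for _ in range(rows)]

def multiply_inner(rows, cols):
    # Multiply only the inner list; the outer comprehension
    # still builds a separate list for every row.
    return [[0] * cols for _ in range(rows)]

def time_builder(builder, rows, cols, number=50, repeat=3):
    # Best of `repeat` runs, each run calling the builder `number` times.
    timer = timeit.Timer(lambda: builder(rows, cols))
    return min(timer.repeat(repeat=repeat, number=number))

sizes = [(10, 10), (100, 100), (500, 50)]  # assumed sizes, for illustration
timings = {}
for rows, cols in sizes:
    timings[(rows, cols)] = (
        time_builder(nested_comprehension, rows, cols),
        time_builder(multiply_inner, rows, cols),
    )
    print(rows, cols, "comprehension: %.4f s, multiply: %.4f s" % timings[(rows, cols)])
```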
As far as why, I can speculate, but a Python expert is likely a better source.
There are a LOT of unique things about lists which come into play, one of which is that sometimes what looks like a normal list isn’t, and is instead a collection of the same list pulled from the same memory N times - why write it down again if you can just refer to it a second time?
To test this you can add list1=list0 at line 2 of your code; you’ll find that returning list1 does the same thing, because you haven’t made anything new, just given the list a second name (or, in your prior case, asked for the same thing N times over). Changing the object itself changes it everywhere it is referenced, rather than creating a new object.
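The "second name" test can be sketched as follows. The names `list0` and `list1` follow the post; `shallow` and `deep` are added here only to contrast aliasing with copying via the standard `copy` module:

```python
import copy

list0 = [[0, 0], [0, 0]]
list1 = list0                 # a second name for the same object, not a copy

list1[0][0] = 5               # also visible through list0

shallow = list(list0)         # new outer list, but the inner lists are still shared
deep = copy.deepcopy(list0)   # new outer list AND new inner lists

shallow[0][1] = 3             # changes list0 too, via the shared inner list
deep[1][1] = 9                # leaves list0 untouched

print(list1 is list0)         # True
print(list0)                  # [[5, 3], [0, 0]]
print(deep)                   # [[5, 0], [0, 9]]
```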
When we use lstB = [[0] * 5] * 3
Python stores every sub-list at the same memory address (they are all the same object), so when we set a value in one sub-list, it is set in all of them (Python lists are mutable).
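One common way around this is to multiply only the inner list (safe, because the integer 0 is immutable) and let a comprehension create a new inner list for each row. The sketch below applies that to the table from the original question; `build_table` is a hypothetical helper name, and the sample inputs stand in for IN[0], IN[1] and IN[2]:

```python
def build_table(obj_ind, verd_ind, obj_tel):
    """Place each value at [x][y] in a zero-filled table (hypothetical helper)."""
    rows = max(obj_ind) + 1
    cols = max(verd_ind) + 1
    # [0] * cols is evaluated once per row, so every row is a distinct list.
    table = [[0] * cols for _ in range(rows)]
    for x, y, value in zip(obj_ind, verd_ind, obj_tel):
        table[x][y] = value
    return table

# Sample inputs, assumed for illustration only.
example = build_table([0, 1, 2], [1, 0, 2], [5, 6, 7])
print(example)  # [[0, 5, 0], [6, 0, 0], [0, 0, 7]]
```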
I thought it’d be doing something like that.
But I do find it strange that it straight up stores the same item several times, without allowing you to change a single element of it.
I don’t see how lists working as such would ever be an advantage.
Let’s say you were going to place work groups on an office floor plate. Each group contains a given number of workstations, all of which are exactly the same furniture grouping, containing the desk, monitor, dock, monitor cables, chair, filing cabinet, power strip… etc. We’re going to store all the metadata and all the geometry for everything, as we need it all. By the time you’re done defining all the things which go into each workstation you might have 250 MB of data; with 64 desks (a small number) you’d effectively fill 16 GB of RAM; at 8 desks per work group you’ve just used up the ‘standard’ system requirements for Revit, leaving none for the OS or for using Revit, so you’re already paging heavily.
By reusing the same segment of memory you can more easily account for 1000 desks before you run into paging issues. We haven’t yet accounted for instance parameters (i.e. desk number, occupant, etc.) but we still have the memory for all that data.
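As a rough illustration of that idea in plain Python (not how Revit or Dynamo actually store things; all names here are made up), each placed workstation below keeps its own small instance data and only a reference to one shared definition:

```python
# One shared definition; stands in for the heavy geometry and metadata.
workstation_definition = {
    "items": ["desk", "monitor", "dock", "monitor cables",
              "chair", "filing cabinet", "power strip"],
}

def place_workstations(definition, count):
    # Each instance gets its own small dict and a reference to the one definition.
    return [
        {"desk_number": n, "occupant": None, "definition": definition}
        for n in range(1, count + 1)
    ]

stations = place_workstations(workstation_definition, 64)
stations[0]["occupant"] = "A. Example"   # instance data stays per desk

shared = all(s["definition"] is workstation_definition for s in stations)
print(shared)                    # True: 64 desks, one definition in memory
print(stations[1]["occupant"])   # None: other desks are unaffected
```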
I don’t really see why building the list several times via a for loop wouldn’t be just as efficient, and arguably even easier to do.
The biggest problem is still that I can’t easily turn this into something I can effectively treat like instance parameters.
import clr
import sys
import re
import System
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *
import Autodesk.DesignScript.Geometry as DS
my_path = System.Environment.GetFolderPath(System.Environment.SpecialFolder.MyDocuments)
pf_path = System.Environment.GetFolderPath(System.Environment.SpecialFolder.ProgramFilesX86)
reDir = System.IO.DirectoryInfo(re.__file__)
path_py3_lib = reDir.Parent.Parent.FullName
sys.path.append(path_py3_lib + r'\Lib\site-packages')
import timeit
import functools
import numpy as np
from System import Array
def make_array_numpy(x=1, y=1):
    a = np.zeros((x, y), dtype=int)
    # set a value
    a[1,1] = 99
    return a
def make_array_list_comprenhension(x=1, y=1):
    a = [[0 for i in range(x)] for j in range(y)]
    # set a value
    a[1][1] = 99
    return a
def make_Netarray(x=1, y=1):
    a = Array.CreateInstance(System.Int32, x, y)
    # set a value
    a[1,1] = 99
    return a
#test with numpy COMPATIBLE ONLY WITH CPYTHON3
t1 = timeit.Timer(functools.partial(make_array_numpy, 500, 50))
timeT1 = min(t1.repeat(number=10000, repeat=5))
result1 = ['make_array_numpy', 'time for 10000 (repeat 5) : %.4f s' %(timeT1), ' ↓↓↓ out for 2D array shape (2x2) ↓↓↓', make_array_numpy(2,2)]
#test with list comprenhension
t2 = timeit.Timer(functools.partial(make_array_list_comprenhension, 500, 50))
timeT2 = min(t2.repeat(number=10000, repeat=5))
result2 = ['make_array_list_comprenhension', 'time for 10000 (repeat 5) : %.4f s' %(timeT2), ' ↓↓↓ out for 2D array shape (2x2) ↓↓↓', make_array_list_comprenhension(2,2)]
#test with Net Array
t3 = timeit.Timer(functools.partial(make_Netarray, 500, 50))
timeT3 = min(t3.repeat(number=10000, repeat=5))
# out for transform Net Array to 2D list
arr = make_Netarray(2,2)
outlist_arr = [[arr[i,j] for i in range(arr.GetLength(0))] for j in range(arr.GetLength(1))]
result3 = ['make_Netarray', 'time for 10000 (repeat 5) : %.4f s' %(timeT3), ' ↓↓↓ out for 2D array shape (2x2) ↓↓↓', outlist_arr]
OUT = result1, result2, result3,
A .NET Array is a good alternative (and is compatible with both Python engines).