Query to users/developers - items or lists out, for item in?

I’ve been working on my custom package (Crumple) lately, and wanted to garner people’s opinions/expectations RE node input/output behavior regarding list structure and objects.

As many will know, in Python you have to force data into a list in order to iterate over it (do one process to each thing in a list), even if dealing with just one object. The most common way I and others seem to deal with this is to use a function to force an item into a list on its own if the node is only given one object, then the functions iterate once over it.

Currently I return this outcome as a list (of one item). My thinking here is that this will let developers also have scripts which can assume the output of said node will always be a list, and be indexable and workable as a list in Dynamo once it leaves the custom node. Anyone that has looked into Crumple nodes will probably recognize my list helper functions:

# Define list/unwrap list functions
def tolist(input):
    result = input if isinstance(input, list) else [input]
    return result

def uwlist(input):
    result = input if isinstance(input, list) else [input]
    return UnwrapElement(result)

I’ve been toying with the idea of checking the outputs of my node and instead returning an item (where relevant) versus a list if the user only provided an item originally, using a simple length check where if length = 1, return list[0] instead of list.
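
A rough sketch of that length check is below. It is plain Python rather than a Dynamo node, and `to_list`, `item_or_list` and `process` are hypothetical names used only for illustration. It also shows the ambiguity the check creates: an item and a list of one item both come back as a bare item.

```python
def to_list(value):
    # Wrap a single item in a list so it can be iterated
    return value if isinstance(value, list) else [value]

def item_or_list(results):
    # Length check: collapse a one-item list back to a bare item
    return results[0] if len(results) == 1 else results

def process(value):
    # Stand-in for the node's real work: double each number
    results = [x * 2 for x in to_list(value)]
    return item_or_list(results)

single = process(5)          # item in, item out
list_of_one = process([5])   # list of one in, but a bare item comes out
many = process([5, 6])       # list in, list out
```

The second case is the one to watch, since the output can no longer be indexed like the input was.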

On the one hand this might make my nodes behave more as users would expect them to (and more akin to the core Dynamo nodes themselves) if they didn’t know how iteration works behind the scenes. On the other hand I worry that this might just be bad practice: some scripts will be given an item in some cases and lists in others, so a predictable output structure is superior. I do have a node in my package to force list promotion (List.Force), but it feels like a pain to force users to engage with an extra node each time they get an output.

I’m aware there are some pretty complex ways (recursive map/lambda/functions such as this) to make functions work across unpredictable list structures, but in the interest of keeping my nodes simple/approachable I would prefer not to delve into that space. My core goal with my package was always to lay down each node as an educational piece, so if I go down this route I suspect that goal would be lost. I appreciate the package fulfils more utility than that for most of its users.
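
For context, the recursive approach mentioned above can be sketched in a few lines. This is a minimal plain-Python illustration, not what Crumple does; `map_nested` and `double` are made-up names.

```python
def map_nested(func, data):
    # Recurse into lists; apply func to anything that is not a list
    if isinstance(data, list):
        return [map_nested(func, item) for item in data]
    return func(data)

def double(x):
    # Stand-in for the node's real work
    return x * 2

flat = map_nested(double, [1, 2])
nested = map_nested(double, [1, [2, [3, 4]]])
single = map_nested(double, 5)
```

The output mirrors the input structure at every depth, at the cost of the node's logic being wrapped in a recursive call.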

I suspect I’ve taken the right path of forcing list output for objects, but wanted to get some insight from others who have likely faced this before or have more advanced development experience than myself (a lot of people here!).

Any thoughts from others on the forums on this topic? I’m sticking with list outputs for now, so if you use my package have no fear!

I think your current thought process is correct.
Stay the path legend!

I would utilize the feature of custom nodes to return the same data structure out as you had coming in.

Look at the behavior of ‘Point.ByCoordinates’ as an example (or pretty much any standard node). Assuming the default value for Y and Z if you provide:

  • 1, you get one point with no list structure
  • [1], you get a list with one point in it.
  • [1,2] you receive a list with two points in it.

There is no reason to not do this in the context of a custom node. In the context of ‘in graph Python’ you certainly need to enforce list structure in the input, but not usually once you are in the DYF environment. Nodes which enforce a list structure or otherwise alter the list structure are a common reason for ‘8 level deep’ list structures.

Thanks @jacob.small. Yes, that was my thinking too, and it would lend itself more naturally to lacing/levels I imagine. It might throw a spanner in the works for people used to my nodes behaving that way, but I’m erring that way for the reasoning you’ve noted. I don’t believe many Python packages work this way except for Clockwork (which has a custom node at the back end of all their nodes for this), but I just wanted to sanity check it wasn’t unorthodox logic.

My thinking here is that it suits the more generic nodes, while for others I could include a boolean to force list output where a predictable data shape matters more, i.e. where lists are expected downstream regardless of whether one or many items went in.
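
As a sketch of how such a boolean might behave (plain Python, with `run_node` and `force_list` as hypothetical names rather than anything in Crumple):

```python
def to_list(value):
    # Wrap a single item in a list so it can be iterated
    return value if isinstance(value, list) else [value]

def run_node(value, force_list=False):
    # Remember the incoming shape before promoting it to a list
    was_list = isinstance(value, list)
    # Stand-in for the node's real work: uppercase each string
    results = [str(x).upper() for x in to_list(value)]
    # Return a list if one came in, or if the caller asked for one
    if was_list or force_list:
        return results
    return results[0]

a = run_node("a")                    # item in, item out
b = run_node("a", force_list=True)   # item in, list forced out
c = run_node(["a", "b"])             # list in, list out
```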

In a few cases I’ve actually done away with lists/levels where list structure is that crucial, using a manually enforced replication guide node at the front end - mainly for family document processing nodes where one/many can be difficult for users to get right. This is my code if anyone is curious - there are likely shorter/easier ways to write this code, but effectively this one ensures all branches given to the custom node are lists of lists to the list length of the primary input:

# Boilerplate text
import clr

clr.AddReference("RevitAPI")
import Autodesk 
from Autodesk.Revit.DB import Document

# Define list functions
def tolist(input):
    result = input if isinstance(input, list) else [input]
    return result

def toPaddedListOfLists(input, maxLen):
    # Force object to list
    input = tolist(input)
    # If not list of lists, make list of lists
    if not isinstance(input[0], list):
        input = [input]
    # Pad with the last sublist until the list reaches maxLen
    lenCheck = len(input)
    if lenCheck < maxLen:
        for i in range(0, maxLen - lenCheck):
            input.append(input[-1])
    return input

# Get family documents only
primary = tolist(IN[0])

# Length of family documents
maxLength = len(primary)

# Establish the core list
outputs = [primary]

# For each other input...
for i in (IN[1],IN[2],IN[3],IN[4],IN[5]):
	# If provided...
	if i:
		# Pad and append
		output = toPaddedListOfLists(i, maxLength)
		outputs.append(output)
	# Otherwise, append None
	else:
		outputs.append(None)

# Preparing output to Dynamo
OUT = outputs
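
To make the padding behaviour concrete outside Dynamo, here is a plain-Python restatement with sample data. The snake_case names and the sample values are made up for illustration, and this version differs slightly from the node code in that it copies the incoming list and tolerates an empty one.

```python
def to_list(value):
    # Wrap a single item in a list so it can be iterated
    return value if isinstance(value, list) else [value]

def to_padded_list_of_lists(value, max_len):
    # Force the input to a list, copying so the caller's list is untouched
    rows = list(to_list(value))
    # A flat (or empty) list becomes a single branch
    if not rows or not isinstance(rows[0], list):
        rows = [rows]
    # Repeat the last branch until there is one branch per primary item
    while len(rows) < max_len:
        rows.append(rows[-1])
    return rows

docs = ["docA", "docB", "docC"]
single = to_padded_list_of_lists("Width", len(docs))
flat = to_padded_list_of_lists(["Width", "Height"], len(docs))
nested = to_padded_list_of_lists([["Width"], ["Height"]], len(docs))
```

Each secondary input ends up with one branch per primary item, so the branches can be paired up 1 for 1 inside the custom node.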

Thanks for the encouragement also @pyXam. My thinking is if I do change it, it will need to be obvious to the end users used to working with Crumple and other Python packages.

Actually… the family document processing is a good example of an exception to the rule.

Say you have a node which opens a document at a path, adds a parameter to it given a shared parameter name, group type, etc… In that case you wouldn’t want to use lacing to process each of the parameter data sets, but you would on the file path. By treating the document individually and all of the other inputs as a list you can ensure the document is opened (or switched to, though passing a document will be less efficient than opening once and executing in bulk) only once and then the list of parameter names/group type/etc. applied to each in the one open…
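
As a rough illustration only, the pattern looks something like the sketch below. `FakeFamilyDoc` and `add_parameters` are made-up stand-ins, not Revit API calls; the point is just that each document is opened once and its whole list of parameter names is applied in that one pass.

```python
class FakeFamilyDoc:
    # Hypothetical stand-in for a family document; counts how often one is opened
    open_count = 0

    def __init__(self, path):
        self.path = path
        self.params = []
        FakeFamilyDoc.open_count += 1

    def add_parameter(self, name):
        self.params.append(name)

def add_parameters(paths, names_per_doc):
    # One pass per document: open once, then apply that document's whole list
    docs = []
    for path, names in zip(paths, names_per_doc):
        doc = FakeFamilyDoc(path)
        for name in names:
            doc.add_parameter(name)
        docs.append(doc)
    return docs

docs = add_parameters(["a.rfa", "b.rfa"], [["Width", "Height"], ["Depth"]])
```

Lacing over the parameter names instead would mean one open (or switch) per name rather than one per document.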

The ‘external document’ nodes typically want to work this way, as do a few other ‘groups’ of nodes such as external app interop (Excel, Word, PowerPoint, etc.), but the ‘main document’ nodes typically don’t unless the host application’s API expects a list of objects (Revit’s sketch based objects; Civil 3D’s block creation; etc.).

I wish there was ‘one simple rule’ for this. List enforcement strategies are a good set of tools to have, and loading them from your own .py can help keep code simple. You’re missing one, though (returnListOrSingleItem).

Also if you’re going through this type of review/overhaul it might be a good time to switch everything you can to CPython3/IronPython3 or even zero touch. Those are big lifts, or tiny ones if approached systematically, but now is a good time to start.

As an example I am not sure how well ‘isinstance’ works in some cases using CPython, and it’s less useful overall as it only does one thing. Personally I have been using something like IN[0].__class__ == class instead, which also helps for input validation: if that doesn’t align you can use sys.exit(msg) to return a customized error message and stop processing early in the code.
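
A small sketch of the difference, in plain Python outside Dynamo. `validate_input` is a hypothetical helper; in a node the returned message would be what gets handed to sys.exit(msg) as described above.

```python
def validate_input(value, expected):
    # Exact class comparison: subclasses of `expected` do not pass
    if value.__class__ == expected:
        return None
    # Build the message a node could pass on to sys.exit(msg)
    return "Expected {0}, got {1}".format(
        expected.__name__, value.__class__.__name__
    )

# bool is a subclass of int, so the two checks disagree here
loose = isinstance(True, int)
strict = True.__class__ == int

ok = validate_input([1, 2], list)
msg = validate_input("abc", list)
```

The exact comparison is stricter than isinstance, which matters if a subclass of the expected type should be accepted.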

Yes, currently I still take the risky/rather unorthodox familydoc open > do things > passthrough > close docs route to allow multiple steps in any order. My list forcing means the structure does away with potential lacing patterns, under the assumption it’s always one/many docs on equivalent list counts for types/params etc. All my nodes will generally have null protection to avoid error outs, to the point where the familydocs always make it through to close at least.

At this point I’ve more or less decided that Crumple will not switch to CP3 due to some of its associated quirks. It will likely remain IP2.7 until I learn more about ZT and can pivot enough to go ZT/CP3 hybrid. Think how Rhythm/Archilab slowly switched. I appreciate at some point IP2.7 will not be supported at all even if just due to .NET 8, so timing may be awkward.

When you’re ready to start learning let me know. Happy to set up an ‘intro’ session for you. :slight_smile:

Just to provide the outcome I went with for anyone else, my resolution was as follows:

  • For a core/primary input, check if it was initially an object
  • From there, turn it to a list if needed for iteration
  • On processing the output, return the output(s) to objects if the initial input was an object

It works quite well I think, as you can still send in a list of one object to preserve list structure.

You can find the outcome in the latest build of Crumple, but a sample is below:

# Made by Gavin Crump
# Free for use
# BIM Guru, www.bimguru.com.au

# Boilerplate text
import clr

# Define list/unwrap list functions
def uwlist(input):
    result = input if isinstance(input, list) else [input]
    return UnwrapElement(result)

def objOrList(input, initial = IN[0]):
	if isinstance(initial, list):
		return input
	else:
		return input[0]

# Preparing input from dynamo to revit
sheet_list = uwlist(IN[0])
sheet_revs = []

# Get revisions per sheet
for s in sheet_list:
	ids = s.GetAllRevisionIds()
	revs = [s.Document.GetElement(r) for r in ids]
	sheet_revs.append(revs)

# Preparing output to Dynamo
OUT = objOrList(sheet_revs)

But what if I feed it a list of lists of lists of objects and lists? :joy:

Seriously though - good job. Was this configured for CPython, IronPython, or both?

Currently I’ve had to ignore CP3 really. There are a few too many ‘CP3 doesn’t work that way’ scenarios for me to stomach a full overhaul (interfaces, some enums and calls that are handled differently), and I know a lot of my package users are still pre 2022 as well, so a full switch from IP2.7 is probably not on the cards for 1 more year at this point. I also live on in hope of integration of IP3 in future for an easier switch… At the end of the day my package is intended as educational, so if each node becomes a hydra of exceptions for CP3 quirks I worry it will become illegible for new users getting their head around Python generally. If I ever get time to learn ZT properly I’d likely switch lanes to that, but I’m unsure if my future time will allow it versus me focusing on C# and app dev more generally (dynamo > pyrevit > appdev).

Funnily enough a couple of the structure dependent nodes have internal nodes to force some inputs to mirror parallel structures. This is more or less how I built my family nodes: the input trees adjacent to the familyDoc inputs get grafted into a list of lists if they’re a single list or object, to ensure a 1 for 1 structure between them. I’m fairly sure it will work with levels as well for the most part if people need to work that way.

I appreciate the educational aspect more than most. But at some point we need to get past IP2 as it isn’t secure - at this point teaching straight CP or IP3 would be better as it shows stuff in a secure environment. After watching the XZUtils train wreck over the last week I might be a bit more ‘on the edge’ than I should be around IP2 though…

I get that, but it’s just not something I have time for as CP3 feels an unnatural transition to me given the ironpython flavour legacy of Dynamo (it’s more a natural transition to C# concepts I think?).

Most educational resources out there conflict with CP3 quirks in a lot of areas so I have that as a hangover of my package also. I’m more likely to switch to IP3 dependency than anything else at this point once 2021 drops from what I’m willing to claim as supporting in my nodes.

At this point part of me has to selfishly consider the degree of ‘I tried X and it didn’t work’ queries I get - I’m already inundated with them to date as is unfortunately. Pivoting to CP3 would likely give me more; I see some fairly curly workarounds on the forum to handle CP3 that even for me are a bit abstract.

To me using IP2.7 is a true security risk if the user is not careful and runs arbitrarily acquired code, and yes in time Autodesk probably won’t even be able to provide that degree of inroading - I appreciate they could have just said sorry no IP2.7, time to modernize. By that point I will have switched to one of CP3, IP3 or ZT I suspect - or sunsetted Crumple as a legacy project at the worst. Given the unmanaged package chaff present on the manager I don’t expect having a legacy package living behind will be of concern, if not just left on my git.

I plan to begin migrating some of the Crumple package to CP3 in the interim once I’ve left behind 2022 (once I get to 2023 as my lowest supported build - this is based on my own firm I work at and their own timing). I’ll begin with the lowest hanging fruit nodes that I know won’t be problematic.

I’ve got some life admin coming up so whatever I choose it will have to be low maintenance most likely for me. House hunting as of 2 weeks from now, wedding in October and probably aiming towards a crumpet in 2025/6 if the stars align… I’ll be sure to tap into whatever help I feel I need along the way though if I hit a road block in the avenues.

Actually the risk is just having it there in a way where it can be run. The concerning exploit won’t even come from someone running a malicious script in the engine (that is a concern with any engine - IP2, IP3 or CP), but from a small tweak in something deeper which causes even perfectly harmless code to be harmful. And since the IP2 engine isn’t supported there is no ‘undo’, there is no ‘patch’, there is only ‘we are all offline with no access to any project data or email for at least a week, and any files we have sent out in the last month could also be compromised…’. And the script which was run in the editor could just be OUT = IN[0]

Now I am making a boogeyman out of a shadow here, but it is a real concern which we shouldn’t take lightly as AEC is more likely than most industries to have to deal with such issues (mostly because we don’t care about security beyond talking a tough game).

I appreciate IP2.7 can lead to self authoring code etc. so it’s a risk in that aspect in that anything can potentially be a problem, subject to deeper vulnerabilities that let it in. I’ve had a difficult discussion with IT about it already and I guess providing a package which sets precedent for others that probably won’t have such a discussion is another ballgame altogether.

Food for thought for sure. Thinking realistically I’m very likely going to just switch the majority to CP3 in a near future build and keep what is needed down in IP2.7 until I find a clearer direction that suits my time.

In my testing (not your code but others) most upgrade fine. Family loading and UI stuff is most impacted. :slight_smile:
