ML Nodes from Lunchbox

Hello all!

I am trying to create a graph that will analyze Room Names and then guess a Space Type, based on the similarity between the Room Name and the Space Type Name (e.g. Room: “-10 F FREEZER ROOM” Space Type: “Refrigerated Spaces”).
I cannot find any description of how the ML nodes from Lunchbox work, or whether I can use them with strings rather than numbers. Can anyone tell me where I could find this info? Or if anyone knows how I can do this and wouldn’t mind giving me a brief explanation, I’d really appreciate it :smiley::+1:

:man_facepalming:Forgot to mention. I understand that I could just use many nested IF statements or some similar type of rule set (and will if necessary), but my company has about 50+ Space Types and 1100+ unique Room Names so this would be a bear. I also wanted the graph to get better at its guesses and learn from new room names that it has never seen before. Many Thanks!!

Hi Jackson, the linked resource can help you learn how the package works. It covers Grasshopper, but the underlying principles and software library are the same.

Machine learning models don’t take letters as input. They need numbers, because they are nothing more than mathematical models built on algebra. One way around this is to collect every distinct character in the room names and assign a number to each of them.
But I am unsure whether the Lunchbox nodes accept sequences, I need to check!
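To make the character idea concrete, here is a minimal Python sketch. The function names and the fixed sequence length are my own assumptions for illustration, and nothing here says the Lunchbox nodes accept this format. It only shows how room names could be turned into equal-length lists of numbers.

```python
def build_char_index(names):
    """Assign an integer to every distinct character found in the names.

    0 is reserved for padding (hypothetical convention, not a Lunchbox rule).
    """
    chars = sorted({ch for name in names for ch in name})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(name, char_index, length):
    """Turn a name into a fixed-length list of integers.

    Characters missing from the index also map to 0, and short names are
    padded with 0 so every row has the same number of columns.
    """
    codes = [char_index.get(ch, 0) for ch in name[:length]]
    return codes + [0] * (length - len(codes))


room_names = ["LAB", "LOBBY"]
index = build_char_index(room_names)
rows = [encode(name, index, 5) for name in room_names]
```

Keep in mind that the numbers are arbitrary labels: two characters with close codes are not "similar" in any meaningful way, so a model may struggle to learn from this encoding alone.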

Edit: Could you upload a graph showing what you tried?


I posted something similar yesterday…

Hope that’s of interest,



Thank you very much for the link! I am going to go check that out! The graph I had tried so far fed the string values of the rooms into the ML nodes. I can still upload it if you think it would be beneficial.


I had looked at your post as well! I could not figure out where you were getting the values that you hard coded into the ML node, such as “Alpha” or “Seed”. So I was trying to figure that part out. The rest was very helpful though :+1:


Seed is mentioned in the link, but for the rest I was just guessing and playing around, that’s probably why it didn’t work :smiley:

If you find out anything let us know…! :smiley:

Machine learning models don’t take inputs as letters. They need numbers.

The method shown in the video gives a number per input as a ‘key’ but I like the idea of automating it!


It’s awesome that you are taking a look at the ML nodes as well! I did what @jonathanatger and @Mark.Ackerley touched on and dissected the video at the link a bit. I also looked at the source code on their Bitbucket to get a grasp of the inputs.

One thing I noticed in your post,

This is a great use case, but where does this data live? The module can’t really just “keep learning”. You need to have historical data stored somewhere, like a CSV. When it encounters a room name that it has not seen before, it makes a best guess based on the historical data and moves on. It won’t store that guess by default unless you tell it to, after you agree that its best guess was good enough. In my example from a few days ago I trained my ML module on a specific selection of rooms in a model. That is the data it knew, and all it would ever know until I retrain it.

In short: The ML Nodes are only as smart as the data you train them with.
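As a rough illustration of what “storing it somewhere” could look like, here is a small Python sketch that keeps the history in a two-column CSV and only appends a new pair once someone has confirmed it. The file layout and the function names are assumptions of mine, not anything the Lunchbox nodes provide.

```python
import csv
from pathlib import Path


def load_history(path):
    """Read {room_name: space_type} from a two-column CSV.

    Returns an empty dict when the file does not exist yet.
    """
    path = Path(path)
    if not path.exists():
        return {}
    with path.open(newline="", encoding="utf-8") as f:
        return {row[0]: row[1] for row in csv.reader(f) if len(row) >= 2}


def record_confirmed(path, room_name, space_type):
    """Append one user-confirmed pair so the next training run can use it."""
    with Path(path).open("a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow([room_name, space_type])
```

The model would still need to be retrained from the updated file. Appending a row only grows the training data for next time.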

If you were to upload some sample files I am sure you would get some more help.

Another great resource is the repository for the TheSaurus Dynamo extension. GitHub - mitevpi/thesaurus: TT Hackathon 2018 - Autocomplete for Visual Programming Nodes



Thanks for the info. Currently, the Room Names and Space Types live in an Excel file. My supervisor, who gave me the assignment, thought that this would be a good place to store all of the info as the training data. The idea was that if the graph guessed incorrectly on a space, then the user would manually fix it and add the room name and space type to the master Excel list for the graph to train from the next time around. I was planning on going through and manually assigning each room a space type, but didn’t want to waste my time if the ML nodes were not going to be able to process the data. Here is the crude attempt I made at a “proof-of-concept” graph. I have also uploaded the list of Room Names and Space Types that we plan on using :blush: Hope this helps! -Jackson
ML Space Types Attempt1.dyn (9.7 KB) Room Names to Space Types.xlsx (38.2 KB)


Hi @john_pierson, could this be one option for the data storage?


Extensible storage is an interesting topic and I’ve seen it used quite a bit.

I do think for ML you need the data to be available beyond the active Revit model though and extensible storage lives on the one model you write it to.

If it were me, I would look at a database of some sort for historical data based on my (or my clients’) portfolio of projects.

That being said, I was more referencing the need for training data of some sort no matter where it is stored.


Thanks @john_pierson, I totally agree :slightly_smiling_face:


Interesting idea. I have been trying to do this too, but assigning rooms to a Uniclass classification, as per the UK BIM Level 2 requirements.

I have some historical room name data. I created a list of keywords and matching classifications, and attempted a very simplistic “bag of words” to help evaluate matches between keywords and room names. So far it works pretty well, but it can’t “guess” any classifications that I don’t have in my Excel file, so I was hoping to get into the ML side.
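For anyone following along, a keyword match of this kind can be sketched in a few lines of Python. This is only my guess at the general shape, not the actual script. The keyword table is made up for illustration (only “Refrigerated Spaces” comes from this thread), and the function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify(room_name, keyword_to_class):
    """Vote for the class of every keyword found in the room name.

    Returns None when no keyword matches, which is the limitation described
    above: nothing outside the keyword list can be guessed.
    """
    votes = Counter(
        keyword_to_class[token]
        for token in tokenize(room_name)
        if token in keyword_to_class
    )
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical keyword list; a real one would come from the Excel file.
keywords = {
    "freezer": "Refrigerated Spaces",
    "cooler": "Refrigerated Spaces",
    "office": "Office",
}
guess = classify("-10 F FREEZER ROOM", keywords)
```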

Maybe we can join forces?

@danielU3R39 I ran into a similar problem while attempting a work-around for the ML nodes.

I can’t say that I am at all familiar with or knowledgeable about the UK classification system. But could you add to your list of historical data some generic room names that would fall into your unused classifications? That way your “bag of words” could guess at those previously unused classes. I am attempting to do this currently with my project. I am also using a package called FuzzyDyno, which uses fuzzy logic to compare the similarity of strings and find the best matches. Again, I could be totally wrong and this might not help at all. But this might be simpler than trying to use ML.

The UK classification system is “Uniclass” as opposed to Omniclass in the USA. But Uniclass is more regularly updated, and hopefully will be universally adopted.

I have been thinking about using fuzzy logic, because I know spelling mistakes are common, and it should handle those.

@Jackson_Stellhorn any additional help with commonly used room names would be great! I’m still hoping other engineers or architects could help as well. Most ML algorithms need 1000s of examples to work well.

Private message me, and maybe we can share scripts and workflows?


Quite frankly, I’m interested in participating in these workflows as well. Unfortunately, I’m in France, so the rooms I have access to have French names. However, I may be able to contribute, in that I’m developing tools to extract data from Revit models, specifically with machine learning in mind:


@jonathanatger, I checked out your blog, looks good, I’ll check the Morpheus package. This would be very helpful in getting commonly used room names from lots of models, but still needs manual classification once you have room names.

@jonathanatger, @Jackson_Stellhorn, @danielU3R39, @Mark.Ackerley, @john_pierson, @jean
Have any of you used FuzzyDyno to accomplish the task outlined in this thread (or something similar)?

If yes, was anyone successful with it? And would you be willing to share the fuzzy list created for the task?

I use fuzzy logic (that I learned from FuzzyDyno) for my element filters in Rhythm. I wanted to be safe for filter selection for when someone spells “Equals” as “equals” or even something like “eqalz”.
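To show the general idea, here is a small Python sketch using the standard library’s `difflib`. To be clear, this is not how FuzzyDyno or Rhythm implement it. It is just a stand-in technique, and the list of filter names and the 0.6 cutoff are assumptions for the example.

```python
import difflib

# Hypothetical set of filter names; the real list lives in the package.
FILTERS = ["Equals", "Contains", "StartsWith", "EndsWith"]


def normalize_filter(user_text, cutoff=0.6):
    """Map a loosely spelled filter name to the closest known one.

    Comparison is case-insensitive. Returns None when nothing is similar
    enough, so the caller can decide how to handle bad input.
    """
    by_lower = {name.lower(): name for name in FILTERS}
    matches = difflib.get_close_matches(
        user_text.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


choice = normalize_filter("eqalz")
```

The same approach could be pointed at room names instead of filter names, with the cutoff tuned so that misspellings match but unrelated names do not.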