# Bimorph - Curve.RemoveDuplicates Optimized?

@Thomas_Mahon – Should we be seeing the same sort of speed increase from the Curve.RemoveDuplicates node in Bimorph?

For my purposes, I solved the issue a different way.

Could you share how you solved it? It will help others with similar issues.

Of course!

I have 2 lists of curves and I wanted the curves that were common in both. I tried a version of SetIntersection previously without success so I thought merging the lists and dropping the duplicates would work as well. It was pretty slow in comparison to the other aspects of the graph so I rolled my own by converting the lines into strings and comparing the strings in a Python script I cobbled together.

It’s a bit of a hack but it’s fairly fast and accurate enough for my needs.
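The script itself isn't posted in the thread, so the following is only a minimal sketch of the string-comparison idea, not the actual code. It assumes each line has already been reduced to a pair of `(x, y, z)` endpoint tuples; `line_key`, `common_lines`, and the `precision` parameter are hypothetical names introduced for illustration.

```python
def line_key(start, end, precision=6):
    """Build a direction-independent string key from two endpoints.

    Coordinates are rounded so tiny floating-point differences do not
    produce different strings; adding 0.0 turns -0.0 into 0.0 and
    ints into floats so equal values always print identically.
    """
    a = tuple(round(c, precision) + 0.0 for c in start)
    b = tuple(round(c, precision) + 0.0 for c in end)
    lo, hi = sorted((a, b))  # a reversed line gets the same key
    return "{}|{}".format(lo, hi)


def common_lines(list_a, list_b, precision=6):
    """Return the lines of list_a whose key also occurs in list_b."""
    keys_b = {line_key(s, e, precision) for s, e in list_b}
    return [(s, e) for s, e in list_a if line_key(s, e, precision) in keys_b]
```

Because the keys go into a set, each lookup is a hash test rather than a geometric comparison, which is likely why this kind of approach stays fast on large lists. The trade-off is that rounding is a blunt tolerance: two endpoints that fall on either side of a rounding boundary will get different keys even if they are very close.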


Hi Greg

Glad to hear you found an efficient workflow (efficiency is king!).

I’ll answer your question in case anyone else has the same question:

It is optimised, but it’s not going to be anywhere near as fast as the new nodes for the following reasons:

1. There is no purpose-built method in the API to perform the process, which means:
2. I’ve written a custom algorithm to establish if curves are duplicates
3. The curve comparison process is optimised (I don’t perform brute-force checking as it’s unnecessary - it uses the same logic as Curve.IntersectAll to prevent redundant tests)

As a result it won’t run as fast as the new intersection node. In addition, checking for duplicates is a complex problem, irrespective of which method/algorithm is used to solve the problem. I did a presentation on this node at one of the Dynamo London user groups which explains the inner workings of the geometry checks the algorithm performs to identify duplicates:

In respect to lines then, they could be better handled, since all three tests are required to handle all curve types (B-splines/NURBS, arcs, ellipses, etc.), whereas only the first test is needed for lines, which would explain why it’s possible to yield faster results. It boils down to code consistency: have one method to rule them all (and accept some profligacy in return for code brevity), or create exceptions for certain object types (i.e. lines) and create rules to handle the exceptions. For speed of development, I opted for the former, but it’s a good question and I’ll review the node’s functionality.
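The three tests are not spelled out in this thread, so the sketch below is not the BimorphNodes algorithm. It only illustrates the "exceptions for certain object types" option: lines take a cheap path, and every other curve type falls through to a general test supplied by the caller. It assumes, purely for illustration, that a line counts as a duplicate when both endpoints coincide within a tolerance in either order; the dictionary layout and all function names are hypothetical.

```python
def points_coincide(p, q, tol=1e-6):
    """True if two (x, y, z) points are within tol of each other."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) <= tol * tol


def lines_are_duplicates(ends_a, ends_b, tol=1e-6):
    """Compare two lines by their endpoints, ignoring direction."""
    (a0, a1), (b0, b1) = ends_a, ends_b
    same = points_coincide(a0, b0, tol) and points_coincide(a1, b1, tol)
    flipped = points_coincide(a0, b1, tol) and points_coincide(a1, b0, tol)
    return same or flipped


def are_duplicates(curve_a, curve_b, general_test, tol=1e-6):
    """Lines take the cheap path; anything else uses general_test."""
    if curve_a["kind"] == "line" and curve_b["kind"] == "line":
        return lines_are_duplicates(curve_a["ends"], curve_b["ends"], tol)
    return general_test(curve_a, curve_b)
```

The cost of this design is exactly the one described above: there are now two code paths to keep consistent instead of one.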


Actually, the solution I came up with isn’t even comparing the end points… it’s comparing the string version of the line output! Super non-technical, I know, but at the time it was all I could come up with and I’m pretty sure for my situation it’s sufficient.

For smaller data sets I found the node you produced to work great. It was when I tried to compare several thousand lines at a time that things slowed to a crawl.

Out of curiosity, are you first checking the intersection of the curves’ bounding boxes and then only comparing those curves whose boxes intersect?

The speed will be dependent on a number of factors, such as the number of overlaps and the type of curves being processed. However, it’s an intensive process as there are potentially billions of exceptions. So the rules that have been implemented are the bare minimum to capture 99.9% of duplicates. It’s designed for versatility, which always restricts efficiency.

It doesn’t use bounding boxes to filter surrounding elements like the new geometry intersection nodes. I looked at it but never got round to testing whether it would speed the process up, considering it would rely on BoundingBox.ByGeometry and BoundingBox.Intersects from Dynamo’s ProtoGeometry library (there are no methods in the Revit API to get a bounding box from a Curve, nor to test for containment). In theory, it should be quicker, but without testing I can’t say for sure, and so I didn’t implement it.
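For anyone curious what such a pre-filter could look like, here is a minimal sketch in plain Python. It is not part of BimorphNodes and, as noted above, whether it pays off in practice is untested. It assumes each curve is represented by a list of sample points (for a line, its two endpoints) and builds axis-aligned boxes from tuples rather than calling `BoundingBox.ByGeometry`; `bounding_box`, `boxes_overlap`, `candidate_pairs`, and the `pad` parameter are hypothetical names.

```python
def bounding_box(points, pad=1e-6):
    """Axis-aligned box around the points, padded by a small tolerance."""
    xs, ys, zs = zip(*points)
    return ((min(xs) - pad, min(ys) - pad, min(zs) - pad),
            (max(xs) + pad, max(ys) + pad, max(zs) + pad))


def boxes_overlap(box_a, box_b):
    """True if the two boxes overlap on all three axes."""
    (amin, amax), (bmin, bmax) = box_a, box_b
    return all(amin[i] <= bmax[i] and bmin[i] <= amax[i] for i in range(3))


def candidate_pairs(curves):
    """Yield index pairs (i, j), i < j, whose bounding boxes overlap.

    Boxes are sorted by minimum x so the inner loop can stop as soon
    as the next box starts beyond the current box's maximum x.
    """
    boxes = [bounding_box(pts) for pts in curves]
    order = sorted(range(len(curves)), key=lambda i: boxes[i][0][0])
    for pos, i in enumerate(order):
        for j in order[pos + 1:]:
            if boxes[j][0][0] > boxes[i][1][0]:
                break  # sorted by min x, so no later box can overlap i
            if boxes_overlap(boxes[i], boxes[j]):
                yield (min(i, j), max(i, j))
```

Only the pairs this yields would then go through the full duplicate test. Duplicate curves necessarily have overlapping boxes, so the filter should not discard true duplicates, but it helps little when many curves are clustered in the same region.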

If you can give me some use cases where the performance is impacting your workflow, I can revisit this node and look at improving performance.

I’ll export the endpoints of the largest dataset and send it your way.


@Greg_McDowell Curve.RemoveDuplicates has been refactored in BimorphNodes v2.2 resulting in an 80% performance increase. It will be released today.


Sounds great! Can’t wait to get back into that definition and see how it works.

I guess I never did send you that dataset from back in October. Sorry about that.