Python loop slow to find closest points to lines

newshunhk · August 17, 2022, 1:45am

Hi,

I have big lists and feed in a python script (indexbydistance) but takes extremely long time to run. (Since it will run 37919 x 322 = 12,209,918 times)
Is it possible to simplify the loop, so that it can run faster?

Ewan_Opie · August 17, 2022, 10:36pm

Give this a try.

38000 Pts , 322 Lines = less than 20 seconds, using math formulas to find distances.

# Load the Python Standard and DesignScript Libraries
import sys
import clr
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *

import math 

def closept(p1, p2, p3):
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    dx, dy = x2-x1, y2-y1
    det = dx*dx + dy*dy
    a = (dy*(y3-y1)+dx*(x3-x1))/det
    
    # Allow for line length limits
    if a < 0.0:
        a = 0.0
    elif a > 1.0:
        a = 1.0    
        
    return x1+a*dx, y1+a*dy
    
def dist(p1,p2):
    x1,y1 = p1
    x2,y2 = p2
    return math.sqrt( (x2 - x1)**2 + (y2 - y1)**2 )
    
pts = IN[0]

lines = IN[1]

ptList = []
for p in pts:
    ptList.append([p.X,p.Y])

lineList = []
for l in lines:
    lineList.append([[l.StartPoint.X,l.StartPoint.Y],[l.EndPoint.X,l.EndPoint.Y]])

closePt = []
for line in lineList:
    p1 = line[0]
    p2 = line[1]
    dists = []
    for pt in ptList:
        p3 = pt
        p4 = closept(p1, p2, p3)
        d = dist(p3,p4)
        dists.append(d)
    mini = dists.index(min(dists))
    closePt.append(pts[mini])
    
    
OUT = closePt

Hope this helps

newshunhk · August 18, 2022, 1:26am

thanks! I haven’t tested it yet, but may I ask the theory behind?
It seems running the same number of loops as I do? 38000 x 322 = 12,236,000 loops
or it is faster to call a function?

for line in lineList:
    p1 = line[0]
    p2 = line[1]
    dists = []
    for pt in ptList:

Ewan_Opie · August 18, 2022, 1:39am

Yes, it is iterating the same amount.
The only way to reduce this would be to maintain some pre-sorted groupings of data points/lines prior to the comparison of distance calculation, which might be a good thing to try also.

It is faster to call the functions I have shown as they are only dealing with numerical values, not geometric objects.

Calculations are just faster, but you have to know which one to use to substitute for the representative geometry.

I am unsure of the exact code behind the x.DistanceTo(y) function, so I thought to try a purely mathematical approach.

newshunhk · August 18, 2022, 2:51am

let me test and reply you soon, thank you so much for your reply!

Thomas_Mahon · August 18, 2022, 5:11am

Use a quad tree an it will execute in seconds even with millions of points:

newshunhk · August 18, 2022, 11:10am

wow! so many things to learn! thanks! let me check.

leonard.moelders · August 19, 2022, 6:30am

You could also just use the node Geometry.DistanceTo, List.MinimumItem and List.RemoveItemAtIndex to get at first the closest, in a second run the second closest and in another run the third closest:

It may not be the fastest way, but it’s an easy way.

Vladimir · August 19, 2022, 6:35am

What is the target of getting closest points in your work?

If you use Dynamo 2.8 and upper, look into Scipy: scipy.spatial.KDTree — SciPy v1.9.0 Manual

c.poupin · August 19, 2022, 8:55pm

+1 for k-dimensional tree

import sys
import clr
import System
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *

clr.AddReference('Python.Included')
import Python.Included as pyInc
path_py3_lib = pyInc.Installer.EmbeddedPythonHome
sys.path.append(path_py3_lib + r'\Lib\site-packages')
import numpy as np
from scipy import spatial


def get_points_onCurve(curve, step = 0.2):
    out = []
    i = step
    while i <= curve.Length :
        try:
            po = curve.PointAtSegmentLength(i)
            i += step
            out.append([po.X, po.Y, po.Z])
        except:
            break
    return out
    
def get_groupIdx_by_flatIdx(idx):
    curr = 0
    for i, grp in enumerate(lstgroup_pts):
        for j , pt in enumerate(grp):
            if curr == idx:
                return i
            else:
                curr += 1

lstgroup_pts = IN[0]
lstLines = IN[1]

a_arr = np.array([[p.X, p.Y, p.Z] for group in lstgroup_pts for p in group])
ktree = spatial.KDTree(a_arr)

out = []
for l in lstLines:
    pts_arr = get_points_onCurve(l)
    distances, lstindex = ktree.query(pts_arr, k=1)
    counts = np.bincount(lstindex)
    idx = np.argmax(counts)
    # search the group index 
    gpIdx = get_groupIdx_by_flatIdx(idx)
    out.append(gpIdx)
OUT = out

Ewan_Opie · August 20, 2022, 3:06am

very slick!

newshunhk · August 20, 2022, 4:17pm

superb!!!

newshunhk · August 22, 2022, 4:26am

I just tested it, the KD tree binary search is super fast!
if using old searching method by comparing each value, it takes me more than 30mins to 1hr to finish
if use KDTree, I just need within 1 min!

Topic		Replies	Views
Closest Point to the nearest Parameter at Point, then Shortest Distance Revit	2	333	August 14, 2021
Compute Closest Points between two curves Revit python	12	270	January 15, 2024
Method Python Faster exist? Lists-Logic designscript , python	2	285	November 13, 2022
Python script needs some simplification Revit python	0	215	April 14, 2020
Help with Python code (Connect the dots with only one iteration) Developers python	0	1132	October 26, 2018

Python loop slow to find closest points to lines

Related Topics