Group Nearby Points by Location

Hi,

I have a list of points. My goal is to group nearby points by location.

Please find the attached diagram:

Thank you.

Ah you edited it :smiley:

If you know your maximum allowable distance to form a group, you could voxelize the points and find the ‘gaps’ by taking the void voxels. From there you have a set of cells which you can query or group by.
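A minimal sketch of the voxelizing step suggested above, assuming the points are plain (x, y, z) tuples and that the cell size is set to the maximum allowable distance. The function name `voxelize` and the sample points are hypothetical, not taken from any graph in this thread.

```python
import math
from collections import defaultdict

def voxelize(points, cell_size):
    """Map each voxel index (ix, iy, iz) to the indices of the points inside it."""
    cells = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        # Floor division turns a coordinate into an integer cell index.
        key = (math.floor(x / cell_size),
               math.floor(y / cell_size),
               math.floor(z / cell_size))
        cells[key].append(i)
    return dict(cells)

# Hypothetical sample data: two points close together, one far away.
points = [(0.2, 0.1, 0.0), (0.9, 0.4, 0.0), (5.3, 5.1, 0.0)]
cells = voxelize(points, cell_size=1.0)
```

Any voxel index that does not appear as a key is a void cell, so the gaps between groups are simply the missing keys.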

For this problem you can refer to the K-Means algorithm: k-means clustering - Wikipedia

This is relevant, but not a direct solution, as K-Means requires that you know how many groups you have to start with.

Hi,

There are always different ways of doing it.

Here are some examples:
Gaussian Mixture:

K-Means:

Spectral Clustering:

If you have an idea of the minimum distance between them, you can group them again based on that distance.

I may be misunderstanding here, but when I compare your image to the original image I see very different results. In the original image, the red box only contains points which are within N units of each other. The points in the blue and cyan boxes are more than N units away from all points in the red box, and so those are grouped separately.

Applying that rule to the datasets showing the colored points, we’d always get a single color: if each point on the grid is within N units of another, then all points on the grid are connected through a chain of such neighbors.

It may be that the un-exposed Python in the animation is doing this, but I can’t tell offhand.
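The chaining rule described above (two points share a group whenever a chain of neighbors, each within N units of the next, connects them) can be sketched with a small union-find. This is only an illustration under that reading of the original diagram; the function name `group_by_distance` and the sample data are hypothetical.

```python
import math

def group_by_distance(points, max_dist):
    """Group point indices so that chained neighbours within max_dist share a group."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force pair check; fine for small lists, slow for very large ones.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# A regular 3 x 3 grid with spacing 1 collapses into a single group.
grid = [(x, y) for x in range(3) for y in range(3)]
grid_groups = group_by_distance(grid, 1.0)

# Two clusters separated by more than max_dist stay apart.
clusters = [(0, 0), (1, 0), (10, 0), (11, 0)]
cluster_groups = group_by_distance(clusters, 1.5)
```

Unlike K-Means, this needs only the maximum distance, not the number of groups.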

I think it is fair to say the results are different, because the K-Means algorithm first needs to know how many groups the user wants. Let’s call it “n”. It then selects n centre points across the whole data set and groups the points according to their distances to those centres. If you ask for n groups, you will see n different colour groups.
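For readers who have not seen it, here is a minimal sketch of the K-Means loop using only the standard library. It assumes the first n points are used as the starting centres and runs a fixed number of iterations; real implementations usually seed the centres randomly. The name `k_means` and the sample points are hypothetical.

```python
import math

def k_means(points, k, iterations=20):
    """Return (labels, centres) after a fixed number of K-Means iterations."""
    # Assumption: seed the centres with the first k points.
    centres = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point takes the label of its nearest centre.
        labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                  for p in points]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = tuple(sum(axis) / len(members)
                                   for axis in zip(*members))
    return labels, centres

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centres = k_means(points, k=2)
```

The number of groups is an input here, which is why this approach does not directly answer a question phrased in terms of a maximum distance.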

The algorithm shown in the GIFs is very different from the one in the image files. Here the user chooses the points that can serve as centres. Let’s call this “index”. Then they set an upper limit. Let’s call this “Distance”. The algorithm regroups the entire data set according to the centre points selected via the index and the distance value. In the example I have tried to show the result for each centre point in the data set. We can also observe how the whole data set is grouped when one or more centre points are specified.
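One possible reading of this centre-plus-distance grouping, sketched with the standard library. It assumes each point joins its nearest selected centre if it lies within the distance limit and is otherwise left ungrouped; the actual script may behave differently. The name `group_around_centres` and the sample data are hypothetical.

```python
import math

def group_around_centres(points, centre_indices, max_distance):
    """Assign each point to its nearest chosen centre if it is close enough."""
    groups = {c: [] for c in centre_indices}
    ungrouped = []
    for i, p in enumerate(points):
        # Find the closest of the user-selected centre points.
        nearest = min(centre_indices, key=lambda c: math.dist(p, points[c]))
        if math.dist(p, points[nearest]) <= max_distance:
            groups[nearest].append(i)
        else:
            ungrouped.append(i)
    return groups, ungrouped

points = [(0, 0), (1, 0), (5, 0), (6, 0), (20, 0)]
groups, ungrouped = group_around_centres(points, centre_indices=[0, 2],
                                         max_distance=2.0)
```

With these inputs, points 0 and 1 gather around centre 0, points 2 and 3 around centre 2, and point 4 is too far from either centre.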

I know, it sounds very complicated, but it is very simple and effective.

That might be what they are after, but from what I can see it doesn’t align. It might help if you shared the Python script which is doing the bulk of the work, or the .dyn file.

Thanks for your response.
I need the maximum allowable distance to form a group.
K-Means would be an interesting solution if the k value could be determined automatically from the distance.

Hi Durmus_Cesur,
Could you share the Python script which is doing the bulk of the work, or the .dyn file?
Thanks a lot.

Example of a Voxelized Rabbit for reference. :+1:

Unit values need to be converted for the calculation to work well, though: voxel coordinates are best handled as ints rather than floats, so point values need to be multiplied by a scale factor when they are smaller than 1.0 unit.

Then grouping populated voxels by their proximity can sort the identified voxels into voxel groups.
You just need to find the sweet spot for voxel size for your data set.
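The grouping of populated voxels could be sketched as a flood fill over integer voxel keys. This assumes each key was produced by flooring coordinate / voxel size (which also takes care of point values smaller than 1.0) and that voxels count as neighbours when they touch by a face, edge, or corner. The name `group_voxels` and the sample keys are hypothetical.

```python
from itertools import product

def group_voxels(occupied):
    """Split a set of integer voxel keys into groups of touching voxels."""
    remaining = set(occupied)
    # The 26 neighbour offsets around a voxel (everything except itself).
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    groups = []
    while remaining:
        stack = [remaining.pop()]
        group = []
        while stack:
            x, y, z = stack.pop()
            group.append((x, y, z))
            for dx, dy, dz in offsets:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return sorted(groups)

occupied = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (5, 5, 0)}
voxel_groups = group_voxels(occupied)
```

Two points in touching voxels can still be further apart than one voxel size, which is part of why the voxel size needs tuning for each data set.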

Hi,

I’m sorry it’s been so long. I’ve been going through a bad time, so I haven’t had time to look here. :face_vomiting:

You can find the code and script here.
Home.dyn (54.9 KB)

For each point passed in “X_test”, the script searches all of the “X_train” data, returns the indexes of the neighbours found within the radius, and then their distances. It “weights” them according to their distance.
Packages you need: NumPy, Shapely, scikit-learn, pandas, TensorFlow

```python
# Copyright(c) 2023, Durmus Bayryam
# GitHub : https://github.com/DurmusCesur/Shapely.git
import clr
import sys
import re
import System

# Make the packages installed in Dynamo's embedded Python importable.
clr.AddReference('Python.Included')
import Python.Included as pyInc
path_py3_lib = pyInc.Installer.EmbeddedPythonHome
sys.path.append(path_py3_lib + r'\Lib\site-packages')

import tensorflow as tf
from tensorflow.python.client import device_lib
from sklearn.neighbors import RadiusNeighborsRegressor
import numpy as np
import pandas as pd
from io import StringIO

# Inputs
column = IN[1]      # columns to read from the CSV (must include "label")
data = pd.read_csv(IN[0], usecols=column)
X_train = data.drop(columns=["label"])
y_train = data["label"]
X_test = IN[2]      # query points
radius = IN[3]      # search radius around each query point
algorithm = IN[4]   # neighbour search algorithm passed to scikit-learn

# Capture anything printed while the node runs.
sys.stdout = StringIO()

def calculate_weights(distances, scale_factor=1.0):
    # Closer neighbours get larger weights; scale_factor avoids division by zero.
    return 10000 / (distances + scale_factor)

# Training
model = RadiusNeighborsRegressor(radius=radius, algorithm=algorithm)
model.fit(X_train, y_train)

# Prediction: neighbour indices and distance-based weights per query point
y_test_pred, weight = [], []

for test_example in X_test:
    # A single query returns both the distances and the indices.
    distances, indices = model.radius_neighbors([test_example])
    y_test_pred.append(indices[0].tolist())
    weight.append(calculate_weights(distances[0]).tolist())

# Result
sys.stdout.seek(0)
readerstdout = sys.stdout.read()

OUT = y_test_pred, weight
```
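To see what the weighting does, here is a small stand-alone sketch of the same formula, weight = 10000 / (distance + scale_factor), applied to a plain list instead of a NumPy array. The sample distances are made up for illustration.

```python
def calculate_weights(distances, scale_factor=1.0):
    """Return one weight per distance; smaller distances give larger weights."""
    return [10000 / (d + scale_factor) for d in distances]

# Hypothetical distances from a query point to three neighbours.
weights = calculate_weights([0.0, 1.0, 9.0])
```

A neighbour at distance 0 gets the full 10000, a neighbour at distance 1 gets half of that, and the weight keeps falling as the distance grows.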
