Hi,
I have a list of points. My goal is to group nearby points by location.
Please find attached a diagram:
Thank you.
Ah you edited it
If you know your maximum allowable distance to form a group, you could voxelize the points and find the "gaps" by taking the void voxels. From there you have a set of cells which you can query or group by.
This is relevant, but not a direct solution as K Means requires you know how many groups you have to start with.
Hi,
There are always different ways of doing it.
Here are some examples:
Gaussian Mixture:
K-Means:
Spectral Clustering:
If you have an idea of the minimum distance between them, you can group them that way as well.
I may be misunderstanding here, but when I compare your image to the original image I see very different results. In the original image, the red box only has points which are within N units of each other. The points in the blue and cyan boxes are >N units away from all points in the red box, and so those are in a separate group.
Applying that to the datasets showing the colored points, we'd always have one color, because if any point on the grid is within N units of another, then all points in the grid are within N units of each other.
It may be that the unexposed Python in the animation is doing this, but I can't tell offhand.
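To make the "within N units" reading concrete, here is a minimal sketch of distance-threshold grouping, where any two points closer than N end up in the same group, directly or through a chain of neighbours. The function name `group_within` and the parameter `max_dist` are made up for illustration; this is not the code behind the animation.

```python
from math import dist


def group_within(points, max_dist):
    """Group point indices so that points linked by hops <= max_dist share a group."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every pair of points that are close enough.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= max_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# A regular 3x3 grid with spacing 1 collapses into a single group when N = 1.
grid = [(x, y) for x in range(3) for y in range(3)]
print(group_within(grid, 1.0))
```

The pairwise loop is quadratic, so for large point sets a spatial index would likely be needed; the sketch only shows the grouping rule.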
I think there is no harm in saying no to this, because the K-Means algorithm first needs to know how many groups the user wants. Let's call it "n". Then it selects "n" centre points across the whole data grid and groups the points by their distances to those centres. If you want n groups, you will see n different colour groups.
The algorithm shown in the GIFs is very different from the one in the image files. Here the user chooses the points that act as centres. Let's call this "index". Then the user gives an upper limit. Let's call this "Distance". The algorithm regroups the entire data set according to the centre points the user has selected by index and the distance value. In the example I have tried to show the state of each centre point in the data set. We can also observe how the whole data set is grouped by specifying one or more centre points.
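As a rough illustration of why the number of groups has to be given up front, here is a bare-bones K-Means sketch in plain Python. Seeding the centres with the first k points is an assumption made only to keep the example deterministic; library implementations such as scikit-learn's use smarter initialisation.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Minimal K-Means: returns one label per point plus the final centres."""
    # Assumption: the first k points seed the centres (keeps the sketch deterministic).
    centres = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point takes the label of its nearest centre.
        labels = [min(range(k), key=lambda c: dist(p, centres[c])) for p in points]
        # Update step: move each centre to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centres


labels, centres = k_means([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
print(labels, centres)
```

Whatever the spacing of the points, the result always has exactly k labels, which is why it does not answer a "group by maximum distance" question on its own.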
I know, it sounds very complicated, but it is very simple and effective.
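One reading of the description above, sketched in plain Python: for every chosen centre index, collect the points that lie within the given distance of that centre. The names `group_by_centres`, `centre_indices` and `distance` are hypothetical and only mirror the "index" and "Distance" inputs mentioned in the post.

```python
from math import dist


def group_by_centres(points, centre_indices, distance):
    """For each chosen centre, list the indices of the points within `distance` of it."""
    return {
        c: [i for i, p in enumerate(points) if dist(points[c], p) <= distance]
        for c in centre_indices
    }


points = [(0, 0), (1, 0), (3, 0), (10, 0)]
print(group_by_centres(points, centre_indices=[0, 3], distance=2))
```

Unlike the chained grouping discussed earlier in the thread, a point can belong to several centres here, or to none.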
Might be what they are after, but from what I can see it doesn't align. Might help if you shared the Python script which is doing the bulk of the work, or the dyn file.
"Thanks for your response~
I need the maximum allowable distance to form a group.
K Means is an interesting solution if the k value can be determined automatically from the distance."
Hi Durmus_Cesur
Could you share the Python script which is doing the bulk of the work, or the dyn file?
Thanks a lot ~
Example of a Voxelized Rabbit for reference.
Unit values need to be converted for the calculation to work well, though: voxel coordinates (which want to be ints, not floats) need to be scaled up if the point values are smaller than one whole unit.
Then grouping populated voxels by their proximity can sort the identified voxels into voxel groups.
You just need to find the sweet spot for voxel size for your data set.
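A minimal sketch of the voxel idea in plain Python, assuming 3D points as tuples: each point is binned into an integer voxel key, and occupied voxels that touch (including diagonally) are flood-filled into one group. `voxel_groups` and `voxel_size` are illustrative names, not part of any Dynamo node.

```python
from collections import deque
from itertools import product


def voxel_groups(points, voxel_size):
    """Group 3D point indices whose occupied voxels touch (26-neighbourhood)."""
    # Bin each point into an integer voxel key. Floor division by the voxel
    # size does the float-to-int scaling, also for coordinates below 1.0.
    cells = {}
    for i, (x, y, z) in enumerate(points):
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        cells.setdefault(key, []).append(i)

    # Offsets to the 26 surrounding voxels (every combination except the cell itself).
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]

    groups, seen = [], set()
    for start in cells:
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        # Breadth-first flood fill over occupied neighbouring voxels.
        while queue:
            cx, cy, cz = queue.popleft()
            members.extend(cells[(cx, cy, cz)])
            for dx, dy, dz in offsets:
                nb = (cx + dx, cy + dy, cz + dz)
                if nb in cells and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        groups.append(sorted(members))
    return groups


pts = [(0, 0, 0), (0.4, 0.1, 0), (5, 5, 0), (5.3, 5.2, 0)]
print(voxel_groups(pts, voxel_size=0.5))
```

The voxel size only approximates the maximum distance: two points in neighbouring voxels can be almost two voxel widths apart per axis and still be merged, so the size usually needs some tuning against the data.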
Hi,
I'm sorry it's been so long. I've been going through a bad time, so I haven't had time to look here.
You can find the code and script here.
Home.dyn (54.9 KB)
For each data point passed into "X_test", the script compares it against all of the "X_train" data and returns the indexes of the matches, followed by their distances. It "weights" them according to their distance.
Packages you need: NumPy, Shapely, scikit-learn (sklearn), pandas, TensorFlow
# Copyright(c) 2023, Durmus Bayryam
# GitHub : https://github.com/DurmusCesur/Shapely.git
import clr
import sys
import re
import System
clr.AddReference('Python.Included')
import Python.Included as pyInc
path_py3_lib = pyInc.Installer.EmbeddedPythonHome
sys.path.append(path_py3_lib + r'\Lib\site-packages')
import tensorflow as tf
from tensorflow.python.client import device_lib
from sklearn.neighbors import RadiusNeighborsRegressor
import numpy as np
import pandas as pd
from io import StringIO
# Input
column = IN[1]
data = pd.read_csv(IN[0], usecols=column)
X_train = data.drop(columns=["label"])
y_train = data["label"]
X_test = IN[2]
radius = IN[3]
algorithm = IN[4]
sys.stdout = StringIO()
def calculate_weights(distances, scale_factor=1.0):
    weights = 10000 / (distances + scale_factor)
    return weights
# Training
model = RadiusNeighborsRegressor(radius=radius, algorithm=algorithm)
model.fit(X_train, y_train)
# Prediction
y_test_pred, weight = [], []
for test_example in X_test:
    # Indices of the training points within `radius` of this test point
    neighbors_indices = model.radius_neighbors([test_example], return_distance=False)[0]
    y_test_pred.append(neighbors_indices)
    # Distances: with return_distance=True the call returns (distances, indices), so [0] selects the distances
    neighbors_distance = model.radius_neighbors([test_example], return_distance=True)[0]
    weights = calculate_weights(neighbors_distance)
    weight.append(weights)
# Result
sys.stdout.seek(0)
readerstdout = sys.stdout.read()
OUT = y_test_pred,weight