How to remove duplicate number in the sub list, And retain maximum duplicate number content in sub list

sjafarali75 · June 3, 2024, 10:03am

Hi Dynamo Community,

Is it possible to identify duplicate numbers within sublists and retain only the maximum duplicate number content in each sublist?

For example, in the first image, the actual list is as shown below:

And in the second image, I want the content to be as shown below:

Thank you.

Draxl_Andreas · June 3, 2024, 10:15am

@sjafarali75

Check out this topic; I think you can adapt it to your aims…

sjafarali75 · June 3, 2024, 10:31am

Hi @Draxl_Andreas,

Thank you for your response. However, the issue I’m encountering is within the sublists, not the main list. Additionally, I’m not very familiar with Python. Could you please assist me with this?

sjafarali75 · June 3, 2024, 10:40am

I tried using ChatGPT for this issue, and it provided some code. However, the code it generated produced the same result as the original list.

Nick_Boyts · June 3, 2024, 2:18pm

ChatGPT is likely going to struggle with bespoke functionality around specific use cases and would require some very specific language to describe what it is exactly you’re looking for. You can tell from looking at the code that it didn’t understand the request (it’s looking for the maximum value duplicate) and still gave bad code (it removes the item from the sublist only to add it back on the very next line).

What ChatGPT could be helpful with, and is the general exercise I’d suggest for something like this, is to break down and define all the individual parts of your condition first. See if you can:

Identify duplicate values in sublists (either by index or boolean mask).
Identify max values from each duplicate list (either by index or boolean mask).
Identify first instances of each duplicate value (either by index or boolean mask).
Cross-reference each condition for final list of duplicate values to keep and/or which to get rid of.

Python would make this much easier, but if you can work through most of the items above, we can probably help you write something with nodes that can get the trick done.
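As a rough illustration of the boolean-mask idea in the steps above, here is a minimal Python sketch. The helper names (`duplicate_mask`, `first_instance_mask`, `max_mask`) are made up for this example, and it assumes each sublist is a flat, non-empty list of numbers. It only builds the individual masks; how they are cross-referenced depends on the exact rule you want.

```python
from collections import Counter

def duplicate_mask(sub_list):
    # True where the value occurs more than once in this sublist
    counts = Counter(sub_list)
    return [counts[v] > 1 for v in sub_list]

def first_instance_mask(sub_list):
    # True at the first occurrence of each value, False at later repeats
    seen = set()
    mask = []
    for v in sub_list:
        mask.append(v not in seen)
        seen.add(v)
    return mask

def max_mask(sub_list):
    # True where the value equals the sublist maximum (assumes a non-empty sublist)
    top = max(sub_list)
    return [v == top for v in sub_list]

sample = [1, 3, 3, 2]
masks = (duplicate_mask(sample), first_instance_mask(sample), max_mask(sample))
```

In a node-based graph, the same masks could feed a filter-by-mask step once the conditions are combined.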

jacob.small · June 3, 2024, 2:41pm

Nick_Boyts · June 3, 2024, 2:42pm

Watch out, people. AI is coming for your job!

/s

Mike.Buttery · June 4, 2024, 2:09pm

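import copy

def remove_dupes(list_of_lists):
    # Values already encountered at any earlier position, across all sub-lists
    seen = set()
    # Edit a deep copy so the input list and its sub-lists stay unchanged
    working_list = copy.deepcopy(list_of_lists)
    for i, sub_list in enumerate(list_of_lists):
        # Indices of repeated values in this sub-list, collected left to right
        index = []
        for j, val in enumerate(sub_list):
            if val in seen:
                index.append(j)
            else:
                seen.add(val)
        # Remove in reverse so earlier indices are not shifted
        for idx in reversed(index):
            working_list[i].pop(idx)
    # Drop any sub-lists that ended up empty
    return list(filter(None, working_list))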

A few tips here

Python lists are mutable, so editing a sub-list in place also changes the original object; make a deepcopy to ensure the original list and sub-lists are maintained.
The pop method (which removes and returns a value) shortens the list, so removing items while moving forward would shift the indices still to be removed. We therefore check all the values L → R first and then remove them in reverse, which keeps the remaining indices valid.
The filter removes any empty lists. In Python 3 it returns a lazy iterator rather than a list, so running list on it will expand the result (Dynamo also does this on processing OUT).
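The three tips can be checked in isolation with a short sketch like the one below. The sample values are invented for illustration and are not taken from the lists in the question.

```python
import copy

# Tip 1: a shallow copy shares its sub-lists with the original; a deepcopy does not.
original = [[1, 2], [3, 4]]
shallow = list(original)
deep = copy.deepcopy(original)
shallow[0].pop(0)  # also shortens original[0]
deep[1].pop(0)     # original[1] is unaffected

# Tip 2: removing collected indices in reverse keeps the remaining indices valid.
values = [10, 20, 30, 40]
to_remove = [1, 2]  # indices gathered left to right
for idx in reversed(to_remove):
    values.pop(idx)

# Tip 3: filter(None, ...) drops empty sub-lists but returns a lazy iterator,
# so wrap it in list() to get a concrete result.
lazy = filter(None, [[1], [], [2]])
expanded = list(lazy)
```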

sjafarali75 · June 5, 2024, 3:09am

Hi @Mike.Buttery,

Thank you for your response.

The code worked, but it removed duplicate items from each sublist, making them unique sublists. Allow me to clarify what I’m looking for.

In Picture 1, you’ll see my original list.

What I want is to join the sublists if there are duplicate numbers present, as shown in Picture 2 or 3.

jacob.small · June 5, 2024, 6:30am

This is an exercise for List.GroupByKey. The list is the list as you have it now.

The hard part is creating a key that is a comparable element for each group. A String From Array node with a List.Map method might work there.

Mike.Buttery · June 5, 2024, 8:32am

Can you state your logic clearly? How does it handle the numbers indicated?

sjafarali75 · June 5, 2024, 8:43am

Hi @Mike.Buttery ,
My apologies for the confusion. I’ve now corrected the image. Could you please take a look and assist accordingly?

Mike.Buttery · June 5, 2024, 9:18am

So, as I understand it, the logic is as follows.
For a list of lists:

Processing each list from left to right
If the maximum number in a list is duplicated in other lists, append the other lists

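import copy

def group_by_max(list_of_lists):
    # Work on a copy so the caller's list and sub-lists are left untouched
    working_list = copy.deepcopy(list_of_lists)
    result = []

    while working_list:
        sub_list = working_list.pop(0)
        target = max(sub_list)  # assumes no sub-list is empty
        collector = [sub_list]
        remaining = []

        # Split the rest into matches and non-matches. Popping from
        # working_list while enumerating it would skip the item after each match.
        for next_list in working_list:
            if target in next_list:
                collector.append(next_list)
            else:
                remaining.append(next_list)
        working_list = remaining

        # Matching lists are nested together; an unmatched list is kept as-is
        if len(collector) > 1:
            result.append(collector)
        else:
            result.append(sub_list)

    return result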
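To make the rule concrete, here is a small self-contained sketch run on made-up input (the original images are not reproduced here, so `sample` is purely illustrative). `group_by_max_sketch` is a hypothetical compact restatement of the same rule, assuming every sub-list is non-empty.

```python
def group_by_max_sketch(list_of_lists):
    # Copy each sub-list so the input is not modified
    remaining = [list(s) for s in list_of_lists]
    result = []
    while remaining:
        head = remaining.pop(0)
        target = max(head)  # assumes a non-empty sub-list
        # Later sub-lists that contain the maximum of the current one
        matches = [s for s in remaining if target in s]
        remaining = [s for s in remaining if target not in s]
        # Nest the matches with the head, or keep the head on its own
        result.append([head] + matches if matches else head)
    return result

sample = [[1, 5], [5, 2], [7, 8], [3, 5]]
grouped = group_by_max_sketch(sample)
print(grouped)  # [[[1, 5], [5, 2], [3, 5]], [7, 8]]
```

Here 5 is the maximum of the first sub-list and also appears in the second and fourth, so those three are grouped, while [7, 8] stays on its own.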

sjafarali75 · June 10, 2024, 5:39am

Hi @Mike.Buttery,

Sorry for the late reply. Yes, this code provided the perfect solution.
Thanks!
