Python: extract the numbers in brackets from a list of strings, keeping a prefix and the number as the result

Hello,

I want to extract the numbers that sit between square brackets in a list of strings, and keep a prefix plus the number as the result. For example:

original list=[["cat/badboy [3421]"],["cat/goodboy [5634]"]],[["cat/badboy [3421]"],["cat/goodboy [5634]"]]
result desired=[["cat/3421"],["cat/5634"]],[["cat/3421"],["cat/5634"]]

Basically I would do this: remove everything between “/” and “[”, and also remove the “]”.

With nodes I would do it with String.Split, using “/”, “[” and “]” as separators, then drop the middle one of the 3 results and concatenate the remaining ones into a single string.
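
The node-based idea above can be sketched in Python as follows. This is only a minimal sketch: the function name `drop_middle` is hypothetical, and it assumes every string to be edited has exactly the shape `prefix/name [number]`. Anything that does not split into that shape is returned unchanged.

```python
import re

def drop_middle(text):
    # Split on any of the three separators: "/", "[" or "]"
    parts = re.split(r"[/\[\]]", text)
    # "cat/badboy [3421]" splits into ["cat", "badboy ", "3421", ""]
    # (the trailing empty string comes from the closing bracket)
    if len(parts) != 4:
        return text  # assumption: any other shape is left untouched
    prefix, number = parts[0], parts[2]
    # Drop the middle part and rejoin prefix and number
    return prefix + "/" + number.strip()

print(drop_middle("cat/badboy [3421]"))  # cat/3421
print(drop_middle("no separators"))      # no separators
```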


extractnumberfrombracketsandremainprefix.dyn (12.6 KB)

Or I would extract only the number between brackets and add the same prefix “cat/” in front of the extracted number.

I got this Python code to extract numbers from a string, but I do not know how to add the condition to take only the number between brackets:

import clr

# True if character is a number, false otherwise
def IsNum(char):
    try:
        int(char)
        return True
    except ValueError:
        return False


# Join a single list of characters into a string
def JoinChars(chars):
	if not chars:
		return None
	elif len(chars) == 1:
		return chars[0]
	else:
		return "".join(chars)


# Return a list of contiguous numbers from a string
def StringToNumbers(str):
	# Extract lists of contiguous number characters
	nums = []
	lastWasNum = False
	ls = []
	for i in range(len(str)):
		if IsNum(str[i]):
			ls.append(str[i])
			lastWasNum = True
		elif str[i] == "." and lastWasNum:
			ls.append(str[i])
		elif lastWasNum:
			nums.append(ls)
			ls = []
			lastWasNum = False
		
		if i == len(str)-1 and ls:
			nums.append(ls)

	# Cast character strings to numbers
	if not nums:
		return [None,None]
	elif len(nums) == 1:
		asStr = JoinChars(nums[0])
		return [asStr,float(asStr)]
	else:
		asStr = []
		asNum = []
		for i in range(len(nums)):
			oneStr = JoinChars(nums[i])
			asStr.append(oneStr)
			asNum.append(float(oneStr))
		return [asStr,asNum]


str = IN[0]
OUT = StringToNumbers(str)

Regular expressions will be your friend here, and in fact may be the best path forward for many of the string questions you are running into. Worth taking the time to study up on them.

This might work, but I haven’t tested.

import re
baseString = IN[0]
# Capture the digits that sit between square brackets
numbersInBrackets = re.findall(r'\[(\d+)\]', baseString)

You could also do it with .split in python:


How about:

original_list = [["cat/badboy [3421]"], ["cat/goodboy [5634]"]], [["cat/badboy [3421]"], ["cat/goodboy [5634]"]]

def find_between(s, first, last):
    # Return the text between the first occurrence of `first` and the next `last`
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

out = []
for lst in original_list:
    for i in lst:
        for s in i:
            xx = find_between(s, "[", "]")  # number between the brackets
            yy = find_between(s, "", "/")   # prefix before the "/"
            zz = [xx + "/" + yy]
            out.append(zz)

print(out)


output is

image

Retyping this from another source, but this should do the trick if formatting is right:

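import re
strings = IN[0]
results = []
for s in strings:
    # Two capture groups: the prefix up to "/" and the digits inside the brackets
    found = re.findall(r"(\w+/)|\[(\d+)\]", s)
    # findall returns one tuple per match; keep only the non-empty group of each
    results.append([f for groups in found for f in groups if f != ""])
OUT = results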

I tried that way first, but it didn’t seem to like the square brackets, so I did it another way.

Another example with regex

import sys
import re

original_list = IN[0]
OUT = [[re.sub(r'(cat\/).+?(\d+).', r'\g<1>\g<2>', i) for i in sublst] for sublst in original_list]

It’d be cool to see some notes on there.

Hi,

  1. In this example I am using a regular expression with two capture groups (the parts in parentheses); the whole regex constitutes the full match.

  2. Once the groups are established, I use the re.sub() method to replace the full match with the concatenation of the two groups, using '\g<1>\g<2>' to access them.
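
To make the two steps above concrete, here is a small sketch that prints the full match and each group for one sample string before doing the substitution. The variable names `pattern`, `sample`, `match` and `replaced` are only for illustration; the regex is the one from the post.

```python
import re

pattern = r'(cat\/).+?(\d+).'
sample = "cat/badboy [3421]"

match = re.search(pattern, sample)
print(match.group(0))  # full match: cat/badboy [3421]
print(match.group(1))  # group 1:    cat/
print(match.group(2))  # group 2:    3421

# Replace the full match with group 1 followed by group 2
replaced = re.sub(pattern, r'\g<1>\g<2>', sample)
print(replaced)  # cat/3421
```

Note that the last `.` in the pattern consumes whatever single character follows the digits, here the closing bracket.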

Ah! Didn’t know about the sub function before! Great find.


Hello @jacob.small, I tested this and I get a warning that I do not understand; I imagine something is typed wrong between the quotes (" ").

The solution from @c.poupin gave me a result identical to the string input, so maybe it does not work with sublists, or it does not find the number between brackets.

The solution from @Alien makes Dynamo stop responding for many minutes, as if it had crashed, and the result looks like millions of single-item sublists with the value “/”, so I do not know what is wrong.

def find_between(s,first,last):
	try:
		start = s.index(first) + len(first)
		end = s.index(last,start)
		return s[start:end]
	except ValueError:
		return ""

cropedlist = []
for lst in parentjoinedlist:
	for i in lst:
		for s in i:
			xx = find_between(s,"[","]")
			yy = find_between(s,"","/")
			zz = [xx+"/"+yy]
			cropedlist.append(zz)


OUT = cropedlist

Something to mention, and perhaps the cause of the issue: the input list contains sublists in which not all items have a number between brackets or a “/” separator with a string in between.

Perhaps one of your inputs isn’t a string.

Try placing it in a try/except statement where, on an exception, you return the input itself.
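
A minimal sketch of what that could look like. The helper names `find_between_strict` and `safe_crop` are hypothetical; unlike the `find_between` posted earlier, this variant lets the `ValueError` propagate so that the caller can fall back to the original item.

```python
def find_between_strict(s, first, last):
    # Raises ValueError if either marker is missing
    start = s.index(first) + len(first)
    end = s.index(last, start)
    return s[start:end]

def safe_crop(item):
    try:
        number = find_between_strict(item, "[", "]")
        prefix = find_between_strict(item, "", "/")
        return prefix + "/" + number.strip()
    except (ValueError, AttributeError, TypeError):
        # Not a string, or no "[...]" or "/" found: return the input itself
        return item

print(safe_crop("cat/badboy [3421]"))  # cat/3421
print(safe_crop("no brackets here"))   # no brackets here
print(safe_crop(42))                   # 42
```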

I didn’t do mine in dynamo… Maybe that’s why?

I do not know what to do if everything is a string according to this OOTB node. What would the try/except statement look like in the Python code?

Ok… let’s start by processing one at a time then.

Are you processing at the same level, or are you in a 3 deep list instead of a 2 deep one?

I am doing it like the sample, with sublists of items, but the items with brackets [ ] are at a variable index, so there are many other items whose strings contain no brackets. All items in every sublist are strings.
image

Checking the dataset above in Python and it’s working. Can you run any of the regex solutions above against that list of 4?


It was too much nesting …
Blame Covid… I have it and my brain is slow.


Hello, it gives me a result now, although the result is sublists of one item, and all the strings are modified or removed: the majority are replaced by “/”, and the items that had a number between brackets now read as the number first, then “/”, then the prefix. I think this needs a condition: if a string does not contain a number between brackets, do not edit the item and leave it as it is.

I tried all the solutions provided and this is the outcome, in screenshots and in the Dynamo file extractnumberfrombracketsandremainprefix.dyn (24.0 KB):
@jacob.small solution 1


@jacob.small solution 2

@WrightEngineering solution 1

@c.poupin: it works, but it can also modify other strings that use a space or “-” as separator, so it is very dangerous.
@Alien: it does something not wanted.

The solution from @c.poupin apparently works in that simple example, but in my real case it modifies other string items that it should not touch.

The solution from @Alien looks interesting with the split method, but for some reason it returns the wrong items for the given sublists; maybe it needs to work with sublists.

The solution from @jacob.small looks interesting, but I do not understand the warnings at all; they are new to me.

Nothing works yet; perhaps changing the sample exercise or the question would be easier. I am thinking of extracting only the number between brackets to replace the original string, and leaving untouched the strings that do not contain a number between brackets.
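
One way to express that condition is a stricter pattern that must match the whole string, so anything without a bracketed number at the end is returned as it is. This is a sketch under assumptions, not a tested Dynamo solution: the names `BRACKET_NUMBER` and `crop_if_bracketed` are hypothetical, the assumed shape is `prefix/anything [digits]`, and the prefix is kept in the result as in the first post.

```python
import re

# Assumed shape: "<prefix>/<anything> [<digits>]", with the brackets at the end.
BRACKET_NUMBER = re.compile(r'^([^/\[\]]+/)[^\[\]]*\[\s*(\d+)\s*\]\s*$')

def crop_if_bracketed(text):
    match = BRACKET_NUMBER.match(text)
    if match is None:
        return text  # no number between brackets: leave the item as it is
    return match.group(1) + match.group(2)

sublist = ["cat/badboy [3421]", "cat/room-12 north", "plain text"]
print([crop_if_bracketed(s) for s in sublist])
# ['cat/3421', 'cat/room-12 north', 'plain text']
```

Because the pattern is anchored at both ends and requires the square brackets, strings that merely contain digits after a space or a “-” are not touched.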

Can you copy EXACTLY what you’re feeding into the nodes please?

I fed in exactly what you wrote in your first example:
original list=[[“cat/badboy [3421 ]”],[“cat/goodboy [5634 ]”]],[[“cat/badboy [3421 ]”],[“cat/goodboy [5634 ]”]]

If you’re mixing up the lists with things on different levels of nesting then my code needs a tweak as it’s only looking at one level.
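
For inputs whose nesting depth varies, one possible tweak is to recurse through the lists and only touch the strings at the bottom. The sketch below is an assumption about how that could be done: `map_nested` and `crop` are hypothetical names, and `crop` is just a stand-in for any of the single-string solutions in this thread.

```python
import re

def crop(text):
    # Stand-in single-string solution: keep the prefix and the bracketed digits
    return re.sub(r'^(\w+/).*\[\s*(\d+)\s*\]$', r'\g<1>\g<2>', text)

def map_nested(func, data):
    # Recurse into lists and tuples at any depth
    if isinstance(data, (list, tuple)):
        return [map_nested(func, item) for item in data]
    # Apply the function to strings only; anything else passes through
    if isinstance(data, str):
        return func(data)
    return data

nested = ([["cat/badboy [3421 ]"], ["cat/goodboy [5634 ]"]],
          [["cat/badboy [3421 ]"], ["no brackets here"]])
print(map_nested(crop, nested))
# [[['cat/3421'], ['cat/5634']], [['cat/3421'], ['no brackets here']]]
```

In a Dynamo Python node this would presumably be called as `OUT = map_nested(crop, IN[0])`, which keeps the input's list structure regardless of depth.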
