Get capital letters by condition

Hello,

I already use a script; it works partly.

image

Can I say something like “get the split string if it has 3 capital letters”

so that I get just the values GIT, GIP, STB, …?

These values are not always in the same place!

KR

Andreas

Have you tried searching for Fuzzy Logic on the forum?
I might have some clues


@Marcel_Rijsmus ,

it leads to you :wink:


Hi,

If you just want to get the 3 capitals in a row, this is my logic for it.
I am sure it can be made a lot simpler.

input_list = IN[0]
uppers = []

for sent in input_list:
	count_upper = 0
	# Track the position with enumerate; sent.index(i) would return the
	# first occurrence of the character, not the current one.
	for pos, ch in enumerate(sent):
		if ch.isupper():
			count_upper += 1
		else:
			count_upper = 0
		# Third capital in a row: collect it with the two before it.
		if count_upper == 3:
			uppers.append(sent[pos - 2:pos + 1])
# Assign your output to the OUT variable.
OUT = uppers

Hope it helps.

  1. Check to see if a substring is all caps by converting it and comparing.
  2. Check to see if a substring is 3 characters in length.
  3. Filter for substrings that meet both of those requirements.
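
The three steps above can be sketched in Python roughly as follows. This is only a minimal sketch, assuming the parts of each name are separated by "_"; the function name `three_letter_caps` and the sample names are made up for illustration.

```python
def three_letter_caps(name, sep="_"):
    """Return the substrings of `name` that are exactly 3 capital letters.

    `sep` is an assumed separator; adjust it to the real naming scheme.
    """
    return [
        part
        for part in name.split(sep)
        # Step 2: exactly 3 characters. Step 1: letters only, and unchanged
        # when converted to upper case.
        if len(part) == 3 and part.isalpha() and part == part.upper()
    ]

# Hypothetical sample names, not taken from the original model.
samples = ["Wall_GIT_200", "STB_Wall_Ext", "plain_wall"]
result = [three_letter_caps(s) for s in samples]
```

In a Dynamo Python node, `samples` would be replaced by `IN[0]` and `result` assigned to `OUT`.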

@Josip.Komadina ,

that works really well. Can it exclude symbols like “_”?

That's the only edge case that I have to avoid!

KR

Andreas

Weird behaviour.
Maybe you can just put an isalpha() check in the if statement and solve everything.

input_list = IN[0]
uppers = []

for sent in input_list:
	count_upper = 0
	# Track the position with enumerate; sent.index(i) would return the
	# first occurrence of the character, not the current one.
	for pos, ch in enumerate(sent):
		# Count only capital letters; anything else resets the run.
		if ch.isupper() and ch.isalpha():
			count_upper += 1
		else:
			count_upper = 0
		# Third capital in a row: collect it with the two before it.
		if count_upper == 3:
			uppers.append(sent[pos - 2:pos + 1])
# Assign your output to the OUT variable.
OUT = uppers

The count is off because you reset your counter instead of just not adding to it. You also stop as soon as you find 3 consecutive capitals instead of checking to see if the substring is exactly 3 characters long.

Even with python, I still think you need to split and check substrings (if that’s the specific condition you’re looking for, @Draxl_Andreas). This also assumes that there’s only a single 3 character condition in each string.

@Nick_Boyts ,

That's right, the naming is still not clean. First I have to change the names of the walls to follow something like a naming convention.

KR

Andreas

Something like this:


If you can ensure you only have one condition you’re looking for per string then you can of course remove the extra loop for sublists.


You are right that it needs to check the entire substring if it's not formatted correctly.

My thought process was that if you don't reset the counter, it will count every capital character in the entire string.
By resetting it you can get exactly 3 in a row.
My assumption was that you have only 3 uppers in each string that represent something and that you need them.
It seems to have no problems if you are only looking for 3 upper chars in a string.


Nope, I think you’re completely right there. That was my fault. I just lumped that in with the potential under-counting.

This would also be the better process if they weren’t already broken into substrings with the formatting. That just makes things easier.


@Draxl_Andreas You can do something like this with Regex:

import re
pattern = r'(?<![A-Z])[A-Z]{3}(?![A-Z])'
OUT = [re.findall(pattern,x) for x in IN[0]]
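
To show what the two lookarounds in this pattern do, here is a small sketch that runs outside Dynamo. The sample names are hypothetical. `(?<![A-Z])` rejects a match that has a capital letter right before it, and `(?![A-Z])` rejects one that has a capital letter right after it, so only runs of exactly 3 capitals are kept.

```python
import re

pattern = r'(?<![A-Z])[A-Z]{3}(?![A-Z])'

# Hypothetical sample names, not taken from the original model.
samples = ["Wall_GIT_01", "WALL_GIP", "Door_STB_and_KLM", "no_caps_here"]

# One sublist per input; a run of 4 capitals such as "WALL" is skipped.
matches = [re.findall(pattern, s) for s in samples]
```

Each input yields its own sublist, so a name with two groups gives two items and a name without any group gives an empty list.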

@AmolShah ,

That's pretty short, and it looks like “chinese” :wink:

I lost track a bit… how do I keep the count of the doors (and hosts)…?

I am trying to replace the errors with a “dummy” value.


KR
Andreas

That’s just because you have empty lists for cases that don’t match the condition. You can replace them with nulls (or empty strings) if you need to keep list structure and item count.
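
A minimal sketch of that idea, assuming only the first match per string is needed and that an empty string is an acceptable placeholder. The helper name `first_code_or_default` and the sample names are made up for illustration.

```python
import re

PATTERN = re.compile(r'(?<![A-Z])[A-Z]{3}(?![A-Z])')

def first_code_or_default(name, default=""):
    """Return the first 3-capital group in `name`, or `default` if none."""
    match = PATTERN.search(name)
    return match.group(0) if match else default

# Hypothetical sample names; the output keeps one item per input.
samples = ["Door_GIT_01", "door_plain", "STB_wall"]
codes = [first_code_or_default(s) for s in samples]
```

Passing `default=None` instead should give a null in Dynamo, so the output list keeps the same length as the input.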


@AmolShah ,

For some reason it generates more values than the input. Can I also say:

if "_XXX_" # take it 

The issue is when there is more than one group of 3 uppercase letters in the same string. It then creates a new item.


image

Can I create something like this?

image

But it is progress; it solves 90% of my cases.

KR

Andreas

Hi,

Here is how I would do it with my limited knowledge :smile:

import re
pattern = r'(?<![A-Z])[A-Z]{3}(?![A-Z])'
new = [re.findall(pattern,x) for x in IN[0]]

strings = []
for n in new:
	# Join all matches of one input with "_" so each input yields one item;
	# an input without a match gives an empty string instead of an error.
	strings.append("_".join(n))
OUT = strings

image

Hope this helps.


@Josip.Komadina ,

That's good and easy to detect! I can do the final 10% manually.

KR

Andreas


Hi, I also tried limiting myself to capital letters between “_” and “_”.


def spec(a):
    # Keep only the parts enclosed by "_" on both sides (drop first and last).
    b = a.split("_")[1:-1]
    # Keep the parts written entirely in capital letters.
    return [s for s in b if s.isupper()]
lst = IN[0]

OUT = [spec(i) for i in lst]

cordially
christian.stan


@christian.stan ,

your version misses the case where an uppercase group appears twice…

e.g. “bla_bla_CAR_bla_bla_DOG” → CAR_DOG

KR

Andreas
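
For the two-group case above, one possible sketch is a pattern that accepts a group bounded by "_" or by the start or end of the string, then joins the hits. The rule itself is an assumption about the naming scheme, and `joined_codes` is a made-up helper name.

```python
import re

# Assumed rule: exactly 3 capitals, with "_" or the string boundary on each side.
TOKEN = re.compile(r'(?:^|(?<=_))[A-Z]{3}(?=_|$)')

def joined_codes(name, joiner="_"):
    """Join every 3-capital group found in `name` into one string."""
    return joiner.join(TOKEN.findall(name))

codes = [joined_codes(s) for s in ["bla_bla_CAR_bla_bla_DOG", "GIT_wall_XYZW", "plain"]]
```

Unlike dropping the first and last parts after a split, this also keeps a group at the very end of the name, such as DOG.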