Python Regex to parse multiple "word. word. word."
I'm trying to parse lines like "THIS. THAT..OTHER " so that "THIS. THAT." is found. There can be more than one <word><dot> pair; the pairs are separated by a space, with no space after the last one. Based on the initial feedback, I've added more example lines to the lines array below.
My current results are:
re.compile('\\s?(?P<name>(?:\\w+\\.\\s(?=[\\.\\s]))+)', re.VERBOSE)
REGEX FAILED: [THIS. THAT..OTHER ]
Here is the current code. I have tried several regexes, looking for one that requires a space after the dot for every word/dot pair except the last.
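import re

# Sample inputs; each should yield the leading run of "<word>." items.
lines = [
    'THIS. THAT..OTHER ',
    'THIS. THAT. ANOTHER..OTHER ',
    'THIS..OTHER ',
]
# Current attempt: every word/dot must be followed by whitespace,
# and then by another dot or whitespace (the lookahead).
pat = re.compile(r"\s?(?P<name>(?:\w+\.\s(?=[\.\s]))+)", re.VERBOSE)
print(pat)
for line in lines:
    x = pat.match(line)
    if x:
        if hasattr(x, 'groupdict'):
            print(x.groupdict())
        else:
            print(x.groups())
    else:
        print("REGEX FAILED: [{}]".format(line))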
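For reference, the pattern above fails on these inputs because it demands whitespace after every dot and then, through the lookahead, a dot or whitespace right after that. In "THIS. THAT..OTHER " the character following "THIS. " is "T", so the very first repetition is rejected. One possible way around this is sketched below. It assumes the goal is the leading run of <word><dot> items, where every item after the first is introduced by exactly one whitespace character. The names PAT and leading_words are made up for this illustration, and the expected results for the second and third sample lines are my reading of the requirement rather than something stated outright.

```python
import re

# Hypothetical alternative: one "<word>." first, then zero or more
# " <word>." repetitions. The whitespace is placed *before* each later
# item, so the last item is naturally not followed by a space.
PAT = re.compile(r"\s?(?P<name>\w+\.(?:\s\w+\.)*)")


def leading_words(line):
    """Return the leading run of '<word>.' items, or None if there is none."""
    m = PAT.match(line)
    return m.group("name") if m else None


if __name__ == "__main__":
    lines = [
        'THIS. THAT..OTHER ',
        'THIS. THAT. ANOTHER..OTHER ',
        'THIS..OTHER ',
    ]
    for line in lines:
        print(repr(leading_words(line)))
```

Run on the three sample lines, this prints 'THIS. THAT.', 'THIS. THAT. ANOTHER.' and 'THIS.'. It does not check that a second dot actually follows the run; if that matters, a lookahead such as (?=\.) could be appended to the pattern.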