Why do Python regexps seem to require a match at the beginning, but not at the end?

My impression is that Python regexps behave a little oddly:

>>> import re
>>> r=re.compile("test")
>>> r.match("test")
<re.Match object; span=(0, 4), match='test'>
>>> r.match("1test")
>>> r.match("test2")
<re.Match object; span=(0, 4), match='test'>
>>> r.match("1test2")
>>>

I have also tried python-pcre, and it behaves the same way. If the "regexp" is only a single word, it should behave as a substring match (or a full-line match). It seems to be matching lines starting with "test", but not the ones ending with it (or containing it somewhere).

Why?

regex python-3

posted 2 days ago

CC BY-SA 4.0

peterh‭

1 comment thread

About "why" questions in programming (and about the Q&A site model used on Codidact) (1 comment)

2 answers

Worked for peterh‭

The following users marked this post as Works for me:

User	Comment	Date
peterh‭	(no comment)	Jun 4, 2025 at 13:24

The documentation for re.match(...) is explicit that it only matches at position 0.

If you're asking this question, what you probably want is re.search(...) to match at any point within the string.
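As a rough illustration of the difference, here is a small sketch that runs both methods over the four inputs from the question. The variable names are just for this example.

```python
import re

pattern = re.compile("test")

# The four inputs from the question.
inputs = ["test", "1test", "test2", "1test2"]

# match() only tries position 0; search() scans for the first position that matches.
match_hits = [s for s in inputs if pattern.match(s)]
search_hits = [s for s in inputs if pattern.search(s)]

print(match_hits)   # ['test', 'test2']
print(search_hits)  # ['test', '1test', 'test2', '1test2']
```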

posted 1 day ago

CC BY-SA 4.0

Michael‭

1 comment thread

Thank you very much - exactly, `r.search()` is what I should have used. (1 comment)

It seems to be matching lines starting with "test", but not the ones ending with it (or containing it somewhere).

Yes, re.match matches strings that start with the described pattern. The documentation says:

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding Match.

To check that the pattern describes the entire line, use re.fullmatch; to do a substring match, use re.search.

There's also a separate heading in the documentation describing these differences.
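To see the three operations side by side, a minimal sketch might look like this. The helper name classify is made up for this example and is not part of the re module.

```python
import re

def classify(pattern, text):
    """Hypothetical helper: report which of the three operations succeed."""
    return {
        "match": re.match(pattern, text) is not None,          # pattern at position 0
        "fullmatch": re.fullmatch(pattern, text) is not None,  # pattern covers the whole string
        "search": re.search(pattern, text) is not None,        # pattern anywhere in the string
    }

for text in ["test", "test2", "1test"]:
    print(text, classify("test", text))
```

Only "test" passes all three; "test2" fails fullmatch, and "1test" is found by search alone.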

Besides choosing the operation to do with the regex, you can also use "anchors" for the pattern that only match at the beginning or end of the string. To match at the beginning, use ^ as the first character in the regex; to match at the end, use $ as the last character. These are "zero-width" matches; when the regex engine checks for them, it doesn't associate them with any characters from the input string - it only checks the current position as it's matching.
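A short sketch of the anchors in use, combined with re.search. One Python-specific detail worth knowing: $ also matches just before a trailing newline, so it is slightly more permissive than "the very end of the string".

```python
import re

# "^" restricts search() to a match starting at position 0, like match().
starts = re.search(r"^test", "1test")
# "$" only accepts a match that stops at the end of the string.
ends = re.search(r"test$", "1test")
# With both anchors, search() has to cover the whole string for this pattern.
whole = re.search(r"^test$", "1test2")

print(starts)        # None
print(ends.span())   # (1, 5)
print(whole)         # None

# Anchors are zero-width: they add nothing to the matched text.
print(ends.group())  # test

# In Python, "$" also matches just before a trailing newline.
print(re.search(r"test$", "1test\n") is not None)  # True
```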

Why?

Of course we should have different functions to do different things. The remaining question is why we should have a match at all.

The simplest explanation I can think of is that it's easy to implement efficiently, and often useful. In particular, you can easily and efficiently implement fullmatch in terms of match - first match, and then see whether there is anything left in the input string after matching the pattern (retrying with a different way of matching the pattern, if there is one, when something is left over). But we can't do it the other way around: if we only have fullmatch and want to get the match effect, we can only modify the regex pattern to have "also match any characters after that" (.*), and matching against that takes extra time (or special work to optimize the regex engine).
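Here is a rough sketch of the "match, then check for leftovers" idea. Both helper names are invented for this example, and this is not how the re module implements fullmatch internally. The naive version has a catch: match returns the first way of matching that works, so with a pattern like a|ab it stops after "a" and the leftover check fails, even though a full match exists. Appending \Z (end of string) lets the engine backtrack on its own.

```python
import re

def naive_fullmatch(pattern, text):
    """Hypothetical helper: match at position 0, then check nothing is left over."""
    m = re.match(pattern, text)
    if m is not None and m.end() == len(text):
        return m
    return None

def anchored_fullmatch(pattern, text):
    """Hypothetical helper: make the engine itself insist on reaching the end."""
    return re.match(r"(?:%s)\Z" % pattern, text)

print(naive_fullmatch("test", "test") is not None)   # True
print(naive_fullmatch("test", "test2") is not None)  # False

# The catch: match() settles for "a", so the naive check rejects "ab".
print(naive_fullmatch("a|ab", "ab") is not None)     # False
print(re.fullmatch("a|ab", "ab") is not None)        # True
print(anchored_fullmatch("a|ab", "ab").group())      # ab
```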

Meanwhile, a search for a substring must be slower - in the worst case, you basically need to check at every position in the input.
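To make the "check at every position" point concrete, search can be sketched as a loop of anchored matches. The helper naive_search is hypothetical; real engines use smarter strategies, but the worst case is similar in spirit. It relies on the optional pos argument of a compiled pattern's match method.

```python
import re

def naive_search(compiled, text):
    """Hypothetical helper: try an anchored match at each start position in turn."""
    for start in range(len(text) + 1):
        m = compiled.match(text, start)  # only tries to match beginning at `start`
        if m is not None:
            return m
    return None

m = naive_search(re.compile("test"), "1test2")
print(m.span())  # (1, 5)
```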

but not the ones ending with it

It should be noted that there isn't an efficient way to check whether the input ends with a regex. That's because matching a regex requires scanning forwards in the string from some starting point, but the regex pattern doesn't match a fixed amount of data. Therefore, if the pattern matches at the end, we don't know where to start looking - we need to do the slow searching operation, and then see if one of those matches is at the end.
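In practice, one way to ask "does the input end with this pattern?" is to search for the pattern followed by \Z, which only matches at the very end of the string. The helper ends_with below is made up for this sketch. Note that it is still a search underneath, so the engine may try many start positions before it finds one whose match reaches the end.

```python
import re

def ends_with(pattern, text):
    """Hypothetical helper: search, accepting only a match that stops at the end."""
    return re.search(r"(?:%s)\Z" % pattern, text) is not None

print(ends_with("test", "1test"))     # True
print(ends_with("test", "test2"))     # False
print(ends_with(r"\d+", "abc123"))    # True
```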

posted about 2 hours ago

CC BY-SA 4.0

Karl Knechtel‭
