Post History

Q&A What does the "\s" shorthand match?

1 answer · posted 4y ago by hkotsubo‭ · last activity 3y ago by hkotsubo‭

Question regex whitespace

#2: Post edited by

Alexei‭ · 2021-09-17T14:47:48Z (over 3 years ago)
added relevant tag

regex

regex whitespace

#1: Initial revision by

hkotsubo‭ · 2021-06-07T13:26:36Z (almost 4 years ago)

What does the "\s" shorthand match?

I've seen some regular expressions (regex) using `\s` when they want to match a space, but I noticed that it also matches line breaks.

Example: the regex `[a-z]\s[0-9]` (lowercase ASCII letter, followed by `\s`, followed by a digit) matches both `a 1` and

```none
b
2
```

This happens because `\s` matches both a space and a newline (see this regex running [here](https://regex101.com/r/5ZS2vX/1/)).
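
The same behaviour can be reproduced outside regex101. Below is a minimal sketch in Python, picked only for illustration (the variable names are hypothetical and not part of any API):

```python
import re

# Same pattern as above: lowercase ASCII letter, then \s, then a digit.
pattern = re.compile(r"[a-z]\s[0-9]")

# Letter and digit separated by a plain space.
space_match = pattern.search("a 1")

# Letter and digit separated by a line break.
newline_match = pattern.search("b\n2")

print(space_match is not None)    # True
print(newline_match is not None)  # True
```

Both searches find a match, so `\s` is clearly not limited to the space character.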


But I also noticed that, depending on the programming language I use and/or specific settings of its regex API, `\s` may or may not match some other "Unicode spaces", such as the [No-Break Space](https://www.fileformat.info/info/unicode/char/A0/index.htm).
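
As one illustration of such a setting, here is a sketch assuming Python 3's built-in `re` module, where patterns given as `str` are Unicode-aware by default and the `re.ASCII` flag restricts `\s` to ASCII whitespace. Other languages and engines may behave differently; the variable names are hypothetical:

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE
text = "a" + NBSP + "1"

# Default mode for str patterns: \s is Unicode-aware.
unicode_match = re.search(r"[a-z]\s[0-9]", text)

# With re.ASCII, \s only matches space, \t, \n, \r, \f and \v.
ascii_match = re.search(r"[a-z]\s[0-9]", text, flags=re.ASCII)

print(unicode_match is not None)  # True
print(ascii_match is not None)    # False
```

The pattern and the input are identical in both calls; only the flag changes whether the No-Break Space counts as `\s`.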

Hence, the question: what does the `\s` shorthand actually match? Does it depend on the language, or are there any other factors that can change its behaviour? Can I always assume that at least spaces and newlines (or any other fixed set of characters) will be matched?

regex
