Non-capturing groups
Putting capturing groups within an expression is useful both for parsing
the input and as a form of organisation: a group lets you say
that part of the expression is optional, or use the pipe operator to designate a
choice inside the group.
You can use a non-capturing group to retain the organisational or
grouping benefits but without the overhead of capturing. To write a non-capturing
group, you place ?: as the first element inside the brackets, giving a group of the form (?:...).
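A minimal sketch in Java of an expression for ordinal numbers. The pattern ([0-9]+)(?:st|nd|rd|th)? and the names OrdinalDemo and digitsOf are assumptions chosen for illustration, not necessarily the exact expression the article had in mind:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrdinalDemo {
    // Assumed pattern: group 1 captures the digits; the suffix group
    // starts with ?: and so is non-capturing. The trailing ? makes it optional.
    static final Pattern ORDINAL = Pattern.compile("([0-9]+)(?:st|nd|rd|th)?");

    // Returns the captured digits if the whole input matches, otherwise null.
    static String digitsOf(String input) {
        Matcher m = ORDINAL.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "1st", "4th", "4xy"}) {
            System.out.println(s + " -> " + digitsOf(s));
        }
        // Only the digits group counts as a capturing group.
        System.out.println(ORDINAL.matcher("").groupCount()); // prints 1
    }
}
```

Note that this sketch does not check that the suffix suits the number, so an input such as 1th is also accepted.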
An expression for ordinal numbers illustrates this: it matches either a sequence of digits, or
a sequence of digits followed by one of the suffixes -st, -nd, -rd, -th (e.g. 1st, 4th).
However, only the first group (the digits) is captured. The second group, holding the suffixes, is
non-capturing, introduced by ?: inside the brackets. Its sole purpose is to (a) limit the scope of
the choice operator | to part of the expression, and (b) allow us to apply the ? operator to that sub-part of
the expression. (Repetition operators such as ?, *, + always apply to
the item immediately preceding; if we want that item to comprise various characters or elements,
then we need to form a group.)
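To see what the group contributes, the sketch below compares a grouped pattern with the same characters written without the brackets. Both patterns and the names GroupScopeDemo and matchesWhole are illustrative assumptions. Without the group, the | operator splits the whole expression into four alternatives, and the final ? applies only to the letter h:

```java
import java.util.regex.Pattern;

class GroupScopeDemo {
    // With the group: digits, then optionally one of the four suffixes.
    static final String GROUPED = "[0-9]+(?:st|nd|rd|th)?";
    // Without the group this reads as: "[0-9]+st" OR "nd" OR "rd" OR "th?".
    static final String UNGROUPED = "[0-9]+st|nd|rd|th?";

    // Returns true if the regex matches the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole(GROUPED, "2nd"));   // true
        System.out.println(matchesWhole(UNGROUPED, "2nd")); // false: no alternative fits
        System.out.println(matchesWhole(UNGROUPED, "nd"));  // true: "nd" is an alternative by itself
        System.out.println(matchesWhole(UNGROUPED, "t"));   // true: ? made only the h optional
        System.out.println(matchesWhole(GROUPED, "t"));     // false
    }
}
```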
Common mistake: groups aren't optional unless you include a final ?
When you write a non-capturing group that you want to be optional, it's easy to
forget the final ?. If we write the ordinal expression without the ? after the closing bracket,
then the expression will still accept any of the alternative suffixes (-st, -nd etc),
but including one of the suffixes will be mandatory for the expression to match.
Don't confuse the ?: placed at the start of a non-capturing group with the optionality
operator ?, placed after a group or item in the expression.
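The difference can be checked with a short Java sketch. The two patterns and the names OptionalSuffixDemo and accepts are assumptions for illustration; they differ only in the final ?:

```java
import java.util.regex.Pattern;

class OptionalSuffixDemo {
    // Trailing ? after the group: the suffix may be omitted.
    static final Pattern OPTIONAL_SUFFIX = Pattern.compile("([0-9]+)(?:st|nd|rd|th)?");
    // No trailing ?: the ?: only marks the group as non-capturing,
    // so one of the suffixes must be present.
    static final Pattern MANDATORY_SUFFIX = Pattern.compile("([0-9]+)(?:st|nd|rd|th)");

    // Returns true if the pattern matches the entire input.
    static boolean accepts(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts(OPTIONAL_SUFFIX, "42"));   // true
        System.out.println(accepts(MANDATORY_SUFFIX, "42"));  // false
        System.out.println(accepts(OPTIONAL_SUFFIX, "4th"));  // true
        System.out.println(accepts(MANDATORY_SUFFIX, "4th")); // true
    }
}
```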
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.