Alternatives in capturing groups
Recall the pipe operator which is used to give a list of
alternative expressions. For example, the following expression
matches any three-letter month name:
jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec
When used inside a group, the pipe separates options that match the group. So to match a date of the format 10 jun 1998 etc, we can use
the following expression:
([0-9]{2}).(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec).([0-9]{4})
(Note here we use the dot to match any character between the elements of the date.)
Optional groups
Using the ? operator, an entire group
can be made optional. For example, the following expression will capture either one
or two digits:
Pattern p = Pattern.compile("([0-9])([0-9])?");
Matcher m = p.matcher(str);
if (m.matches()) {
String digit1 = m.group(1);
String digit2 = m.group(2);
}
If str consists of two digits, then they will be captured as
digit1 and digit2 respectively. If the string only contains
one digit, then it will still match, and digit1 will contain that
digit. But digit2 will be null.
In general, optional groups will contain null if not present
in the string being matched.
Groups for the sake of pattern organisation
From the above, it may well have occurred to you that a group can be used
simply to organise a regular expression. For example, if we want to
make a particular sub-part optional, or to apply a choice to a particular
part of the expression.
If grouping is purely to organise the expression, then it is possible
to use what is called a non-capturing group
which avoids the overhead of capturing.
If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.