Basic regular expressions with String.matches()
On this page we'll look at how to form a basic regular expression and
how to test if a string matches the expression. As our example, we'll consider
a case fairly typical in data conversion or data cleansing applications:
- we want to decide whether a given string represents the value "true", and return
a boolean value accordingly;
- but we need to be flexible in what string values we consider to
mean "true" (e.g. the user could have entered "yes" or capitalised the word).
Basics of regular expressions
A regular expression is a sequence of characters describing a pattern that we want
to match strings against. The first general notion is that:
- a "normal" character matches against itself.
By "normal", we mean excluding a few characters that have special meanings. We'll
introduce these as we go along.
In Java, the easiest way to see if a given string matches a particular regular
expression is to call the string's matches() method, passing in the expression. For example,
here is how we would check if a string matched the regular expression true:
public boolean isTrueValue(String str) {
    return str.matches("true");
}
Since each character of the regular expression matches against itself, we have
no "special" characters in our expression, and matches() only succeeds when the
entire string fits the expression (not just part of it), this effectively
means that the string matches when (and only when) it equals the string "true".
In other words, in this particular example, we could have written the following:
public boolean isTrueValue(String str) {
    return str.equals("true");
}
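As a quick check of the whole-string behaviour described above, here is a minimal sketch. The class name MatchesDemo and the sample inputs are made up for illustration; only the isTrueValue() logic comes from the text.

```java
final class MatchesDemo {
    // Same check as in the text: succeeds only when the whole string is "true".
    static boolean isTrueValue(String str) {
        return str.matches("true");
    }

    public static void main(String[] args) {
        System.out.println(isTrueValue("true"));    // true
        System.out.println(isTrueValue("true "));   // false: trailing space
        System.out.println(isTrueValue("untrue"));  // false: extra characters at the start
        System.out.println(isTrueValue("True"));    // false: matching is case-sensitive
    }
}
```

Note that, like equals(), calling matches() on a null reference throws a NullPointerException, so a real data-cleansing routine would probably want to check for null first.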
Character classes ("character choices")
OK, so a regular expression with just "normal" characters isn't very interesting.
But now for a more interesting example:
- If we put several characters inside square brackets, [...], this means
a choice between those characters.
Technically, the choice is called a character class.
So if we write [tT] (or equally [Tt], since the order inside the brackets
doesn't matter), that means "either a lower-case t or an upper-case T".
To accept the values true or True, we can write the following:
public boolean isTrueValue(String str) {
    return str.matches("[Tt]rue");
}
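To see what the character class does and does not accept, here is a small sketch. The class name CharacterClassDemo and the sample inputs are illustrative assumptions, not part of the original example.

```java
final class CharacterClassDemo {
    // [Tt] stands for exactly one character: either 'T' or 't'.
    static boolean isTrueValue(String str) {
        return str.matches("[Tt]rue");
    }

    public static void main(String[] args) {
        System.out.println(isTrueValue("true"));   // true
        System.out.println(isTrueValue("True"));   // true
        System.out.println(isTrueValue("TRUE"));   // false: only the first letter is a choice
        System.out.println(isTrueValue("tTrue"));  // false: the class matches a single character
    }
}
```

The point to take away is that a character class always consumes exactly one character of the input, however many characters are listed inside the brackets.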
Alternatives
The square brackets are useful when we want a choice for a single character.
When we want to match alternatives for a whole string, we instead
put a pipe character, |, between the alternatives:
public boolean isTrueValue(String str) {
    return str.matches("true|yes");
}
The above expression will match either true or yes.
To make it additionally match True and Yes, we can combine the two
techniques:
public boolean isTrueValue(String str) {
    return str.matches("[Tt]rue|[Yy]es");
}
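The combined expression can be tried against a handful of inputs with a sketch like the following. The class name AlternativesDemo and the list of sample strings are hypothetical, chosen only to show which values pass.

```java
final class AlternativesDemo {
    // Each side of the pipe is a complete alternative for the whole string.
    static boolean isTrueValue(String str) {
        return str.matches("[Tt]rue|[Yy]es");
    }

    public static void main(String[] args) {
        String[] samples = {"true", "True", "yes", "Yes", "YES", "trueyes", "no"};
        for (String s : samples) {
            // Prints e.g. "true -> true" or "YES -> false".
            System.out.println(s + " -> " + isTrueValue(s));
        }
    }
}
```

The first four samples are accepted and the last three are rejected: "YES" because only the first letter may vary in case, "trueyes" because the string must match one alternative in full, and "no" because it matches neither.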
On the next page, we continue by looking in more detail at character classes, with features such as matching against a range of characters.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.