A regular expression is a sequence of characters that defines a pattern of text.
Workbench uses regular expressions to express search criteria.
They let you search for text without specifying the entire content of the text that is returned.
Regular expressions are useful in the following situations:
- You know only a portion of the text that you are searching for. For example, when searching for files, you may know only the first three letters of the file name.
- You need to find multiple instances of text that share a certain pattern. For example, you want to find all of the files that have the same file extension.
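The two situations can be sketched in Java. This is only an illustration: it assumes that Workbench patterns behave like java.util.regex patterns and that a pattern must match the whole file name. The class name, the helper method, and the file names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; assumes java.util.regex semantics.
class PartialNameSearch {
    // Returns the names that match the regular expression from start to end.
    static List<String> select(String regex, List<String> names) {
        Pattern pattern = Pattern.compile(regex);
        return names.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> files = List.of("invoice.xml", "inventory.pdf", "report.xml", "notes.txt");
        // Only the first three letters of the file name are known.
        System.out.println(select("inv.*", files));    // [invoice.xml, inventory.pdf]
        // All files that share the same file extension.
        System.out.println(select(".*\\.xml", files)); // [invoice.xml, report.xml]
    }
}
```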
When constructing regular expressions, you need to use the correct
syntax:
Literal characters
The base components of regular expressions are literal characters. For example, the regular expression abc matches the sequence abc.
Groups
Enclosing a sequence of characters in parentheses forms a group, for example (defg). The characters in a group are treated as a single element.
Any character
The period (.) represents any single character. For example, a.bc matches a followed by any character followed by bc, such as ambc.
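The literal, group, and period elements can be tried out with a short Java sketch. It assumes that Workbench follows java.util.regex behavior, which is not confirmed here. The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes java.util.regex semantics.
class BasicElements {
    // True when the whole input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("abc", "abc"));     // true: literal characters
        System.out.println(matchesWhole("(defg)", "defg")); // true: a group matches its contents
        System.out.println(matchesWhole("a.bc", "ambc"));   // true: the period stands for m
        System.out.println(matchesWhole("a.bc", "abc"));    // false: no character between a and bc
    }
}
```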
The OR operator
To specify a pattern that matches either of two alternatives, separate them with the pipe character (|). For example, x|y matches either the character x or the character y. The regular expression x|(zy) matches either the character x or the sequence zy.
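A minimal Java sketch of the OR operator, assuming java.util.regex semantics and whole-input matching. The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes java.util.regex semantics.
class Alternation {
    // True when the whole input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("x|y", "x"));     // true
        System.out.println(matchesWhole("x|y", "y"));     // true
        System.out.println(matchesWhole("x|(zy)", "zy")); // true: the group is one alternative
        System.out.println(matchesWhole("x|(zy)", "xy")); // false: neither x alone nor zy
    }
}
```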
Quantifiers
Quantifiers are special characters that specify how many times an element can occur. A quantifier applies to the element that directly precedes it. For example, when a group precedes a quantifier, the quantifier applies to the characters in the group as a whole.
Character | Quantity | Example
? | Zero or one | a? matches a sequence of zero or one a characters. abc? matches ab followed by zero or one c characters, such as abc or ab.
* | Zero or more | a*(b|c) matches a sequence of zero or more a characters followed by either b or c, such as aaaac. The parentheses are required: without them, a*b|c matches either a*b or c.
+ | One or more | z+ matches a sequence of one or more z characters.
{n} | Exactly n times | efg{3}b matches ef followed by three consecutive g characters followed by b, such as efgggb.
{n,} | n or more | a{3,} matches a sequence of three or more consecutive a characters.
{n,m} | n or more, but no more than m | a{3,5} matches a sequence of three, four, or five consecutive a characters.
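The quantifiers can be checked with a short Java sketch, again assuming java.util.regex semantics and whole-input matching. The class and helper names are hypothetical, and the (ab)+ and ab+ patterns are added here only to show a quantifier applied to a group as opposed to a single character.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes java.util.regex semantics.
class Quantifiers {
    // True when the whole input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("abc?", "ab"));        // true: zero c characters
        System.out.println(matchesWhole("abc?", "abc"));       // true: one c character
        System.out.println(matchesWhole("z+", ""));            // false: at least one z is required
        System.out.println(matchesWhole("efg{3}b", "efgggb")); // true: {3} applies to g only
        System.out.println(matchesWhole("a{3,}", "aa"));       // false: fewer than three
        System.out.println(matchesWhole("a{3,5}", "aaaaaa"));  // false: more than five
        // A quantifier after a group applies to the group as a whole.
        System.out.println(matchesWhole("(ab)+", "ababab"));   // true
        System.out.println(matchesWhole("ab+", "ababab"));     // false: + applies to b only
        System.out.println(matchesWhole("ab+", "abbb"));       // true
    }
}
```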
Note: The backslash character (\) escapes special characters so that you can express them as literal characters. For example, \+ matches the plus sign (+). To specify the backslash character (\) itself, such as in a path on a Windows file system, use \\ as follows:
c:\\file.txt
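The following Java sketch illustrates escaping, assuming java.util.regex semantics. Note that Java string literals add their own layer of escaping, so the regular expression \+ is written "\\+" in Java source and \\ is written "\\\\". The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes java.util.regex semantics.
class Escaping {
    // True when the whole input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Regular expression 1\+1: the escaped plus sign is a literal character.
        System.out.println(matchesWhole("1\\+1", "1+1")); // true
        // Regular expression 1+1: the unescaped plus sign is a quantifier.
        System.out.println(matchesWhole("1+1", "1+1"));   // false
        System.out.println(matchesWhole("1+1", "111"));   // true

        // Regular expression c:\\file.txt matches the text c:\file.txt.
        String path = "c:\\file.txt"; // the text c:\file.txt
        System.out.println(matchesWhole("c:\\\\file.txt", path)); // true

        // Pattern.quote escapes an entire string so that it is matched literally.
        System.out.println(matchesWhole(Pattern.quote(path), path)); // true
    }
}
```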
Character classes
A character class is a set of characters, any one of which can match at that position. You define a character class by enclosing characters and special characters within brackets ([ ]).
Character | Explanation | Example
Literals | The characters that you want to include in the character class. | [abc] specifies a, b, or c.
^ | The negation of characters. | [^abc] specifies any character except a, b, and c.
- | An inclusive range of characters. | [a-z] specifies all characters between a and z, inclusive. [a-zA-Z] specifies all characters between a and z, inclusive, as well as all characters between A and Z, inclusive.
Nested brackets | The union of the enclosing character class with the character class that is defined by the nested brackets. | [a-d[m-p]] specifies all characters between a and d, inclusive, as well as all characters between m and p, inclusive. Equivalent to [a-dm-p].
&& | The intersection of two character classes. | [a-z&&[def]] specifies the intersection of the character range a-z with the character class that includes d, e, and f. This example evaluates to the characters d, e, or f.
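The character class forms can be exercised with a short Java sketch. The nested-bracket union and the && intersection are java.util.regex syntax, so this assumes that Workbench patterns follow the same rules. The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes java.util.regex semantics.
class CharacterClasses {
    // True when the whole input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("[abc]", "b"));        // true: literal member
        System.out.println(matchesWhole("[^abc]", "a"));       // false: negated class excludes a
        System.out.println(matchesWhole("[a-z]", "Q"));        // false: range is lowercase only
        System.out.println(matchesWhole("[a-zA-Z]", "Q"));     // true: two ranges
        System.out.println(matchesWhole("[a-d[m-p]]", "n"));   // true: union of a-d and m-p
        System.out.println(matchesWhole("[a-d[m-p]]", "f"));   // false: f is in neither range
        System.out.println(matchesWhole("[a-z&&[def]]", "e")); // true: e is in both classes
        System.out.println(matchesWhole("[a-z&&[def]]", "a")); // false: a is not in [def]
    }
}
```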
Predefined character classes
Several predefined character classes
can be expressed using shorthand.
Character class | Shorthand | Description
[0-9] | \d | A digit.
[^0-9] | \D | Any character that is not a digit.
[ \t\n\x0B\f\r] | \s | A white space character.
[^\s] | \S | Any character that is not a white space character.
[a-zA-Z_0-9] | \w | A word character: a letter of the alphabet, a digit, or the underscore.
[^\w] | \W | Any character that is not a word character.
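A brief Java sketch of the shorthand classes, assuming java.util.regex semantics with default flags. The class and helper names are hypothetical, and the last pattern is an illustrative combination of shorthands with quantifiers.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes java.util.regex semantics.
class PredefinedClasses {
    // True when the whole input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("\\d", "7"));  // true: a digit
        System.out.println(matchesWhole("\\D", "7"));  // false: \D excludes digits
        System.out.println(matchesWhole("\\s", "\t")); // true: a tab is white space
        System.out.println(matchesWhole("\\S", " "));  // false: \S excludes white space
        System.out.println(matchesWhole("\\w", "_"));  // true: the underscore is included
        System.out.println(matchesWhole("\\w", "5"));  // true: digits are included
        System.out.println(matchesWhole("\\W", "-"));  // true: not a letter, digit, or underscore
        // One or more word characters, one white space character, exactly three digits.
        System.out.println(matchesWhole("\\w+\\s\\d{3}", "order 123")); // true
    }
}
```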