RegexPattern Class
A class that holds regular expression patterns.
The Compile static method compiles the specified regular expression and creates a RegexPattern object. That object is then used to create RegexMatcher objects; a RegexMatcher is the regular expression engine that matches the pattern against an input string.
You can create multiple RegexMatcher objects from a single RegexPattern object. Each RegexMatcher object shares the same regular expression pattern.
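The compile-once, match-many idea can be sketched outside Biz/Browser. The following uses Python's `re` module as a stand-in, which is an assumption made for illustration and not the RegexPattern API; the second input string is hypothetical.

```python
import re

# Compile once; the compiled pattern can be reused for any number of inputs.
pattern = re.compile(r"Biz/([a-zA-Z]+)")

# Two independent scans driven by the same compiled pattern.
first_scan = pattern.finditer("Biz/Browser, Biz/Designer")
second_scan = pattern.finditer("Biz/Server")  # hypothetical input

first_names = [m.group(1) for m in first_scan]
second_names = [m.group(1) for m in second_scan]

print(first_names)   # ['Browser', 'Designer']
print(second_names)  # ['Server']
```

Each scan keeps its own position, so advancing one does not affect the other, which is the same separation the RegexPattern and RegexMatcher pair provides.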
Default properties and ValueType
The default property is Pattern. The ValueType specification is invalid.
A Unicode mode has been added that handles a wider range of Unicode strings.
About Unicode mode
If you specify RegexPattern.Unicode as an argument to the Compile method or the Matches method, the regular expression engine is created in Unicode mode.
In Unicode mode, the regular expression and the input string are treated as UString type, and Unicode characters can be used.
Indexes that represent character positions (such as the results of the Start and End methods of the RegexMatcher class) are in character units in Unicode mode and in byte units otherwise.
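The difference between character-unit and byte-unit indexes can be illustrated with a small sketch. It uses Python's `re` module rather than Biz/Browser, and it assumes that the byte offsets in non-Unicode mode are offsets into UTF-8 data; the sample text is hypothetical.

```python
import re

text = "日本語 Biz/Browser"

# Matching on a str: positions count characters (comparable to Unicode mode).
char_match = re.search(r"Biz", text)
char_start, char_end = char_match.start(), char_match.end()

# Matching on UTF-8 bytes: positions count bytes (comparable to non-Unicode mode).
byte_match = re.search(rb"Biz", text.encode("utf-8"))
byte_start, byte_end = byte_match.start(), byte_match.end()

print(char_start, char_end)  # 4 7
print(byte_start, byte_end)  # 10 13
```

Each of the three kanji occupies three bytes in UTF-8, which is why the byte offsets run ahead of the character offsets.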
Typical usage
1. Compile the regular expression with the Compile static method to create a RegexPattern object.
2. Call the Matcher method of the RegexPattern object created in step 1 to create a RegexMatcher object that pairs the regular expression pattern with the input string.
3. Perform matching with the Matches method, Find method, and so on of the RegexMatcher object created in step 2.
/* Compile the regular expression */
var p = RegexPattern.Compile("Biz/([a-zA-Z]+)");
/* Set the input string and create the regular expression engine */
var m = p.Matcher("Biz/Browser, Biz/Designer");
/* Search for partial matches */
while (m.Find()) {
    /* Show the whole of the matched part */
    print(m.Group());
    /* Show the first capture group */
    print(" [", m.Group(1), "]", "\n");
}
/* Full match against the entire input string */
print("Matches:", m.Matches(), "\n");
/* Match anchored at the beginning of the input string */
print("LookingAt:", m.LookingAt(), "\n");
----- Execution result -----
Biz/Browser [ Browser ]
Biz/Designer [ Designer ]
Matches: 0
LookingAt: 1
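The three kinds of matching used in the sample (partial search, full match, match from the beginning) can be compared side by side in a rough Python equivalent. This is a sketch with Python's `re` module; the mapping of Find, Matches, and LookingAt to `finditer`, `fullmatch`, and `match` is an assumption based on the results shown above.

```python
import re

pattern = re.compile(r"Biz/([a-zA-Z]+)")
text = "Biz/Browser, Biz/Designer"

# Partial-match search, comparable to calling Find() repeatedly.
found = [(m.group(), m.group(1)) for m in pattern.finditer(text)]

# Whole-string match, comparable to Matches(): fails because of the ", Biz/Designer" tail.
whole = pattern.fullmatch(text) is not None

# Match anchored at the beginning only, comparable to LookingAt().
prefix = pattern.match(text) is not None

print(found)   # [('Biz/Browser', 'Browser'), ('Biz/Designer', 'Designer')]
print(whole)   # False
print(prefix)  # True
```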
Regular expression syntax summary
You can use Perl-like syntax for regular expressions.
- PCRE (Perl Compatible Regular Expressions) is used in the regular expression class.
Metacharacters
| \ | Quotes (escapes) the metacharacter that immediately follows |
| ^ | Matches the beginning of the string. In multi-line mode, matches the beginning of each line. |
| . | Matches any character (except line breaks). |
| $ | Matches the end of the string. In multi-line mode, matches the end of each line. |
| | | Alternation |
| () | Grouping |
| [] | Character class |
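The effect of multi-line mode on `^`, together with `.`, `|`, and `()`, can be sketched as follows. The example uses Python's `re` module, whose handling of these metacharacters follows the same Perl conventions; `re.MULTILINE` stands in for whatever option Biz/Browser uses to enable multi-line mode, which is an assumption.

```python
import re

text = "first line\nsecond line"

# Default mode: ^ anchors only at the start of the whole string.
single = re.findall(r"^\w+", text)

# Multi-line mode: ^ also anchors right after each line break.
multi = re.findall(r"^\w+", text, re.MULTILINE)

# "." does not cross a line break, so ".*" stops at the end of the first line.
dot = re.match(r".*", text).group()

# "|" chooses between alternatives and "()" groups them.
alt = re.findall(r"(first|second) line", text)

print(single)  # ['first']
print(multi)   # ['first', 'second']
print(dot)     # first line
print(alt)     # ['first', 'second']
```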
Metacharacters that can be used in character classes
| \ | Quotes (escapes) the metacharacter that immediately follows |
| ^ | Negates the class, but only when it is the first character |
| - | Character range ※ |
※ Biz/Browser regular expressions use UTF-8 as the character encoding of their internal data. When you specify a character range, the range is determined by UTF-8 character codes. (This is always UTF-8, regardless of whether Unicode mode is specified.) Be careful when specifying a range of characters such as kanji.
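A short sketch of why ranges over kanji need care: a range covers whatever lies between the two endpoint codes, not a linguistic category. The example uses Python's `re` module, which compares code points rather than UTF-8 bytes; the two orderings agree, as the last check shows. The range `[一-龠]` and the sample characters are chosen for illustration only.

```python
import re

# A commonly seen "kanji" range, from U+4E00 to U+9FA0.
kanji_range = re.compile(r"[一-龠]")

# U+4E9C lies between the endpoints, so it is inside the range.
inside = kanji_range.fullmatch("亜") is not None

# The iteration mark U+3005 is used like a kanji but lies below the range.
outside = kanji_range.fullmatch("々") is not None

# Sorting by UTF-8 bytes gives the same order as sorting by code point.
chars = ["龠", "々", "亜", "一"]
same_order = sorted(chars) == sorted(chars, key=lambda c: c.encode("utf-8"))

print(inside, outside, same_order)  # True False True
```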
Escape sequences
| \t | Tab |
| \n | Newline (line feed) |
| \r | Carriage return |
| \f | Form feed |
| \a | Alarm (bell) |
| \e | Escape |
| \033 | Character specified in octal |
| \x1B | Character specified in hexadecimal |
| \c[ | Control character |
| \E | Ends the quoting of regular expression operators started with \Q |
| \Q | Treats all special characters up to \E as ordinary characters |
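The octal and hexadecimal forms, and the idea behind `\Q...\E`, can be sketched in Python. Python's `re` module supports `\033` and `\x1B` but has no `\e`, `\c[`, or `\Q...\E`, so `re.escape()` is used here as an approximate stand-in for quoting; that substitution is an assumption for illustration.

```python
import re

# \033 (octal) and \x1B (hexadecimal) both name the ESC character, code 27.
esc = chr(27)
octal_hit = re.fullmatch(r"\033", esc) is not None
hex_hit = re.fullmatch(r"\x1B", esc) is not None

# Quoting makes "." and "*" ordinary characters, as \Q...\E would.
literal = re.escape("a.b*c")
quoted_hit = re.fullmatch(literal, "a.b*c") is not None
quoted_miss = re.fullmatch(literal, "aXbbc") is None

# Without quoting, "." matches any character and "*" repeats the "b".
unquoted_hit = re.fullmatch(r"a.b*c", "aXbbc") is not None

print(octal_hit, hex_hit)                     # True True
print(quoted_hit, quoted_miss, unquoted_hit)  # True True True
```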
Generic character types
| \w | Matches a "word" character (letters, digits, "_") |
| \W | Matches a non-word character |
| \s | Matches a whitespace character |
| \S | Matches a non-whitespace character |
| \d | Matches a decimal digit |
| \D | Matches a character that is not a decimal digit |
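As a quick illustration of these types, the following sketch splits a hypothetical ASCII string in three ways. It uses Python's `re` module, which treats `\w`, `\d`, and `\S` the same way for ASCII input.

```python
import re

text = "item_42 costs 1,500 yen"

# \w+ : runs of letters, digits, and "_"; the comma splits "1,500".
words = re.findall(r"\w+", text)

# \d+ : runs of decimal digits only.
digits = re.findall(r"\d+", text)

# \S+ : runs of anything that is not whitespace; the comma stays inside.
non_space = re.findall(r"\S+", text)

print(words)      # ['item_42', 'costs', '1', '500', 'yen']
print(digits)     # ['42', '1', '500']
print(non_space)  # ['item_42', 'costs', '1,500', 'yen']
```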
Position specifier
| \b | Matches at a word boundary |
| \B | Matches anywhere except at a word boundary |
| \A | Matches only at the beginning of the string |
| \Z | Matches only at the end of the string or just before the newline at the end |
| \z | Matches only at the end of the string |
| \G | Matches the search start position |
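The boundary and start-of-string specifiers can be sketched with Python's `re` module, used here as a stand-in. Only `\b`, `\B`, and `\A` are shown, because Python's `\Z` does not behave like the PCRE `\Z` described above and Python has no `\G`. The sample text is hypothetical.

```python
import re

text = "cat concat cat"

# \b on both sides: "cat" as a whole word only (offsets 0 and 11).
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]

# \B before "cat": only where "cat" continues a word (inside "concat").
inside_word = [m.start() for m in re.finditer(r"\Bcat", text)]

# \A: only at the very beginning of the string, even in multi-line mode.
at_start = [m.start() for m in re.finditer(r"\Acat", text, re.MULTILINE)]

print(whole_words)  # [0, 11]
print(inside_word)  # [7]
print(at_start)     # [0]
```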
POSIX character classes [:class:]
| alnum | Alphanumeric characters |
| alpha | Alphabetic characters |
| ascii | Character codes 0 to 127 |
| blank | Space or tab |
| cntrl | Control characters |
| digit | Decimal digits |
| graph | Printable characters, excluding space |
| lower | Lowercase letters |
| print | Printable characters, including space |
| punct | Punctuation characters |
| space | Whitespace characters |
| upper | Uppercase letters |
| word | "Word" characters (letters, digits, "_") |
| xdigit | Hexadecimal digits |
Quantifiers
| * | Matches zero or more repetitions |
| + | Matches one or more repetitions |
| ? | Matches zero or one repetition |
| {n} | Matches exactly n repetitions |
| {n,} | Matches at least n repetitions |
| {n,m} | Matches between n and m repetitions |
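Each quantifier can be checked with a one-line test. The sketch below uses Python's `re` module, which shares this quantifier syntax with PCRE; the helper `fully_matches` is a hypothetical name introduced only for this example.

```python
import re

def fully_matches(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

star = fully_matches(r"ab*", "a")             # zero "b"s is allowed
plus = fully_matches(r"ab+", "a")             # at least one "b" is required
optional = fully_matches(r"colou?r", "color") # the "u" may be absent
exact = fully_matches(r"\d{3}", "1234")       # exactly three digits, four given
at_least = fully_matches(r"\d{2,}", "1234")   # two or more digits
bounded = fully_matches(r"\d{2,3}", "1234")   # two to three digits, four given

print(star, plus, optional)      # True False True
print(exact, at_least, bounded)  # False True False
```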