
Class pat.Regex

java.lang.Object
   |
   +----pat.RegRes
           |
           +----pat.Regex

public class Regex
extends RegRes
implements FilenameFilter
Shareware: package pat Copyright 1996, Steven R. Brandt

For the purpose of this documentation, the fact that Java interprets the backslash will be ignored. In practice, however, you will need a double backslash to obtain a string that contains a single backslash character. Thus, the example pattern "\b" should really be typed as "\\b" inside Java code.
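
The escaping rule can be checked with plain Java string handling. The following is a minimal sketch; the class name EscapeDemo and its helper method are made up for illustration and are not part of package pat.

```java
// Illustrative only: shows how Java source escapes map to pattern text.
class EscapeDemo {
    // The pattern text \b (two characters) as it must be typed in Java source.
    static String wordBoundary() {
        return "\\b";
    }

    public static void main(String[] args) {
        String typed = wordBoundary();
        // The compiler turns "\\b" into a backslash followed by 'b'.
        System.out.println(typed.length());          // prints 2
        System.out.println(typed.charAt(0) == '\\'); // prints true
        // By contrast, "\b" in Java source is a single backspace character.
        System.out.println("\b".length());           // prints 1
    }
}
```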

Note that Regex is part of package "pat". To use it, simply import pat.Regex at the top of your file.

A Regex is created with a constructor that takes a String defining the regular expression. Thus, for example

Regex r = new Regex("[a-c]*");
matches any number of characters so long as they are 'a', 'b', or 'c'.

To attempt to match the Pattern against a given string, you can use either the search(String) member function or the matchAt(String,int position) member function. Each returns a boolean that tells you whether the match succeeded, and updates the values returned by the "charsMatched()" and "matchFrom()" methods of the Regex object accordingly.

The portion of the string before the match can be obtained by the left() member, and the portion after the match can be obtained by the right() member.
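
The idea of a match with a "left" and a "right" remainder can be sketched with the standard java.util.regex classes. This is a stand-in, not the pat API: Matcher.find, start, and end are used in place of search, matchFrom, and charsMatched, and the class SearchDemo and its split helper are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, not the pat package.
class SearchDemo {
    // Returns {left, matched, right} for the first match, or null if none.
    static String[] split(String pattern, String text) {
        Matcher m = Pattern.compile(pattern).matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            text.substring(0, m.start()), // text before the match, like left()
            m.group(),                    // the matched text itself
            text.substring(m.end())       // text after the match, like right()
        };
    }

    public static void main(String[] args) {
        String[] parts = split("[a-c]+", "xxabcayy");
        System.out.println(parts[0]); // prints xx
        System.out.println(parts[1]); // prints abca
        System.out.println(parts[2]); // prints yy
    }
}
```

The sketch uses "[a-c]+" rather than "[a-c]*" because a pattern that can match the empty string would succeed immediately at position 0.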

Essentially, this package implements a syntax that is very much like the perl 5 regular expression syntax.

[a-c]
matches any character in the range 'a' to 'c'
[a-cde]
matches any character in the range 'a' to 'e'
[\-x]
matches the '-' character or the 'x' character
[^ab]
matches any character except 'a' or 'b'
.
matches any character
{n1,n2}
matches between n1 and n2 instances of the previous Pattern, where n1 and n2 are integers.
{n1,}
matches at least n1 instances of the previous Pattern
*
same as {0,}
?
same as {0,1}
+
same as {1,}
*?
By default, {n1,n2}, {n1,}, *, ?, and + match the largest possible number of occurrences of the preceding Pattern. If any of these is followed by a ?, it will instead attempt to match the fewest occurrences of the preceding Pattern.
(a|b)
matches the character a or b, and returns the matched character as a backreference.
(a)
matches the character "a" as a backreference.
Longer example:
Regex r = new Regex("x(a|b)y");
r.matchAt("xay",0);
System.out.println("sub = "+r.substring(0));
The above would print "sub = a".
r.left() // would return "x"
r.right() // would return "y"
.*\bfoo\b
\b matches a word boundary. Thus, the example matches "foo" and "+foo+", but not "xfoo" or "foox".
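
Since the syntax above follows perl 5 closely, several of the listed constructs can be tried out with the standard java.util.regex classes, which accept the same notation for these cases. The sketch below is a stand-in and does not call package pat; SyntaxDemo, firstMatch, and firstGroup are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, not the pat package.
class SyntaxDemo {
    // Returns the text of the first match of pattern in text, or null.
    static String firstMatch(String pattern, String text) {
        Matcher m = Pattern.compile(pattern).matcher(text);
        return m.find() ? m.group() : null;
    }

    // Returns the first parenthesized group of the first match, or null.
    static String firstGroup(String pattern, String text) {
        Matcher m = Pattern.compile(pattern).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Greedy: takes as many 'a' characters as the range allows.
        System.out.println(firstMatch("a{1,3}", "aaaa"));     // prints aaa
        // Followed by ?: takes as few as the range allows.
        System.out.println(firstMatch("a{1,3}?", "aaaa"));    // prints a
        // Parentheses capture the alternative that matched.
        System.out.println(firstGroup("x(a|b)y", "xay"));     // prints a
        // \b requires a word boundary on both sides of foo.
        System.out.println(firstMatch("\\bfoo\\b", "+foo+")); // prints foo
        System.out.println(firstMatch("\\bfoo\\b", "xfoo"));  // prints null
    }
}
```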

Differences between this package and perl5:
The extended Pattern for setting flags is now supported, but the flags are different. "(?i)" tells the pattern to ignore case, "(?Q)" sets the "dontMatchInQuotes" flag, and "(?iQ)" sets them both. You can also change the escape character. The pattern

(?e=#)#d+
is the same as
\d+
but note that the sequence
(?e=#)
must occur at the very beginning of the pattern.

This package supports additional patterns:

(?@())
This matches all characters between the '(' character and the ')' character, but is smart about it -- see next example
foo(?@[])
Matches both "foo[x]" and "foo[x[y]]"
(?< n1)
Moves the pointer backwards within the text. This allows you to make a "look behind." It fails if it attempts to move to a position before the beginning of the string.
x(?< 1)
is equivalent to (?=x).
(?< 1)\d\D(?< 1)
This Pattern matches a digit/non-digit boundary.
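
One plausible reading of the "smart" matching in (?@[]) is that it counts nesting depth, so that the match ends at the bracket closing the first one rather than at the first closing bracket seen. The sketch below shows that idea in plain Java. It is an assumption about the behavior, not the package's actual code; BalancedDemo and matchBalanced are hypothetical names.

```java
// Illustrative sketch of nesting-aware bracket matching; not code from pat.
class BalancedDemo {
    // Returns the index just past the bracket that closes the one at 'start',
    // or -1 if text.charAt(start) is not 'open' or the brackets never balance.
    static int matchBalanced(String text, int start, char open, char close) {
        if (start < 0 || start >= text.length() || text.charAt(start) != open) {
            return -1;
        }
        int depth = 0;
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == open) {
                depth++;            // entering a nested group
            } else if (c == close) {
                depth--;            // leaving a group
                if (depth == 0) {
                    return i + 1;   // the outermost group just closed
                }
            }
        }
        return -1;                  // ran out of text while still nested
    }

    public static void main(String[] args) {
        String s = "foo[x[y]]";
        int end = matchBalanced(s, 3, '[', ']');
        System.out.println(s.substring(0, end));                    // prints foo[x[y]]
        System.out.println(matchBalanced("foo[x]", 3, '[', ']'));   // prints 6
        System.out.println(matchBalanced("foo[x[y]", 3, '[', ']')); // prints -1
    }
}
```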


Variable Index

 o dontMatchInQuotes
You may now use the syntax "(?Q)" to tell Regex to not match in quotes.
 o esc
By default, the escape character is the backslash, but you can make it anything you want by setting this variable.
 o ignoreCase
You may now use the syntax "(?i)" to tell Regex to ignore case in the pattern.

Constructor Index

 o Regex()
Initializes the object without a Pattern.
 o Regex(Regex)
Essentially clones the Regex object
 o Regex(String)
Create and compile a Regex, but do not throw any exceptions.

Method Index

 o accept(File, String)
This method implements FilenameFilter, allowing one to use a Regex to search through a directory using File.list
 o add(Pattern)
Only needed for creating your own extensions of Regex.
 o clone()
A clone by any other name would smell as sweet.
 o compile(String)
This method compiles a regular expression, making it possible to call the search or matchAt methods.
 o compile1(StrPos, Rthings)
You only need to use this method if you are creating your own extensions to Regex.
 o countMaxChars()
You only need to know about this if you are inventing your own pattern elements.
 o countMinChars()
You only need to know about this if you are inventing your own pattern elements.
 o matchAt(String, int)
Attempt to match a Pattern beginning at a specified location within the string.
 o optimize()
Once this method is called, the state of variables ignoreCase and dontMatchInQuotes should not be changed as the results will be unpredictable.
 o optimized()
This function returns true if the optimize method has been called.
 o result()
Return a clone of the underlying RegRes object.
 o search(String)
Search through a string for the first occurrence of a match.
 o searchFrom(String, int)
Search through a string for the first occurrence of a match, starting at position start.
 o toString()
Converts the stored Pattern to a String
 o version()
The version of this package

Variables

 o dontMatchInQuotes
  public boolean dontMatchInQuotes
You may now use the syntax "(?Q)" to tell Regex to not match in quotes. Example of use:
// old way
Regex r = new Regex("a*b");
r.dontMatchInQuotes = true;
r.search("'ab'aab");
// r.charsMatched() now contains 3
// new way
Regex r2 = new Regex("(?Q)a*b");
r2.search("'ab'aab");
// r2.charsMatched() now contains 3

 o ignoreCase
  public boolean ignoreCase
You may now use the syntax "(?i)" to tell Regex to ignore case in the pattern. Example of use:
// old way
Regex r = new Regex("[a-c]+");
r.ignoreCase = true;
r.search("BcA");
// r.charsMatched() now contains 3
// new way
Regex r2 = new Regex("(?i)[a-c]+");
r2.search("BcA");
// r2.charsMatched() now contains 3
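
The same two styles, a flag set from code and a flag embedded in the pattern, can be compared using the standard java.util.regex classes. This is a stand-in sketch and not the pat API: Pattern.CASE_INSENSITIVE plays the role of the ignoreCase variable, and IgnoreCaseDemo and matchedLength are hypothetical names. The pattern here is "[a-c]+" so that a run of characters can match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, not the pat package.
class IgnoreCaseDemo {
    // Length of the first match, or -1 when nothing matches.
    static int matchedLength(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.end() - m.start() : -1;
    }

    public static void main(String[] args) {
        // Flag supplied from code.
        Pattern viaFlag = Pattern.compile("[a-c]+", Pattern.CASE_INSENSITIVE);
        // Flag embedded in the pattern text.
        Pattern inline = Pattern.compile("(?i)[a-c]+");
        // No flag: only the lowercase 'c' can match.
        Pattern sensitive = Pattern.compile("[a-c]+");

        System.out.println(matchedLength(viaFlag, "BcA"));   // prints 3
        System.out.println(matchedLength(inline, "BcA"));    // prints 3
        System.out.println(matchedLength(sensitive, "BcA")); // prints 1
    }
}
```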

 o esc
  public char esc
By default, the escape character is the backslash, but you can make it anything you want by setting this variable.

Constructors

 o Regex
  public Regex()
Initializes the object without a Pattern. To supply a Pattern use compile(String s).

See Also:
compile
 o Regex
  public Regex(String s)
Create and compile a Regex, but do not throw any exceptions. If you wish to have exceptions thrown for nonsensical regular expressions, you must use the Regex() constructor to create the Regex object, and then call the compile method. Therefore, you should only use this constructor when you know your pattern is right.

See Also:
Regex, search, compile
 o Regex
  public Regex(Regex r)
Essentially clones the Regex object

Methods

 o compile
  public void compile(String pat) throws RegSyntax
This method compiles a regular expression, making it possible to call the search or matchAt methods.

Throws: RegSyntax
is thrown when a nonsensical pattern is supplied. For example, x{3,1}.
See Also:
search, matchAt
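
Checking a pattern before relying on it follows the usual try/catch shape. The sketch below uses the standard java.util.regex classes as a stand-in, with PatternSyntaxException in the role that RegSyntax plays here; it does not call package pat, and CompileCheckDemo and isValid are hypothetical names.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Stand-in sketch using java.util.regex, not the pat package.
class CompileCheckDemo {
    // Returns true when the pattern compiles, false when it is rejected.
    static boolean isValid(String pattern) {
        try {
            Pattern.compile(pattern);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("x{1,3}")); // prints true
        // The upper bound is smaller than the lower bound.
        System.out.println(isValid("x{3,1}")); // prints false
        // A quantifier with nothing before it to repeat.
        System.out.println(isValid("*x"));     // prints false
    }
}
```
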
 o clone
  public Object clone()
A clone by any other name would smell as sweet.

Overrides:
clone in class RegRes
 o result
  public RegRes result()
Return a clone of the underlying RegRes object.

 o matchAt
  public boolean matchAt(String s,
                         int start_pos)
Attempt to match a Pattern beginning at a specified location within the string.

See Also:
search
 o search
  public boolean search(String s)
Search through a string for the first occurrence of a match.

See Also:
searchFrom, matchAt
 o searchFrom
  public boolean searchFrom(String s,
                            int start)
Search through a string for the first occurrence of a match, starting at position start.
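
Searching from a given offset can be sketched with the standard java.util.regex classes, where Matcher.find(int) plays a role comparable to searchFrom. This is a stand-in and not the pat API; SearchFromDemo and indexFrom are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, not the pat package.
class SearchFromDemo {
    // Index of the first match at or after 'start', or -1 if there is none.
    // 'start' must lie between 0 and text.length(), inclusive.
    static int indexFrom(String pattern, String text, int start) {
        Matcher m = Pattern.compile(pattern).matcher(text);
        return m.find(start) ? m.start() : -1;
    }

    public static void main(String[] args) {
        String text = "ab--ab";
        System.out.println(indexFrom("ab", text, 0)); // prints 0
        System.out.println(indexFrom("ab", text, 1)); // prints 4
        System.out.println(indexFrom("ab", text, 5)); // prints -1
    }
}
```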

 o add
  protected void add(Pattern p2)
Only needed for creating your own extensions of Regex. This method adds the next Pattern in the chain of patterns or sets the Pattern if it is the first call.

 o compile1
  protected void compile1(StrPos sp,
                          Rthings mk) throws RegSyntax
You only need to use this method if you are creating your own extensions to Regex. compile1 compiles one Pattern element; it can be overridden to allow the Regex compiler to understand new syntax. See deriv.java for an example. This routine is the heart of class Regex. Rthings has one integer member called intValue, which is used to keep track of the number of ()'s in the Pattern.

Throws: RegSyntax
is thrown when a nonsensical pattern is supplied. For example, a pattern beginning with *.
 o toString
  public String toString()
Converts the stored Pattern to a String

Overrides:
toString in class RegRes
 o accept
  public boolean accept(File dir,
                        String s)
This method implements FilenameFilter, allowing one to use a Regex to search through a directory using File.list
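
The general shape of a regular-expression-backed FilenameFilter can be sketched with standard classes only. The example below uses java.util.regex as a stand-in and does not call package pat. The text above does not say whether Regex tests the whole file name or searches within it; this sketch searches within the name, which is an assumption. PatternFilenameFilter is a hypothetical class name.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, not the pat package.
class PatternFilenameFilter implements FilenameFilter {
    private final Pattern pattern;

    PatternFilenameFilter(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Accept a file when the pattern is found anywhere in its name.
    @Override
    public boolean accept(File dir, String name) {
        return pattern.matcher(name).find();
    }

    public static void main(String[] args) {
        FilenameFilter javaFiles = new PatternFilenameFilter("\\.java$");
        // File.list returns null if the path is not a readable directory.
        String[] names = new File(".").list(javaFiles);
        if (names != null) {
            for (String name : names) {
                System.out.println(name);
            }
        }
    }
}
```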

 o version
  public String version()
The version of this package

 o optimize
  public void optimize()
Once this method is called, the state of variables ignoreCase and dontMatchInQuotes should not be changed as the results will be unpredictable. However, search and matchAt will run more quickly. Note that you can check to see if the pattern has been optimized by calling the optimized() method.

See Also:
optimized, ignoreCase, dontMatchInQuotes, matchAt, search, ThreadBufReader
 o optimized
  public boolean optimized()
This function returns true if the optimize method has been called.

 o countMinChars
  public patInt countMinChars()
You only need to know about this if you are inventing your own pattern elements.

 o countMaxChars
  public patInt countMaxChars()
You only need to know about this if you are inventing your own pattern elements.

