Tester's Story.

Regular Expressions (Legacy)

Regular expressions allow more complex search and replace functions to be performed in a single operation.


There are two possible sets of legacy syntax that may be used.  The first table below shows the original UltraEdit syntax used in earlier versions of UltraEdit.  The second table shows the optional "Unix" style regular expressions.  The Unix style may be enabled from the Configuration section.


Regular Expressions (UltraEdit Syntax):


Symbol        Function
%             Matches the start of line - Indicates the search string must be at the beginning of a line but does not include any line terminator characters in the resulting string selected.
$             Matches the end of line - Indicates the search string must be at the end of line but does not include any line terminator characters in the resulting string selected.
?             Matches any single character except newline.
*             Matches any number of occurrences of any character except newline.
+             Matches one or more of the preceding character/expression. At least one occurrence of the character must be found. Does not match repeated newlines.
++            Matches the preceding character/expression zero or more times. Does not match repeated newlines.
^b            Matches a page break.
^p            Matches a newline (CR/LF) (paragraph) (DOS Files)
^r            Matches a newline (CR Only) (paragraph) (MAC Files)
^n            Matches a newline (LF Only) (paragraph) (UNIX Files)
^t            Matches a tab character
[xyz]         A character set. Matches any characters between brackets.
[~xyz]        A negative character set. Matches any characters NOT between brackets including newline characters.
^{A^}^{B^}    Matches expression A OR B
^             Overrides the following regular expression character
^(…^)         Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression. The corresponding replacement expression is ^x, for x in the range 1-9. Example: If ^(h*o^) ^(f*s^) matches "hello folks", ^2 ^1 would replace it with "folks hello".


Note - ^ refers to the character '^' NOT Control Key + value.


Examples:

m?n matches "man", "men", "min" but not "moon".


t*t matches "test", "tonight" and "tea time" (the "tea t" portion) but not "tea time" when a newline separates "tea " and "time".


Te+st matches "test", "teest", "teeeest" etc. but does not match "tst".


[aeiou] matches every lowercase vowel

[,.?] matches a literal ",", "." or "?".

[0-9a-z] matches any digit, or lowercase letter

[~0-9] matches any character except a digit (~ means NOT the following)
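
The UltraEdit-syntax examples above can be checked outside the editor with a small translator. The following is only a sketch in Python: the helper names ue_to_python and ue_search are hypothetical, the mapping covers a subset of the table (no ^{A^}^{B^} alternation), * is assumed to be greedy, ^b is assumed to be a form feed, and case-insensitive matching is assumed because the examples pair "Te+st" with "test".

```python
import re

# Caret codes from the UltraEdit-syntax table. This mapping is an assumption
# made for illustration; it is not taken from the editor's implementation.
CARET_CODES = {
    "p": "\\r\\n",  # ^p  CR/LF
    "r": "\\r",     # ^r  CR only
    "n": "\\n",     # ^n  LF only
    "t": "\\t",     # ^t  tab
    "b": "\\f",     # ^b  page break (assumed to be a form feed)
    "(": "(",       # ^(  opens a tagged expression
    ")": ")",       # ^)  closes a tagged expression
}
SIMPLE = {
    "%": "^",           # start of line
    "$": "$",           # end of line
    "?": "[^\\r\\n]",   # any single character except newline
    "*": "[^\\r\\n]*",  # any run of characters except newline (greedy here)
    "+": "+",           # one or more of the preceding item
}

def ue_to_python(pattern):
    """Translate a subset of UltraEdit legacy syntax into a Python re pattern."""
    out, i = [], 0
    while i < len(pattern):
        ch, two = pattern[i], pattern[i:i + 2]
        if ch == "^" and len(two) == 2:
            # Known caret code, otherwise ^x overrides x and makes it literal.
            out.append(CARET_CODES.get(two[1], re.escape(two[1])))
            i += 2
        elif two == "++":
            out.append("*")  # zero or more of the preceding item
            i += 2
        elif ch == "[" and "]" in pattern[i + 1:]:
            end = pattern.index("]", i + 1)
            body = pattern[i + 1:end]
            if body.startswith("~"):
                body = "^" + body[1:]  # [~xyz] becomes [^xyz]
            out.append("[" + body + "]")
            i = end + 1
        elif ch in SIMPLE:
            out.append(SIMPLE[ch])
            i += 1
        else:
            out.append(re.escape(ch))  # everything else is literal text
            i += 1
    return "".join(out)

def ue_search(pattern, text):
    """Search text with an UltraEdit-syntax pattern, ignoring case."""
    return re.search(ue_to_python(pattern), text, re.IGNORECASE | re.MULTILINE)

print(ue_search("t*t", "tea time").group(0))  # tea t
print(ue_search("t*t", "tea\ntime"))          # None
print(re.sub(ue_to_python("^(h*o^) ^(f*s^)"), r"\2 \1", "hello folks"))  # folks hello
```

Caret codes inside square brackets are not handled by this sketch, so it should be treated as a way to experiment with the examples rather than a faithful model of the editor.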


You may search for an expression A or B as follows:


"^{John^}^{Tom^}"


This will search for an occurrence of John or Tom.  There should be nothing between the two expressions.


You may combine A or B and C or D in the same search as follows:


"^{John^}^{Tom^} ^{Smith^}^{Jones^}"


This will search for John or Tom followed by Smith or Jones.
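
To see what the combined search above amounts to, the ^{A^}^{B^} pairs can be rewritten as an ordinary alternation group. This is a minimal sketch in Python; the function name ue_alternation_to_python is hypothetical, only two-way pairs as documented above are handled, and all other text in the pattern is treated as literal.

```python
import re

# Two adjacent ^{...^} groups with nothing between them.
ALT_PAIR = re.compile(r"\^\{(.*?)\^\}\^\{(.*?)\^\}")

def ue_alternation_to_python(pattern):
    """Rewrite each ^{A^}^{B^} pair as (?:A|B); other text is kept literal."""
    out, pos = [], 0
    for m in ALT_PAIR.finditer(pattern):
        out.append(re.escape(pattern[pos:m.start()]))  # literal text before the pair
        a, b = (re.escape(g) for g in m.groups())
        out.append(f"(?:{a}|{b})")
        pos = m.end()
    out.append(re.escape(pattern[pos:]))  # literal text after the last pair
    return "".join(out)

rx = re.compile(ue_alternation_to_python("^{John^}^{Tom^} ^{Smith^}^{Jones^}"))
for name in ["John Smith", "Tom Jones", "John Brown"]:
    print(name, "->", bool(rx.search(name)))
```

Under these assumptions the first two names are found and "John Brown" is not, which matches the description of John or Tom followed by Smith or Jones.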


The table below shows the syntax for the "Unix" style regular expressions.


Regular Expressions (Unix Syntax):


Symbol        Function
\             Indicates the next character has a special meaning. "n" on its own matches the character "n". "\n" matches a linefeed or newline character. See examples below (\d, \f, \n etc.).
^             Matches/anchors the beginning of line.
$             Matches/anchors the end of line.
*             Matches the preceding character zero or more times.
+             Matches the preceding character one or more times. Does not match repeated newlines.
.             Matches any single character except a newline character. Does not match repeated newlines.
(expression)  Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression. The corresponding replacement expression is \x, for x in the range 1-9. Example: If (h.*o) (f.*s) matches "hello folks", \2 \1 would replace it with "folks hello".
[xyz]         A character set. Matches any characters between brackets.
[^xyz]        A negative character set. Matches any characters NOT between brackets including newline characters.
\d            Matches a digit character. Equivalent to [0-9].
\D            Matches a nondigit character. Equivalent to [^0-9].
\f            Matches a form-feed character.
\n            Matches a linefeed character.
\r            Matches a carriage return character.
\s            Matches any whitespace including space, tab, form-feed, etc., but not newline.
\S            Matches any non-whitespace character but not newline.
\t            Matches a tab character.
\v            Matches a vertical tab character.
\w            Matches any word character including underscore.
\W            Matches any nonword character.
\p            Matches CR/LF (same as \r\n) to match a DOS line terminator.


Note - ^ refers to the character '^' NOT Control Key + value.
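
The tagged-expression replacement from the table above behaves the same way in most Unix-style engines. As an illustration only, here is the "hello folks" example run through Python's re module, which is an assumption of this sketch and not the engine the editor uses. Two differences are worth noting: Python has no \p, so the DOS terminator is written \r\n, and Python's \s also matches newline characters.

```python
import re

# \1 and \2 refer to the first and second parenthesised (tagged) expression.
swapped = re.sub(r"(h.*o) (f.*s)", r"\2 \1", "hello folks")
print(swapped)  # folks hello

# \p is specific to the editor; in Python a DOS line terminator is \r\n.
dos_text = "first\r\nsecond\r\n"
print(re.split(r"\r\n", dos_text))  # ['first', 'second', '']

# Unlike the \s described in the table, Python's \s matches a newline.
print(bool(re.fullmatch(r"\s", "\n")))  # True
```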


Examples:

m.n matches "man", "men", "min" but not "moon".


Te+st matches "test", "teest", "teeeest" etc. BUT NOT "tst".


Te*st matches "test", "teest", "teeeest" etc. AND "tst".


[aeiou] matches every lowercase vowel

[,.?] matches a literal ",", "." or "?".

[0-9a-z] matches any digit, or lowercase letter

[^0-9] matches any character except a digit (^ means NOT the following)
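
The Unix-syntax examples above can be tried directly in any engine with similar syntax. A short sketch using Python's re module follows; the helper name matches is hypothetical, and case-insensitive matching is assumed because the examples pair "Te+st" with "test".

```python
import re

def matches(pattern, text):
    """Return True if the pattern is found anywhere in text, ignoring case."""
    return re.search(pattern, text, re.IGNORECASE) is not None

examples = [
    ("m.n", ["man", "men", "min", "moon"]),
    ("Te+st", ["test", "teest", "teeeest", "tst"]),
    ("Te*st", ["test", "tst"]),
    ("[^0-9]", ["2024", "20x4"]),
]
for pattern, words in examples:
    # Print each pattern with a word -> matched? mapping.
    print(pattern, {word: matches(pattern, word) for word in words})
```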


You may search for an expression A or B as follows:


"(John|Tom)"


This will search for an occurrence of John or Tom.  There should be nothing between the two expressions other than the "|" separator.


You may combine A or B and C or D in the same search as follows:


"(John|Tom) (Smith|Jones)"



This will search for John or Tom followed by Smith or Jones.
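
The combined alternation above can be sketched in Python's re module, used here only as a stand-in for a Unix-style engine. The function name find_names and the sample sentence are made up for illustration.

```python
import re

# John or Tom, one space, then Smith or Jones.
NAME = re.compile(r"(John|Tom) (Smith|Jones)")

def find_names(text):
    """Return every full match of the combined alternation in text."""
    return [m.group(0) for m in NAME.finditer(text)]

print(find_names("John Smith met Tom Jones and John Brown"))
# ['John Smith', 'Tom Jones']
```

Each parenthesised group is also a tagged expression, so \1 and \2 would be available in a replacement.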


If Regular Expressions is not selected (i.e. no usage of Regular Expressions is active) for a find/replace the following special characters are also valid in the Find and Replace fields:


Symbol        Function
^^            Matches a "^" character
^s            Is substituted with the selected (highlighted) text of the active file window.
^c            Is substituted with the contents of the clipboard.
^b            Matches a page break
^p            Matches a newline (CR/LF) (paragraph) (DOS Files)
^r            Matches a newline (CR Only) (paragraph) (MAC Files)
^n            Matches a newline (LF Only) (paragraph) (UNIX Files)
^t            Matches a tab character


Note - ^ refers to the character '^' NOT Control Key + value.
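
The special characters in the table above amount to a simple substitution on the Find or Replace text. A minimal sketch in Python is given below. The function name expand_plain_find is hypothetical, the selection and clipboard are passed in as plain strings because the editor state is not available, the page break is assumed to be a form feed, and unknown ^x sequences are assumed to be left unchanged.

```python
import re

def expand_plain_find(text, selection="", clipboard=""):
    """Expand ^-codes in a non-regex Find/Replace string (illustrative only)."""
    codes = {
        "^": "^",         # ^^  a literal caret
        "s": selection,   # ^s  selected text of the active file window
        "c": clipboard,   # ^c  clipboard contents
        "b": "\f",        # ^b  page break (assumed to be a form feed)
        "p": "\r\n",      # ^p  CR/LF
        "r": "\r",        # ^r  CR only
        "n": "\n",        # ^n  LF only
        "t": "\t",        # ^t  tab
    }
    # Scan left to right so that "^^p" yields "^p" and not a caret plus CR/LF.
    return re.sub(r"\^(.)", lambda m: codes.get(m.group(1), m.group(0)), text)

print(repr(expand_plain_find("name^tvalue^p")))        # 'name\tvalue\r\n'
print(repr(expand_plain_find("^^p")))                  # '^p'
print(repr(expand_plain_find("x=^s", selection="7")))  # 'x=7'
```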

Perl Compatible Regular Expressions are documented separately.

Posted by Tester