SSH

sshregex

sshregex — regular expressions used in file name globbing with scpg3, sftpg3 and configuration files

Description

This document describes the regular expressions (or globbing patterns) used in file name globbing with scpg3 and sftpg3 and in the sshd2_config and ssh-socks-proxy-config.xml configuration files. The regex syntax used with scpg3 and sftpg3 is zsh_fileglob.

Regex syntax: egrep

Patterns

The escape character is a backslash (\). You can use it to escape meta characters to use them in their plain character form.

In the following examples literal 'E' and 'F' denote any expression, whether a pattern or a character.

(

Start a capturing subexpression.

)

End a capturing subexpression.

E|F

Disjunction, match either E or F (inclusive). E is preferred if both match.

E*

Act as Kleene star, match E zero or more times.

E+

Closure, match E one or more times.

E?

Option, match E zero times or once.

.

Match any character except for newline characters (\n, \f, \r) and the NULL byte.

E{n}

Match E exactly n times.

E{n,} or E{n,0}

Match E n or more times.

E{,n} or E{0,n}

Match E at most n times.

E{n,m}

Match E no less than n times and no more than m times.

[

Start a character set, see the section called “Character Sets for egrep and zsh_fileglob”.

$

Match the empty string at the end of the input or at the end of a line.

^

Match the empty string at the start of the input or at the beginning of a line.
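Several of the operators above (alternation, *, +, ?, the brace quantifiers, and the dot) are also available in Python's re module, so their behavior can be tried out there. This is only an illustration under that assumption: Python's re is not the engine used by scpg3 or sftpg3, and the E{n,0} form is not accepted by Python. The helper name first_match is hypothetical.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match of pattern in text, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# E|F: the left alternative is preferred when both could match.
print(first_match(r"ab|abc", "abc"))
# E{n}, E{n,}, E{,n}, E{n,m}: bounded repetition.
print(first_match(r"a{2}", "aaaa"))
print(first_match(r"a{2,}", "aaaa"))
print(first_match(r"a{,2}", "aaaa"))
print(first_match(r"a{2,3}", "aaaa"))
# E?: optional element.
print(first_match(r"colou?r", "color"))
# (E)+: a grouped subexpression repeated one or more times.
print(first_match(r"(ab)+", "ababab"))
# The dot does not match a newline (Python's default behavior as well).
print(first_match(r"a.c", "a\nc"))
```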

Escaped Tokens for Regex Syntax Egrep

\0n..n

The literal byte with octal value n..n.

\0

The NULL byte.

\[1-9]..x

The literal byte with decimal value [1-9]..x.

\xn..n or \0xn..n

The literal byte with hexadecimal value n..n.

\<

Match the empty string at the beginning of a word.

\>

Match the empty string at the end of a word.

\b

Match the empty string at a word boundary.

\B

Match the empty string provided it is not at a word boundary.

\w

Match a word-constituent character, equivalent to [a-zA-Z0-9\-].

\W

Match a non-word-constituent character.

\a

Literal alarm character.

\e

Literal escape character.

\f

Literal form feed.

\n

Literal new line, equivalent to C's \n so it can be more than one character long.

\r

Literal carriage return.

\t

Literal tab.

All other escaped characters denote the literal character itself.
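The numeric escapes above can be read as three notations for the same byte value. The sketch below shows one possible interpretation; decode_numeric_escape is a hypothetical helper, and it assumes that the decimal form is simply a number whose first digit is 1 to 9 and that values above 255 are invalid. It is not taken from the product's implementation.

```python
def decode_numeric_escape(token: str, escape: str = "\\") -> int:
    """Return the byte value denoted by a numeric escape such as \\0101 or \\x41."""
    if not token.startswith(escape) or len(token) == len(escape):
        raise ValueError("not an escaped token: %r" % token)
    body = token[len(escape):]
    if body.startswith("0x"):        # \0xn..n : hexadecimal
        value = int(body[2:], 16)
    elif body.startswith("x"):       # \xn..n : hexadecimal
        value = int(body[1:], 16)
    elif body == "0":                # \0 : the NULL byte
        value = 0
    elif body.startswith("0"):       # \0n..n : octal
        value = int(body[1:], 8)
    elif body[0] in "123456789":     # decimal, first digit 1-9
        value = int(body, 10)
    else:
        raise ValueError("not a numeric escape: %r" % token)
    if value > 255:                  # assumption: one escape denotes one byte
        raise ValueError("value does not fit in a byte: %r" % token)
    return value

# All four spellings denote the byte for the letter 'A'.
for tok in ("\\0101", "\\65", "\\x41", "\\0x41"):
    print(tok, decode_numeric_escape(tok))
```

Passing escape="~" applies the same reading to a syntax whose escape character is a tilde.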

Regex syntax: zsh_fileglob (or traditional)

Patterns

The escape character is a backslash '\'. With this you can escape meta characters to use them in their plain character form.

In the following examples literal 'E' and 'F' denote any expression, whether a pattern or a character.

*

Match any string consisting of zero or more characters. The characters can be any characters apart from slashes (/). However, the asterisk does not match a dot (.) that is the first character of the string or that immediately follows a slash. This means that the asterisk cannot be used to match file names that have a dot as their first character.

In other words, if the previous character is a slash (/), or if the asterisk (*) is used at the beginning of a string, it does not match a leading dot (.).

That is, the asterisk (*) functions as normal in Unix shell fileglobs.

?

Match any single character except for a slash (/). However, do not match a dot (.) if located at the beginning of the string, or if the previous character is a slash (/).

That is, the question mark (?) functions as normal in Unix shell fileglobs (at least in ZSH, although discarding the dot may not be a standard procedure).

**/

Match any sequence of characters that is either empty, or ends in a slash. However, the substring '/.' is not allowed. This mimics the **/ construct in ZSH. (Please note that '**' is equivalent to '*'.)

E#

Act as Kleene star, match E zero or more times.

E##

Closure, match E one or more times.

(

Start a capturing subexpression.

)

End a capturing subexpression.

E|F

Disjunction, match either E or F (inclusive). E is preferred if both match.

[

Start a character set (see below).
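The rules for *, ? and **/ can be approximated by translating a glob into a Python regular expression. The sketch below is an illustration only: fileglob_to_regex and glob_match are hypothetical names, only *, ?, **/ and backslash escapes are handled (no E#, E##, groups, or character sets), and **/ is modeled as zero or more directory components that do not begin with a dot, which is an assumption based on zsh behavior.

```python
import re

def fileglob_to_regex(pattern: str) -> str:
    """Translate a small subset of zsh_fileglob into Python regex source."""
    out = []
    i = 0
    while i < len(pattern):
        # True at the start of the pattern or right after a slash.
        at_start = i == 0 or pattern[i - 1] == "/"
        if at_start and pattern.startswith("**/", i):
            # Zero or more directory components, none starting with a dot.
            out.append(r"(?:(?!\.)[^/]+/)*")
            i += 3
        elif pattern[i] == "*":
            # Any run of non-slash characters; no leading dot at a component start.
            out.append(r"(?!\.)[^/]*" if at_start else r"[^/]*")
            i += 1
        elif pattern[i] == "?":
            # One non-slash character; not a dot at a component start.
            out.append(r"[^/.]" if at_start else r"[^/]")
            i += 1
        elif pattern[i] == "\\" and i + 1 < len(pattern):
            # Escaped meta character: take the next character literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)

def glob_match(pattern: str, name: str) -> bool:
    """Return True if the whole name matches the glob pattern."""
    return re.fullmatch(fileglob_to_regex(pattern), name) is not None

print(glob_match("*.txt", "notes.txt"))        # ordinary file name
print(glob_match("*.txt", ".hidden.txt"))      # leading dot
print(glob_match("**/*.c", "src/lib/main.c"))  # recursive directory match
```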

Character Sets for egrep and zsh_fileglob

A character set starts with '[' and ends at the first non-escaped ']' that is not part of a POSIX character set specifier and that does not immediately follow the opening '['.

The following characters have a special meaning and need to be escaped if meant literally:

- (minus sign)

A range operator, except immediately after '[', where it loses its special meaning.

^ or ! (latter applies to ZSH_FILEGLOB)

If immediately after the starting '[', denotes a complement: the whole character set will be complemented. Otherwise literal '^'.

[:alnum:]

Characters for which 'isalnum' returns true (see ctype.h).

[:alpha:]

Characters for which 'isalpha' returns true (see ctype.h).

[:cntrl:]

Characters for which 'iscntrl' returns true (see ctype.h).

[:digit:]

Characters for which 'isdigit' returns true (see ctype.h).

[:graph:]

Characters for which 'isgraph' returns true (see ctype.h).

[:lower:]

Characters for which 'islower' returns true (see ctype.h).

[:print:]

Characters for which 'isprint' returns true (see ctype.h).

[:punct:]

Characters for which 'ispunct' returns true (see ctype.h).

[:space:]

Characters for which 'isspace' returns true (see ctype.h).

[:upper:]

Characters for which 'isupper' returns true (see ctype.h).

[:xdigit:]

Characters for which 'isxdigit' returns true (see ctype.h).

Example: [[:xdigit:]XY] is typically equivalent to [0123456789ABCDEFabcdefXY].

It is also possible to include the predefined escaped character sets into a newly defined one, so [\d\s] matches digits and whitespace characters.

Also, escape sequences resulting in literals work inside character sets.
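To experiment with POSIX class specifiers in an engine that lacks them, the specifiers can be expanded into explicit characters first. The sketch below does this for Python's re module, which does not support [:name:] inside brackets. The helper expand_posix_classes is hypothetical, covers only a few classes, and assumes the ASCII range and the "C" locale; the real classes follow the ctype.h functions of the running system.

```python
import re
import string

# Membership tests for a few POSIX classes, limited to ASCII (assumption).
CLASSES = {
    "alnum": string.ascii_letters + string.digits,
    "alpha": string.ascii_letters,
    "digit": string.digits,
    "lower": string.ascii_lowercase,
    "upper": string.ascii_uppercase,
    "xdigit": string.hexdigits,
    "space": " \t\n\r\v\f",
}

def expand_posix_classes(pattern: str) -> str:
    """Replace each [:name:] specifier with the characters it stands for."""
    def repl(match):
        members = CLASSES[match.group(1)]
        return "".join(re.escape(c) for c in members)
    return re.sub(r"\[:([a-z]+):\]", repl, pattern)

expanded = expand_posix_classes("[[:xdigit:]XY]+")
print(expanded)
print(re.fullmatch(expanded, "09afXY") is not None)
print(re.fullmatch(expanded, "g") is not None)
```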

Regex syntax: ssh

Patterns

The escape character is a tilde '~'. With this you can escape meta characters to use them in their plain character form.

Note

In configuration files, the backslash '\' is used to escape the list separator (',').

In the following examples literal 'E' and 'F' denote any expression, whether a pattern or a character.

(

Start a capturing subexpression.

)

End a capturing subexpression.

{

Start anonymous, non-capturing subexpression.

}

End anonymous, non-capturing subexpression.

E|F

Disjunction, match either E or F (inclusive). E is preferred if both match.

E*

Act as Kleene star, match E zero or more times.

E*?

Act as Kleene star, but match non-greedily (lazy match).

E+

Closure, match E one or more times.

E+?

Closure, but match non-greedily (lazy match).

E?

Option, match E zero times or once.

E??

Option, but match non-greedily (lazy match).

.

Match ANY character, including possibly the NULL byte and the newline characters.

E/n/

Match E exactly n times.

E/n,/ or E/n,0/

Match E n or more times.

E/,n/ or E/0,n/

Match E at most n times.

E/n,m/

Match E no less than n times and no more than m times.

E/n,/? , E/n,0/? , E/,n/? , E/0,n/? , E/n,m/?

The lazy (non-greedy) versions of the quantifiers above.

[

Start a character set. See the section called “Character Sets for Regex Syntax ssh”.

>C

One-character lookahead. 'C' must be either a literal character or parse as a character set. Match the empty string anywhere provided that the next character is 'C' or belongs to it.

<C

One-character lookback. As above, but examine the previous character instead of the next character.

$

Match the empty string at the end of the input.

^

Match the empty string at the start of the input.
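For readers more familiar with Perl-style notation, part of the ssh syntax above can be mapped onto Python's re syntax. The sketch below is a rough aid under stated assumptions, not a description of the actual engine: ssh_to_python is a hypothetical helper, it covers only grouping, alternation, the quantifiers, the dot, the anchors, and a handful of tilde escapes, and it rejects character sets, lookahead (>C), lookback (<C), and a '/' that is not part of a quantifier.

```python
import re

_QUANT = re.compile(r"/(\d*)(,(\d*))?/")   # /n/  /n,/  /,n/  /n,m/
_ESCAPES = {"d": "[0-9]", "D": "[^0-9]", "n": r"\n", "r": r"\r",
            "t": r"\t", "f": r"\f"}

def ssh_to_python(pattern: str) -> str:
    """Translate a subset of the ssh regex syntax into Python regex source."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "~":                       # tilde is the escape character
            if i + 1 >= len(pattern):
                raise ValueError("dangling escape character")
            nxt = pattern[i + 1]
            if nxt in _ESCAPES:
                out.append(_ESCAPES[nxt])
            elif nxt.isalnum() or nxt in "<>":
                # Other letter, digit, and word-edge escapes are outside this sketch.
                raise NotImplementedError("escape ~%s not handled" % nxt)
            else:
                out.append(re.escape(nxt))  # escaped meta character, taken literally
            i += 2
            continue
        if ch == "/":                       # slash-delimited quantifier
            m = _QUANT.match(pattern, i)
            if m is None or (m.group(2) is None and not m.group(1)):
                raise ValueError("malformed quantifier at offset %d" % i)
            low, high = m.group(1), m.group(3)
            if m.group(2) is None:
                out.append("{%s}" % low)
            else:
                # An upper bound of 0 (or none) means "no upper limit".
                upper = "" if high in ("", "0") else high
                out.append("{%s,%s}" % (low or "0", upper))
            i = m.end()
            continue
        if ch == "{":
            out.append("(?:")               # anonymous, non-capturing group
        elif ch == "}":
            out.append(")")
        elif ch == ".":
            out.append(r"[\s\S]")           # any character, newlines included
        elif ch == "^":
            out.append(r"\A")               # start of input only
        elif ch == "$":
            out.append(r"\Z")               # end of input only
        elif ch in "()|*+?":
            out.append(ch)                  # same notation in both syntaxes
        elif ch in "[<>":
            raise NotImplementedError("%r is not handled by this sketch" % ch)
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)

print(ssh_to_python("{ab|cd}/2,3/~.txt"))
print(ssh_to_python("~d/3,/?"))
```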

Escaped Tokens for Regex Syntax ssh

~0n..n

The literal byte with octal value n..n .

~0

The NULL byte.

~[1-9]..x

The literal byte with decimal value [1-9]..x .

~xn..n or ~0xn..n

The literal byte with hexadecimal value n..n .

~<

Match the empty string at the beginning of a word.

~>

Match the empty string at the end of a word.

~b

Match the empty string at a word boundary.

~B

Match the empty string provided it is not at a word boundary.

~d

Match any digit, equivalent to [0:9].

~D

Match any character except a digit.

~s

Match a whitespace character (matches space, newline, form feed, carriage return, tab and vertical tab).

~S

Match a non-whitespace character.

~w

Match a word-constituent character, equivalent to [a:zA:Z0:9-].

~W

Match a non-word-constituent character.

~a

Literal alarm character.

~e

Literal escape character.

~f

Literal form feed.

~n

Literal new line, equivalent to C's \n so it can be more than one character long.

~r

Literal carriage return.

~t

Literal tab.

All other escaped characters denote the literal character itself.

Character Sets for Regex Syntax ssh

A character set starts with '[' and ends at the first non-escaped ']' that is not part of a POSIX character set specifier and that does not immediately follow the opening '['.

The following characters have a special meaning and need to be escaped if meant literally:

:

A range operator, except immediately after '[', where it loses its special meaning.

- (minus sign)

Until the next +, the characters, ranges, and sets are subtracted from the current set instead of being added to it. If it appears as the first character after '[', subtraction starts from a set containing all characters instead of the empty set.

+

Until the next -, the characters, ranges, and sets will be added to the current set. This is the default.

[:alnum:]

Characters for which 'isalnum' returns true (see ctype.h).

[:alpha:]

Characters for which 'isalpha' returns true (see ctype.h).

[:cntrl:]

Characters for which 'iscntrl' returns true (see ctype.h).

[:digit:]

Characters for which 'isdigit' returns true (see ctype.h).

[:graph:]

Characters for which 'isgraph' returns true (see ctype.h).

[:lower:]

Characters for which 'islower' returns true (see ctype.h).

[:print:]

Characters for which 'isprint' returns true (see ctype.h).

[:punct:]

Characters for which 'ispunct' returns true (see ctype.h).

[:space:]

Characters for which 'isspace' returns true (see ctype.h).

[:upper:]

Characters for which 'isupper' returns true (see ctype.h).

[:xdigit:]

Characters for which 'isxdigit' returns true (see ctype.h).

It is also possible to include the predefined escaped character sets into a newly defined one, so [~d~s] matches digits and whitespace characters.

Also, escape sequences resulting in literals work inside character sets.

Example: [[:xdigit:]-a:e] is typically equivalent to [0123456789ABCDEFf].
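The add and subtract behavior of ssh character sets can be modeled with ordinary set arithmetic. The sketch below evaluates the text between the outer brackets; eval_ssh_charset is a hypothetical helper that handles only ':' ranges, '+', '-', tilde-escaped literals, and a few POSIX classes, and it assumes single-byte characters.

```python
import string

POSIX = {
    "alnum": set(string.ascii_letters + string.digits),
    "alpha": set(string.ascii_letters),
    "digit": set(string.digits),
    "lower": set(string.ascii_lowercase),
    "upper": set(string.ascii_uppercase),
    "xdigit": set(string.hexdigits),
}
ALL = {chr(c) for c in range(256)}   # assumption: single-byte characters

def eval_ssh_charset(spec: str) -> set:
    """Evaluate the inside of an ssh-syntax character set, e.g. '[:xdigit:]-a:e'."""
    i = 0
    adding = True
    current = set()
    if spec.startswith("-"):
        # A leading '-' subtracts from the set of all characters.
        current = set(ALL)
        adding = False
        i = 1
    while i < len(spec):
        ch = spec[i]
        if ch == "+":
            adding = True
            i += 1
            continue
        if ch == "-":
            adding = False
            i += 1
            continue
        if spec.startswith("[:", i):
            end = spec.index(":]", i + 2)       # POSIX class specifier
            chunk = POSIX[spec[i + 2:end]]
            i = end + 2
        else:
            if ch == "~":                       # escaped literal character
                i += 1
                ch = spec[i]
            i += 1
            if i + 1 < len(spec) and spec[i] == ":":
                hi = spec[i + 1]                # range low:high, inclusive
                chunk = {chr(c) for c in range(ord(ch), ord(hi) + 1)}
                i += 2
            else:
                chunk = {ch}
        current = current | chunk if adding else current - chunk
    return current

print("".join(sorted(eval_ssh_charset("[:xdigit:]-a:e"))))
```

With these assumptions the example set evaluates to the digits, the uppercase letters A to F, and the lowercase f, which agrees with the equivalence stated above.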