groff 1.23.0 added .MR to its -man macro package. The NEWS file states
that the inclusion of the macro "was prompted by its introduction to
Plan 9 from User Space's troff in August 2020." From d32deab it seems
that the name for Plan 9 from User Space's implementation was suggested
by groff maintainer G. Brandon Robinson.
Not sure if the intention was to make these definitions compatible, but
it would be nice if they were.
Currently, Plan 9 from User Space's .MR expects its second argument to
be parenthesized. groff's .MR does not. This results in extra
parentheses appearing in manual references when viewing Plan 9 from User
Space's manual pages on a system using groff.
134 lines
2.1 KiB
Groff
134 lines
2.1 KiB
Groff
.TH REGEXP 7
|
|
.SH NAME
|
|
regexp \- Plan 9 regular expression notation
|
|
.SH DESCRIPTION
|
|
This manual page describes the regular expression
|
|
syntax used by the Plan 9 regular expression library
|
|
.MR regexp 3 .
|
|
It is the form used by
|
|
.MR egrep 1
|
|
before
|
|
.I egrep
|
|
got complicated.
|
|
.PP
|
|
A
|
|
.I "regular expression"
|
|
specifies
|
|
a set of strings of characters.
|
|
A member of this set of strings is said to be
|
|
.I matched
|
|
by the regular expression. In many applications
|
|
a delimiter character, commonly
|
|
.LR / ,
|
|
bounds a regular expression.
|
|
In the following specification for regular expressions
|
|
the word `character' means any character (rune) but newline.
|
|
.PP
|
|
The syntax for a regular expression
|
|
.B e0
|
|
is
|
|
.IP
|
|
.EX
|
|
e3: literal | charclass | '.' | '^' | '$' | '(' e0 ')'
|
|
|
|
e2: e3
|
|
| e2 REP
|
|
|
|
REP: '*' | '+' | '?'
|
|
|
|
e1: e2
|
|
| e1 e2
|
|
|
|
e0: e1
|
|
| e0 '|' e1
|
|
.EE
|
|
.PP
|
|
A
|
|
.B literal
|
|
is any non-metacharacter, or a metacharacter
|
|
(one of
|
|
.BR .*+?[]()|\e^$ ),
|
|
or the delimiter
|
|
preceded by
|
|
.LR \e .
|
|
.PP
|
|
A
|
|
.B charclass
|
|
is a nonempty string
|
|
.I s
|
|
bracketed
|
|
.BI [ \|s\| ]
|
|
(or
|
|
.BI [^ s\| ]\fR);
|
|
it matches any character in (or not in)
|
|
.IR s .
|
|
A negated character class never
|
|
matches newline.
|
|
A substring
|
|
.IB a - b\f1,
|
|
with
|
|
.I a
|
|
and
|
|
.I b
|
|
in ascending
|
|
order, stands for the inclusive
|
|
range of
|
|
characters between
|
|
.I a
|
|
and
|
|
.IR b .
|
|
In
|
|
.IR s ,
|
|
the metacharacters
|
|
.LR - ,
|
|
.LR ] ,
|
|
an initial
|
|
.LR ^ ,
|
|
and the regular expression delimiter
|
|
must be preceded by a
|
|
.LR \e ;
|
|
other metacharacters
|
|
have no special meaning and
|
|
may appear unescaped.
|
|
.PP
|
|
A
|
|
.L .
|
|
matches any character.
|
|
.PP
|
|
A
|
|
.L ^
|
|
matches the beginning of a line;
|
|
.L $
|
|
matches the end of the line.
|
|
.PP
|
|
The
|
|
.B REP
|
|
operators match zero or more
|
|
.RB ( * ),
|
|
one or more
|
|
.RB ( + ),
|
|
zero or one
|
|
.RB ( ? ),
|
|
instances respectively of the preceding regular expression
|
|
.BR e2 .
|
|
.PP
|
|
A concatenated regular expression,
|
|
.BR "e1\|e2" ,
|
|
matches a match to
|
|
.B e1
|
|
followed by a match to
|
|
.BR e2 .
|
|
.PP
|
|
An alternative regular expression,
|
|
.BR "e0\||\|e1" ,
|
|
matches either a match to
|
|
.B e0
|
|
or a match to
|
|
.BR e1 .
|
|
.PP
|
|
A match to any part of a regular expression
|
|
extends as far as possible without preventing
|
|
a match to the remainder of the regular expression.
|
|
.SH "SEE ALSO"
|
|
.MR regexp 3
|