httpd-cvs mailing list archives

From pgollu...@apache.org
Subject svn commit: r598343 [14/22] - in /httpd/httpd/vendor/pcre/current: ./ doc/ doc/html/ testdata/
Date Mon, 26 Nov 2007 17:04:37 GMT
Modified: httpd/httpd/vendor/pcre/current/doc/pcrecompat.3
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/doc/pcrecompat.3?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/doc/pcrecompat.3 (original)
+++ httpd/httpd/vendor/pcre/current/doc/pcrecompat.3 Mon Nov 26 09:04:19 2007
@@ -1,16 +1,15 @@
-.TH PCRECOMPAT 3
+.TH PCRE 3
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .SH "DIFFERENCES BETWEEN PCRE AND PERL"
 .rs
 .sp
 This document describes the differences in the ways that PCRE and Perl handle
-regular expressions. The differences described here are mainly with respect to
-Perl 5.8, though PCRE versions 7.0 and later contain some features that are
-expected to be in the forthcoming Perl 5.10.
+regular expressions. The differences described here are with respect to Perl
+5.8.
 .P
-1. PCRE has only a subset of Perl's UTF-8 and Unicode support. Details of what
-it does have are given in the
+1. PCRE does not have full UTF-8 support. Details of what it does have are
+given in the
 .\" HTML <a href="pcre.html#utf8support">
 .\" </a>
 section on UTF-8 support
@@ -45,8 +44,7 @@
 6. The Perl escape sequences \ep, \eP, and \eX are supported only if PCRE is
 built with Unicode character property support. The properties that can be
 tested with \ep and \eP are limited to the general category properties such as
-Lu and Nd, script names such as Greek or Han, and the derived properties Any
-and L&.
+Lu and Nd.
 .P
 7. PCRE does support the \eQ...\eE escape for quoting substrings. Characters in
 between are treated as literals. This is slightly different from Perl in that $
@@ -64,32 +62,20 @@
 .sp
 The \eQ...\eE sequence is recognized both inside and outside character classes.
 .P
-8. Fairly obviously, PCRE does not support the (?{code}) and (??{code})
-constructions. However, there is support for recursive patterns. This is not
-available in Perl 5.8, but will be in Perl 5.10. Also, the PCRE "callout"
-feature allows an external function to be called during pattern matching. See
-the
+8. Fairly obviously, PCRE does not support the (?{code}) and (?p{code})
+constructions. However, there is support for recursive patterns using the
+non-Perl items (?R), (?number), and (?P>name). Also, the PCRE "callout" feature
+allows an external function to be called during pattern matching. See the
 .\" HREF
 \fBpcrecallout\fP
 .\"
 documentation for details.
 .P
-9. Subpatterns that are called recursively or as "subroutines" are always
-treated as atomic groups in PCRE. This is like Python, but unlike Perl.
-.P
-10. There are some differences that are concerned with the settings of captured
+9. There are some differences that are concerned with the settings of captured
 strings when part of a pattern is repeated. For example, matching "aba" against
 the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".
 .P
-11. PCRE does support Perl 5.10's backtracking verbs (*ACCEPT), (*FAIL), (*F),
-(*COMMIT), (*PRUNE), (*SKIP), and (*THEN), but only in the forms without an
-argument. PCRE does not support (*MARK). If (*ACCEPT) is within capturing
-parentheses, PCRE does not set that capture group; this is different to Perl.
-.P
-12. PCRE provides some extensions to the Perl regular expression facilities.
-Perl 5.10 will include new features that are not in earlier versions, some of
-which (such as named parentheses) have been in PCRE for some time. This list is
-with respect to Perl 5.10:
+10. PCRE provides some extensions to the Perl regular expression facilities:
 .sp
 (a) Although lookbehind assertions must match fixed length strings, each
 alternative branch of a lookbehind assertion can match a different length of
@@ -99,8 +85,7 @@
 meta-character matches only at the very end of the string.
 .sp
 (c) If PCRE_EXTRA is set, a backslash followed by a letter with no special
-meaning is faulted. Otherwise, like Perl, the backslash is quietly ignored.
-(Perl can be made to issue a warning.)
+meaning is faulted.
 .sp
 (d) If PCRE_UNGREEDY is set, the greediness of the repetition quantifiers is
 inverted, that is, by default they are not greedy, but if followed by a
@@ -112,37 +97,25 @@
 (f) The PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and PCRE_NO_AUTO_CAPTURE
 options for \fBpcre_exec()\fP have no Perl equivalents.
 .sp
-(g) The \eR escape sequence can be restricted to match only CR, LF, or CRLF
-by the PCRE_BSR_ANYCRLF option.
+(g) The (?R), (?number), and (?P>name) constructs allows for recursive pattern
+matching (Perl can do this using the (?p{code}) construct, which PCRE cannot
+support.)
 .sp
-(h) The callout facility is PCRE-specific.
+(h) PCRE supports named capturing substrings, using the Python syntax.
 .sp
-(i) The partial matching facility is PCRE-specific.
+(i) PCRE supports the possessive quantifier "++" syntax, taken from Sun's Java
+package.
 .sp
-(j) Patterns compiled by PCRE can be saved and re-used at a later time, even on
-different hosts that have the other endianness.
+(j) The (R) condition, for testing recursion, is a PCRE extension.
 .sp
-(k) The alternative matching function (\fBpcre_dfa_exec()\fP) matches in a
-different way and is not Perl-compatible.
+(k) The callout facility is PCRE-specific.
 .sp
-(l) PCRE recognizes some special sequences such as (*CR) at the start of
-a pattern that set overall options that cannot be changed within the pattern.
-.
-.
-.SH AUTHOR
-.rs
+(l) The partial matching facility is PCRE-specific.
 .sp
-.nf
-Philip Hazel
-University Computing Service
-Cambridge CB2 3QH, England.
-.fi
-.
-.
-.SH REVISION
-.rs
-.sp
-.nf
-Last updated: 11 September 2007
-Copyright (c) 1997-2007 University of Cambridge.
-.fi
+(m) Patterns compiled by PCRE can be saved and re-used at a later time, even on
+different hosts that have the other endianness.
+.P
+.in 0
+Last updated: 09 September 2004
+.br
+Copyright (c) 1997-2004 University of Cambridge.

Modified: httpd/httpd/vendor/pcre/current/doc/pcregrep.1
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/doc/pcregrep.1?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/doc/pcregrep.1 (original)
+++ httpd/httpd/vendor/pcre/current/doc/pcregrep.1 Mon Nov 26 09:04:19 2007
@@ -2,7 +2,7 @@
 .SH NAME
 pcregrep - a grep with Perl-compatible regular expressions.
 .SH SYNOPSIS
-.B pcregrep [options] [long options] [pattern] [path1 path2 ...]
+.B pcregrep [-Vcfhilnrsuvx] [long options] [pattern] [file1 file2 ...]
 .
 .SH DESCRIPTION
 .rs
@@ -11,383 +11,120 @@
 grep commands do, but it uses the PCRE regular expression library to support
 patterns that are compatible with the regular expressions of Perl 5. See
 .\" HREF
-\fBpcrepattern\fP(3)
+\fBpcrepattern\fP
 .\"
-for a full description of syntax and semantics of the regular expressions
-that PCRE supports.
+for a full description of syntax and semantics of the regular expressions that
+PCRE supports.
 .P
-Patterns, whether supplied on the command line or in a separate file, are given
-without delimiters. For example:
-.sp
-  pcregrep Thursday /etc/motd
-.sp
-If you attempt to use delimiters (for example, by surrounding a pattern with
-slashes, as is common in Perl scripts), they are interpreted as part of the
-pattern. Quotes can of course be used on the command line because they are
-interpreted by the shell, and indeed they are required if a pattern contains
-white space or shell metacharacters.
-.P
-The first argument that follows any option settings is treated as the single
-pattern to be matched when neither \fB-e\fP nor \fB-f\fP is present.
-Conversely, when one or both of these options are used to specify patterns, all
-arguments are treated as path names. At least one of \fB-e\fP, \fB-f\fP, or an
-argument pattern must be provided.
-.P
-If no files are specified, \fBpcregrep\fP reads the standard input. The
-standard input can also be referenced by a name consisting of a single hyphen.
-For example:
-.sp
-  pcregrep some-pattern /file1 - /file3
-.sp
-By default, each line that matches the pattern is copied to the standard
-output, and if there is more than one file, the file name is output at the
-start of each line. However, there are options that can change how
-\fBpcregrep\fP behaves. In particular, the \fB-M\fP option makes it possible to
-search for patterns that span line boundaries. What defines a line boundary is
-controlled by the \fB-N\fP (\fB--newline\fP) option.
-.P
-Patterns are limited to 8K or BUFSIZ characters, whichever is the greater.
-BUFSIZ is defined in \fB<stdio.h>\fP.
+A pattern must be specified on the command line unless the \fB-f\fP option is
+used (see below).
 .P
-If the \fBLC_ALL\fP or \fBLC_CTYPE\fP environment variable is set,
-\fBpcregrep\fP uses the value to set a locale when calling the PCRE library.
-The \fB--locale\fP option can be used to override this.
+If no files are specified, \fBpcregrep\fP reads the standard input. By default,
+each line that matches the pattern is copied to the standard output, and if
+there is more than one file, the file name is printed before each line of
+output. However, there are options that can change how \fBpcregrep\fP behaves.
+.P
+Lines are limited to BUFSIZ characters. BUFSIZ is defined in \fB<stdio.h>\fP.
+The newline character is removed from the end of each line before it is matched
+against the pattern.
 .
 .SH OPTIONS
 .rs
+.sp
 .TP 10
-\fB--\fP
-This terminate the list of options. It is useful if the next item on the
-command line starts with a hyphen but is not an option. This allows for the
-processing of patterns and filenames that start with hyphens.
-.TP
-\fB-A\fP \fInumber\fP, \fB--after-context=\fP\fInumber\fP
-Output \fInumber\fP lines of context after each matching line. If filenames
-and/or line numbers are being output, a hyphen separator is used instead of a
-colon for the context lines. A line containing "--" is output between each
-group of lines, unless they are in fact contiguous in the input file. The value
-of \fInumber\fP is expected to be relatively small. However, \fBpcregrep\fP
-guarantees to have up to 8K of following text available for context output.
-.TP
-\fB-B\fP \fInumber\fP, \fB--before-context=\fP\fInumber\fP
-Output \fInumber\fP lines of context before each matching line. If filenames
-and/or line numbers are being output, a hyphen separator is used instead of a
-colon for the context lines. A line containing "--" is output between each
-group of lines, unless they are in fact contiguous in the input file. The value
-of \fInumber\fP is expected to be relatively small. However, \fBpcregrep\fP
-guarantees to have up to 8K of preceding text available for context output.
-.TP
-\fB-C\fP \fInumber\fP, \fB--context=\fP\fInumber\fP
-Output \fInumber\fP lines of context both before and after each matching line.
-This is equivalent to setting both \fB-A\fP and \fB-B\fP to the same value.
-.TP
-\fB-c\fP, \fB--count\fP
-Do not output individual lines; instead just output a count of the number of
-lines that would otherwise have been output. If several files are given, a
-count is output for each of them. In this mode, the \fB-A\fP, \fB-B\fP, and
-\fB-C\fP options are ignored.
-.TP
-\fB--colour\fP, \fB--color\fP
-If this option is given without any data, it is equivalent to "--colour=auto".
-If data is required, it must be given in the same shell item, separated by an
-equals sign.
-.TP
-\fB--colour=\fP\fIvalue\fP, \fB--color=\fP\fIvalue\fP
-This option specifies under what circumstances the part of a line that matched
-a pattern should be coloured in the output. The value may be "never" (the
-default), "always", or "auto". In the latter case, colouring happens only if
-the standard output is connected to a terminal. The colour can be specified by
-setting the environment variable PCREGREP_COLOUR or PCREGREP_COLOR. The value
-of this variable should be a string of two numbers, separated by a semicolon.
-They are copied directly into the control string for setting colour on a
-terminal, so it is your responsibility to ensure that they make sense. If
-neither of the environment variables is set, the default is "1;31", which gives
-red.
-.TP
-\fB-D\fP \fIaction\fP, \fB--devices=\fP\fIaction\fP
-If an input path is not a regular file or a directory, "action" specifies how
-it is to be processed. Valid values are "read" (the default) or "skip"
-(silently skip the path).
-.TP
-\fB-d\fP \fIaction\fP, \fB--directories=\fP\fIaction\fP
-If an input path is a directory, "action" specifies how it is to be processed.
-Valid values are "read" (the default), "recurse" (equivalent to the \fB-r\fP
-option), or "skip" (silently skip the path). In the default case, directories
-are read as if they were ordinary files. In some operating systems the effect
-of reading a directory like this is an immediate end-of-file.
-.TP
-\fB-e\fP \fIpattern\fP, \fB--regex=\fP\fIpattern\fP,
-\fB--regexp=\fP\fIpattern\fP Specify a pattern to be matched. This option can
-be used multiple times in order to specify several patterns. It can also be
-used as a way of specifying a single pattern that starts with a hyphen. When
-\fB-e\fP is used, no argument pattern is taken from the command line; all
-arguments are treated as file names. There is an overall maximum of 100
-patterns. They are applied to each line in the order in which they are defined
-until one matches (or fails to match if \fB-v\fP is used). If \fB-f\fP is used
-with \fB-e\fP, the command line patterns are matched first, followed by the
-patterns from the file, independent of the order in which these options are
-specified. Note that multiple use of \fB-e\fP is not the same as a single
-pattern with alternatives. For example, X|Y finds the first character in a line
-that is X or Y, whereas if the two patterns are given separately,
-\fBpcregrep\fP finds X if it is present, even if it follows Y in the line. It
-finds Y only if there is no X in the line. This really matters only if you are
-using \fB-o\fP to show the portion of the line that matched.
-.TP
-\fB--exclude\fP=\fIpattern\fP
-When \fBpcregrep\fP is searching the files in a directory as a consequence of
-the \fB-r\fP (recursive search) option, any files whose names match the pattern
-are excluded. The pattern is a PCRE regular expression. If a file name matches
-both \fB--include\fP and \fB--exclude\fP, it is excluded. There is no short
-form for this option.
-.TP
-\fB-F\fP, \fB--fixed-strings\fP
-Interpret each pattern as a list of fixed strings, separated by newlines,
-instead of as a regular expression. The \fB-w\fP (match as a word) and \fB-x\fP
-(match whole line) options can be used with \fB-F\fP. They apply to each of the
-fixed strings. A line is selected if any of the fixed strings are found in it
-(subject to \fB-w\fP or \fB-x\fP, if present).
-.TP
-\fB-f\fP \fIfilename\fP, \fB--file=\fP\fIfilename\fP
-Read a number of patterns from the file, one per line, and match them against
-each line of input. A data line is output if any of the patterns match it. The
-filename can be given as "-" to refer to the standard input. When \fB-f\fP is
-used, patterns specified on the command line using \fB-e\fP may also be
-present; they are tested before the file's patterns. However, no other pattern
-is taken from the command line; all arguments are treated as file names. There
-is an overall maximum of 100 patterns. Trailing white space is removed from
-each line, and blank lines are ignored. An empty file contains no patterns and
-therefore matches nothing.
-.TP
-\fB-H\fP, \fB--with-filename\fP
-Force the inclusion of the filename at the start of output lines when searching
-a single file. By default, the filename is not shown in this case. For matching
-lines, the filename is followed by a colon and a space; for context lines, a
-hyphen separator is used. If a line number is also being output, it follows the
-file name without a space.
-.TP
-\fB-h\fP, \fB--no-filename\fP
-Suppress the output filenames when searching multiple files. By default,
-filenames are shown when multiple files are searched. For matching lines, the
-filename is followed by a colon and a space; for context lines, a hyphen
-separator is used. If a line number is also being output, it follows the file
-name without a space.
+\fB-V\fP
+Write the version number of the PCRE library being used to the standard error
+stream.
+.TP
+\fB-c\fP
+Do not print individual lines; instead just print a count of the number of
+lines that would otherwise have been printed. If several files are given, a
+count is printed for each of them.
+.TP
+\fB-f\fP\fIfilename\fP
+Read a number of patterns from the file, one per line, and match all of them
+against each line of input. A line is output if any of the patterns match it.
+When \fB-f\fP is used, no pattern is taken from the command line; all arguments
+are treated as file names. There is a maximum of 100 patterns. Trailing white
+space is removed, and blank lines are ignored. An empty file contains no
+patterns and therefore matches nothing.
 .TP
-\fB--help\fP
-Output a brief help message and exit.
+\fB-h\fP
+Suppress printing of filenames when searching multiple files.
 .TP
-\fB-i\fP, \fB--ignore-case\fP
+\fB-i\fP
 Ignore upper/lower case distinctions during comparisons.
 .TP
-\fB--include\fP=\fIpattern\fP
-When \fBpcregrep\fP is searching the files in a directory as a consequence of
-the \fB-r\fP (recursive search) option, only those files whose names match the
-pattern are included. The pattern is a PCRE regular expression. If a file name
-matches both \fB--include\fP and \fB--exclude\fP, it is excluded. There is no
-short form for this option.
-.TP
-\fB-L\fP, \fB--files-without-match\fP
-Instead of outputting lines from the files, just output the names of the files
-that do not contain any lines that would have been output. Each file name is
-output once, on a separate line.
-.TP
-\fB-l\fP, \fB--files-with-matches\fP
-Instead of outputting lines from the files, just output the names of the files
-containing lines that would have been output. Each file name is output
-once, on a separate line. Searching stops as soon as a matching line is found
-in a file.
-.TP
-\fB--label\fP=\fIname\fP
-This option supplies a name to be used for the standard input when file names
-are being output. If not supplied, "(standard input)" is used. There is no
-short form for this option.
-.TP
-\fB--locale\fP=\fIlocale-name\fP
-This option specifies a locale to be used for pattern matching. It overrides
-the value in the \fBLC_ALL\fP or \fBLC_CTYPE\fP environment variables. If no
-locale is specified, the PCRE library's default (usually the "C" locale) is
-used. There is no short form for this option.
-.TP
-\fB-M\fP, \fB--multiline\fP
-Allow patterns to match more than one line. When this option is given, patterns
-may usefully contain literal newline characters and internal occurrences of ^
-and $ characters. The output for any one match may consist of more than one
-line. When this option is set, the PCRE library is called in "multiline" mode.
-There is a limit to the number of lines that can be matched, imposed by the way
-that \fBpcregrep\fP buffers the input file as it scans it. However,
-\fBpcregrep\fP ensures that at least 8K characters or the rest of the document
-(whichever is the shorter) are available for forward matching, and similarly
-the previous 8K characters (or all the previous characters, if fewer than 8K)
-are guaranteed to be available for lookbehind assertions.
-.TP
-\fB-N\fP \fInewline-type\fP, \fB--newline=\fP\fInewline-type\fP
-The PCRE library supports five different conventions for indicating
-the ends of lines. They are the single-character sequences CR (carriage return)
-and LF (linefeed), the two-character sequence CRLF, an "anycrlf" convention,
-which recognizes any of the preceding three types, and an "any" convention, in
-which any Unicode line ending sequence is assumed to end a line. The Unicode
-sequences are the three just mentioned, plus VT (vertical tab, U+000B), FF
-(formfeed, U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and
-PS (paragraph separator, U+2029).
-.sp
-When the PCRE library is built, a default line-ending sequence is specified.
-This is normally the standard sequence for the operating system. Unless
-otherwise specified by this option, \fBpcregrep\fP uses the library's default.
-The possible values for this option are CR, LF, CRLF, ANYCRLF, or ANY. This
-makes it possible to use \fBpcregrep\fP on files that have come from other
-environments without having to modify their line endings. If the data that is
-being scanned does not agree with the convention set by this option,
-\fBpcregrep\fP may behave in strange ways.
-.TP
-\fB-n\fP, \fB--line-number\fP
-Precede each output line by its line number in the file, followed by a colon
-and a space for matching lines or a hyphen and a space for context lines. If
-the filename is also being output, it precedes the line number.
-.TP
-\fB-o\fP, \fB--only-matching\fP
-Show only the part of the line that matched a pattern. In this mode, no
-context is shown. That is, the \fB-A\fP, \fB-B\fP, and \fB-C\fP options are
-ignored.
-.TP
-\fB-q\fP, \fB--quiet\fP
-Work quietly, that is, display nothing except error messages. The exit
-status indicates whether or not any matches were found.
-.TP
-\fB-r\fP, \fB--recursive\fP
-If any given path is a directory, recursively scan the files it contains,
-taking note of any \fB--include\fP and \fB--exclude\fP settings. By default, a
-directory is read as a normal file; in some operating systems this gives an
-immediate end-of-file. This option is a shorthand for setting the \fB-d\fP
-option to "recurse".
-.TP
-\fB-s\fP, \fB--no-messages\fP
-Suppress error messages about non-existent or unreadable files. Such files are
-quietly skipped. However, the return code is still 2, even if matches were
-found in other files.
+\fB-l\fP
+Instead of printing lines from the files, just print the names of the files
+containing lines that would have been printed. Each file name is printed
+once, on a separate line.
+.TP
+\fB-n\fP
+Precede each line by its line number in the file.
+.TP
+\fB-r\fP
+If any file is a directory, recursively scan the files it contains. Without
+\fB-r\fP a directory is scanned as a normal file.
+.TP
+\fB-s\fP
+Work silently, that is, display nothing except error messages.
+The exit status indicates whether any matches were found.
 .TP
-\fB-u\fP, \fB--utf-8\fP
+\fB-u\fP
 Operate in UTF-8 mode. This option is available only if PCRE has been compiled
-with UTF-8 support. Both patterns and subject lines must be valid strings of
-UTF-8 characters.
+with UTF-8 support. Both the pattern and each subject line must be valid
+strings of UTF-8 characters.
 .TP
-\fB-V\fP, \fB--version\fP
-Write the version numbers of \fBpcregrep\fP and the PCRE library that is being
-used to the standard error stream.
-.TP
-\fB-v\fP, \fB--invert-match\fP
-Invert the sense of the match, so that lines which do \fInot\fP match any of
-the patterns are the ones that are found.
-.TP
-\fB-w\fP, \fB--word-regex\fP, \fB--word-regexp\fP
-Force the patterns to match only whole words. This is equivalent to having \eb
-at the start and end of the pattern.
-.TP
-\fB-x\fP, \fB--line-regex\fP, \fB--line-regexp\fP
-Force the patterns to be anchored (each must start matching at the beginning of
-a line) and in addition, require them to match entire lines. This is
+\fB-v\fP
+Invert the sense of the match, so that lines which do \fInot\fP match the
+pattern are now the ones that are found.
+.TP
+\fB-x\fP
+Force the pattern to be anchored (it must start matching at the beginning of
+the line) and in addition, require it to match the entire line. This is
 equivalent to having ^ and $ characters at the start and end of each
-alternative branch in every pattern.
-.
+alternative branch in the regular expression.
 .
-.SH "ENVIRONMENT VARIABLES"
+.SH "LONG OPTIONS"
 .rs
 .sp
-The environment variables \fBLC_ALL\fP and \fBLC_CTYPE\fP are examined, in that
-order, for a locale. The first one that is set is used. This can be overridden
-by the \fB--locale\fP option. If no locale is set, the PCRE library's default
-(usually the "C" locale) is used.
-.
-.
-.SH "NEWLINES"
-.rs
+Long forms of all the options are available, as in GNU grep. They are shown in
+the following table:
 .sp
-The \fB-N\fP (\fB--newline\fP) option allows \fBpcregrep\fP to scan files with
-different newline conventions from the default. However, the setting of this
-option does not affect the way in which \fBpcregrep\fP writes information to
-the standard error and output streams. It uses the string "\en" in C
-\fBprintf()\fP calls to indicate newlines, relying on the C I/O library to
-convert this to an appropriate sequence if the output is sent to a file.
-.
-.
-.SH "OPTIONS COMPATIBILITY"
-.rs
+  -c   --count
+  -h   --no-filename
+  -i   --ignore-case
+  -l   --files-with-matches
+  -n   --line-number
+  -r   --recursive
+  -s   --no-messages
+  -u   --utf-8
+  -V   --version
+  -v   --invert-match
+  -x   --line-regex
+  -x   --line-regexp
 .sp
-The majority of short and long forms of \fBpcregrep\fP's options are the same
-as in the GNU \fBgrep\fP program. Any long option of the form
-\fB--xxx-regexp\fP (GNU terminology) is also available as \fB--xxx-regex\fP
-(PCRE terminology). However, the \fB--locale\fP, \fB-M\fP, \fB--multiline\fP,
-\fB-u\fP, and \fB--utf-8\fP options are specific to \fBpcregrep\fP.
-.
-.
-.SH "OPTIONS WITH DATA"
-.rs
-.sp
-There are four different ways in which an option with data can be specified.
-If a short form option is used, the data may follow immediately, or in the next
-command line item. For example:
-.sp
-  -f/some/file
-  -f /some/file
-.sp
-If a long form option is used, the data may appear in the same command line
-item, separated by an equals character, or (with one exception) it may appear
-in the next command line item. For example:
-.sp
-  --file=/some/file
-  --file /some/file
-.sp
-Note, however, that if you want to supply a file name beginning with ~ as data
-in a shell command, and have the shell expand ~ to a home directory, you must
-separate the file name from the option, because the shell does not treat ~
-specially unless it is at the start of an item.
-.P
-The exception to the above is the \fB--colour\fP (or \fB--color\fP) option,
-for which the data is optional. If this option does have data, it must be given
-in the first form, using an equals character. Otherwise it will be assumed that
-it has no data.
-.
-.
-.SH "MATCHING ERRORS"
-.rs
-.sp
-It is possible to supply a regular expression that takes a very long time to
-fail to match certain lines. Such patterns normally involve nested indefinite
-repeats, for example: (a+)*\ed when matched against a line of a's with no final
-digit. The PCRE matching function has a resource limit that causes it to abort
-in these circumstances. If this happens, \fBpcregrep\fP outputs an error
-message and the line that caused the problem to the standard error stream. If
-there are more than 20 such errors, \fBpcregrep\fP gives up.
-.
+In addition, --file=\fIfilename\fP is equivalent to -f\fIfilename\fP, and
+--help shows the list of options and then exits.
 .
 .SH DIAGNOSTICS
 .rs
 .sp
 Exit status is 0 if any matches were found, 1 if no matches were found, and 2
-for syntax errors and non-existent or inacessible files (even if matches were
-found in other files) or too many matching errors. Using the \fB-s\fP option to
-suppress error messages about inaccessble files does not affect the return
-code.
-.
-.
-.SH "SEE ALSO"
-.rs
-.sp
-\fBpcrepattern\fP(3), \fBpcretest\fP(1).
+for syntax errors or inacessible files (even if matches were found).
 .
 .
 .SH AUTHOR
 .rs
 .sp
-.nf
-Philip Hazel
+Philip Hazel <ph10@cam.ac.uk>
+.br
 University Computing Service
-Cambridge CB2 3QH, England.
-.fi
-.
-.
-.SH REVISION
-.rs
-.sp
-.nf
-Last updated: 16 April 2007
-Copyright (c) 1997-2007 University of Cambridge.
-.fi
+.br
+Cambridge CB2 3QG, England.
+.P
+.in 0
+Last updated: 09 September 2004
+.br
+Copyright (c) 1997-2004 University of Cambridge.

Modified: httpd/httpd/vendor/pcre/current/doc/pcregrep.txt
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/doc/pcregrep.txt?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/doc/pcregrep.txt (original)
+++ httpd/httpd/vendor/pcre/current/doc/pcregrep.txt Mon Nov 26 09:04:19 2007
@@ -1,12 +1,12 @@
 PCREGREP(1)                                                        PCREGREP(1)
 
 
+
 NAME
        pcregrep - a grep with Perl-compatible regular expressions.
 
-
 SYNOPSIS
-       pcregrep [options] [long options] [pattern] [path1 path2 ...]
+       pcregrep [-Vcfhilnrsuvx] [long options] [pattern] [file1 file2 ...]
 
 
 DESCRIPTION
@@ -14,402 +14,109 @@
        pcregrep  searches  files  for  character  patterns, in the same way as
        other grep commands do, but it uses the PCRE regular expression library
        to support patterns that are compatible with the regular expressions of
-       Perl 5. See pcrepattern(3) for a full description of syntax and  seman-
-       tics of the regular expressions that PCRE supports.
-
-       Patterns,  whether  supplied on the command line or in a separate file,
-       are given without delimiters. For example:
+       Perl 5. See pcrepattern for a full description of syntax and  semantics
+       of the regular expressions that PCRE supports.
 
-         pcregrep Thursday /etc/motd
+       A pattern must be specified on the command line unless the -f option is
+       used (see below).
 
-       If you attempt to use delimiters (for example, by surrounding a pattern
-       with  slashes,  as  is common in Perl scripts), they are interpreted as
-       part of the pattern. Quotes can of course be used on the  command  line
-       because they are interpreted by the shell, and indeed they are required
-       if a pattern contains white space or shell metacharacters.
-
-       The first argument that follows any option settings is treated  as  the
-       single  pattern  to be matched when neither -e nor -f is present.  Con-
-       versely, when one or both of these options are  used  to  specify  pat-
-       terns, all arguments are treated as path names. At least one of -e, -f,
-       or an argument pattern must be provided.
-
-       If no files are specified, pcregrep reads the standard input. The stan-
-       dard  input  can  also  be  referenced by a name consisting of a single
-       hyphen.  For example:
-
-         pcregrep some-pattern /file1 - /file3
-
-       By default, each line that matches the pattern is copied to  the  stan-
-       dard  output, and if there is more than one file, the file name is out-
-       put at the start of each line. However,  there  are  options  that  can
-       change how pcregrep behaves. In particular, the -M option makes it pos-
-       sible to search for patterns that span line boundaries. What defines  a
-       line boundary is controlled by the -N (--newline) option.
-
-       Patterns  are  limited  to  8K  or  BUFSIZ characters, whichever is the
-       greater.  BUFSIZ is defined in <stdio.h>.
-
-       If the LC_ALL or LC_CTYPE environment variable is  set,  pcregrep  uses
-       the  value to set a locale when calling the PCRE library.  The --locale
-       option can be used to override this.
+       If no files are  specified,  pcregrep  reads  the  standard  input.  By
+       default,  each  line that matches the pattern is copied to the standard
+       output, and if there is more than one file, the file  name  is  printed
+       before  each line of output. However, there are options that can change
+       how pcregrep behaves.
+
+       Lines are limited to BUFSIZ characters. BUFSIZ is defined in <stdio.h>.
+       The newline character is removed from the end of each line before it is
+       matched against the pattern.
 
 
 OPTIONS
 
-       --        This terminates the list of options. It is useful if the next
-                 item  on  the command line starts with a hyphen but is not an
-                 option. This allows for the processing of patterns and  file-
-                 names that start with hyphens.
-
-       -A number, --after-context=number
-                 Output  number  lines of context after each matching line. If
-                 filenames and/or line numbers are being output, a hyphen sep-
-                 arator  is  used  instead of a colon for the context lines. A
-                 line containing "--" is output between each group  of  lines,
-                 unless  they  are  in  fact contiguous in the input file. The
-                 value of number is expected to be relatively small.  However,
-                 pcregrep guarantees to have up to 8K of following text avail-
-                 able for context output.
-
-       -B number, --before-context=number
-                 Output number lines of context before each matching line.  If
-                 filenames and/or line numbers are being output, a hyphen sep-
-                 arator is used instead of a colon for the  context  lines.  A
-                 line  containing  "--" is output between each group of lines,
-                 unless they are in fact contiguous in  the  input  file.  The
-                 value  of number is expected to be relatively small. However,
-                 pcregrep guarantees to have up to 8K of preceding text avail-
-                 able for context output.
-
-       -C number, --context=number
-                 Output  number  lines  of  context both before and after each
-                 matching line.  This is equivalent to setting both -A and  -B
-                 to the same value.
-
-       -c, --count
-                 Do  not  output individual lines; instead just output a count
-                 of the number of lines that would otherwise have been output.
-                 If  several  files  are  given, a count is output for each of
-                 them. In this mode, the -A, -B, and -C options are ignored.
-
-       --colour, --color
-                 If this option is given without any data, it is equivalent to
-                 "--colour=auto".   If  data  is required, it must be given in
-                 the same shell item, separated by an equals sign.
-
-       --colour=value, --color=value
-                 This option specifies under what circumstances the part of  a
-                 line that matched a pattern should be coloured in the output.
-                 The value may be "never" (the default), "always", or  "auto".
-                 In  the  latter  case, colouring happens only if the standard
-                 output is connected to a terminal. The colour can  be  speci-
-                 fied  by  setting the environment variable PCREGREP_COLOUR or
-                 PCREGREP_COLOR. The value of this variable should be a string
-                 of  two  numbers,  separated by a semicolon.  They are copied
-                 directly into the control string for setting colour on a ter-
-                 minal,  so it is your responsibility to ensure that they make
-                 sense. If neither of the environment variables  is  set,  the
-                 default is "1;31", which gives red.
-
-       -D action, --devices=action
-                 If  an  input  path  is  not  a  regular file or a directory,
-                 "action" specifies how it is to be  processed.  Valid  values
-                 are  "read" (the default) or "skip" (silently skip the path).
-
-       -d action, --directories=action
-                 If an input path is a directory, "action" specifies how it is
-                 to  be  processed.   Valid  values  are "read" (the default),
-                 "recurse" (equivalent to the -r option), or "skip"  (silently
-                 skip  the path). In the default case, directories are read as
-                 if they were ordinary files. In some  operating  systems  the
-                 effect  of reading a directory like this is an immediate end-
-                 of-file.
-
-       -e pattern, --regex=pattern,
-                 --regexp=pattern Specify a pattern to be matched. This option
-                 can  be  used multiple times in order to specify several pat-
-                 terns. It can also be used as a way of  specifying  a  single
-                 pattern  that starts with a hyphen. When -e is used, no argu-
-                 ment pattern is taken from the command  line;  all  arguments
-                 are treated as file names. There is an overall maximum of 100
-                 patterns. They are applied to each line in the order in which
-                 they  are  defined until one matches (or fails to match if -v
-                 is used). If -f is used with -e, the  command  line  patterns
-                 are  matched  first,  followed by the patterns from the file,
-                 independent of the order in which these  options  are  speci-
-                 fied.  Note that multiple use of -e is not the same as a sin-
-                 gle pattern with alternatives. For  example,  X|Y  finds  the
-                 first  character in a line that is X or Y, whereas if the two
-                 patterns are given separately, pcregrep  finds  X  if  it  is
-                 present, even if it follows Y in the line. It finds Y only if
-                 there is no X in the line. This really matters  only  if  you
-                 are using -o to show the portion of the line that matched.
-
-       --exclude=pattern
-                 When pcregrep is searching the files in a directory as a con-
-                 sequence of the -r (recursive search) option, any files whose
-                 names  match  the pattern are excluded. The pattern is a PCRE
-                 regular expression. If a file name matches both --include and
-                 --exclude,  it  is  excluded. There is no short form for this
-                 option.
-
-       -F, --fixed-strings
-                 Interpret each pattern as a list of fixed strings,  separated
-                 by  newlines,  instead  of  as  a  regular expression. The -w
-                 (match as a word) and -x (match whole line)  options  can  be
-                 used with -F. They apply to each of the fixed strings. A line
-                 is selected if any of the fixed strings are found in it (sub-
-                 ject to -w or -x, if present).
-
-       -f filename, --file=filename
-                 Read  a  number  of patterns from the file, one per line, and
-                 match them against each line of input. A data line is  output
-                 if any of the patterns match it. The filename can be given as
-                 "-" to refer to the standard input. When -f is used, patterns
-                 specified  on  the command line using -e may also be present;
-                 they are tested before the file's patterns. However, no other
-                 pattern  is  taken  from  the command line; all arguments are
-                 treated as file names. There is an  overall  maximum  of  100
-                 patterns. Trailing white space is removed from each line, and
-                 blank lines are ignored. An empty file contains  no  patterns
-                 and therefore matches nothing.
-
-       -H, --with-filename
-                 Force  the  inclusion  of the filename at the start of output
-                 lines when searching a single file. By default, the  filename
-                 is  not  shown in this case. For matching lines, the filename
-                 is followed by a colon and a  space;  for  context  lines,  a
-                 hyphen separator is used. If a line number is also being out-
-                 put, it follows the file name without a space.
-
-       -h, --no-filename
-                 Suppress the output filenames when searching multiple  files.
-                 By  default,  filenames  are  shown  when  multiple files are
-                 searched. For matching lines, the filename is followed  by  a
-                 colon  and  a space; for context lines, a hyphen separator is
-                 used. If a line number is also being output, it  follows  the
-                 file name without a space.
-
-       --help    Output a brief help message and exit.
-
-       -i, --ignore-case
-                 Ignore upper/lower case distinctions during comparisons.
-
-       --include=pattern
-                 When pcregrep is searching the files in a directory as a con-
-                 sequence of the -r  (recursive  search)  option,  only  those
-                 files whose names match the pattern are included. The pattern
-                 is a PCRE regular expression. If a  file  name  matches  both
-                 --include  and  --exclude,  it is excluded. There is no short
-                 form for this option.
-
-       -L, --files-without-match
-                 Instead of outputting lines from the files, just  output  the
-                 names  of  the files that do not contain any lines that would
-                 have been output. Each file name is output once, on  a  sepa-
-                 rate line.
-
-       -l, --files-with-matches
-                 Instead  of  outputting lines from the files, just output the
-                 names of the files containing lines that would have been out-
-                 put.  Each  file  name  is  output  once, on a separate line.
-                 Searching stops as soon as a matching  line  is  found  in  a
-                 file.
-
-       --label=name
-                 This option supplies a name to be used for the standard input
-                 when file names are being output. If not supplied, "(standard
-                 input)" is used. There is no short form for this option.
-
-       --locale=locale-name
-                 This  option specifies a locale to be used for pattern match-
-                 ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
-                 ronment  variables.  If  no  locale  is  specified,  the PCRE
-                 library's default (usually the "C" locale) is used. There  is
-                 no short form for this option.
-
-       -M, --multiline
-                 Allow  patterns to match more than one line. When this option
-                 is given, patterns may usefully contain literal newline char-
-                 acters  and  internal  occurrences of ^ and $ characters. The
-                 output for any one match may consist of more than  one  line.
-                 When  this option is set, the PCRE library is called in "mul-
-                 tiline" mode.  There is a limit to the number of  lines  that
-                 can  be matched, imposed by the way that pcregrep buffers the
-                 input file as it scans it. However, pcregrep ensures that  at
-                 least 8K characters or the rest of the document (whichever is
-                 the shorter) are available for forward  matching,  and  simi-
-                 larly the previous 8K characters (or all the previous charac-
-                 ters, if fewer than 8K) are guaranteed to  be  available  for
-                 lookbehind assertions.
-
-       -N newline-type, --newline=newline-type
-                 The  PCRE  library  supports  five  different conventions for
-                 indicating the ends of lines. They are  the  single-character
-                 sequences  CR  (carriage  return) and LF (linefeed), the two-
-                 character sequence CRLF, an "anycrlf" convention, which  rec-
-                 ognizes  any  of the preceding three types, and an "any" con-
-                 vention, in which any Unicode line ending sequence is assumed
-                 to  end a line. The Unicode sequences are the three just men-
-                 tioned,  plus  VT  (vertical  tab,  U+000B),  FF   (formfeed,
-                 U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,
-                 U+2028), and PS (paragraph separator, U+2029).
-
-                 When  the  PCRE  library  is  built,  a  default  line-ending
-                 sequence   is  specified.   This  is  normally  the  standard
-                 sequence for the operating system. Unless otherwise specified
-                 by  this  option,  pcregrep  uses the library's default.  The
-                 possible values for this option are CR, LF, CRLF, ANYCRLF, or
-                 ANY.  This  makes  it  possible to use pcregrep on files that
-                 have come from other environments without  having  to  modify
-                 their  line  endings.  If the data that is being scanned does
-                 not agree with the convention set by  this  option,  pcregrep
-                 may behave in strange ways.
-
-       -n, --line-number
-                 Precede each output line by its line number in the file, fol-
-                 lowed by a colon and a space for matching lines or  a  hyphen
-                 and  a space for context lines. If the filename is also being
-                 output, it precedes the line number.
-
-       -o, --only-matching
-                 Show only the part of the line that  matched  a  pattern.  In
-                 this  mode,  no context is shown. That is, the -A, -B, and -C
-                 options are ignored.
-
-       -q, --quiet
-                 Work quietly, that is, display nothing except error messages.
-                 The  exit  status  indicates  whether or not any matches were
-                 found.
-
-       -r, --recursive
-                 If any given path is a directory, recursively scan the  files
-                 it  contains, taking note of any --include and --exclude set-
-                 tings. By default, a directory is read as a normal  file;  in
-                 some  operating  systems this gives an immediate end-of-file.
-                 This option is a shorthand  for  setting  the  -d  option  to
-                 "recurse".
-
-       -s, --no-messages
-                 Suppress  error  messages  about  non-existent  or unreadable
-                 files. Such files are quietly skipped.  However,  the  return
-                 code is still 2, even if matches were found in other files.
-
-       -u, --utf-8
-                 Operate  in UTF-8 mode. This option is available only if PCRE
-                 has been compiled with UTF-8 support. Both patterns and  sub-
-                 ject lines must be valid strings of UTF-8 characters.
-
-       -V, --version
-                 Write  the  version  numbers of pcregrep and the PCRE library
-                 that is being used to the standard error stream.
-
-       -v, --invert-match
-                 Invert the sense of the match, so that  lines  which  do  not
-                 match any of the patterns are the ones that are found.
-
-       -w, --word-regex, --word-regexp
-                 Force the patterns to match only whole words. This is equiva-
-                 lent to having \b at the start and end of the pattern.
-
-       -x, --line-regex, --line-regexp
-                 Force the patterns to be anchored (each must  start  matching
-                 at  the beginning of a line) and in addition, require them to
-                 match entire lines. This is equivalent  to  having  ^  and  $
-                 characters at the start and end of each alternative branch in
-                 every pattern.
-
-
-ENVIRONMENT VARIABLES
-
-       The environment variables LC_ALL and LC_CTYPE  are  examined,  in  that
-       order,  for  a  locale.  The first one that is set is used. This can be
-       overridden by the --locale option.  If  no  locale  is  set,  the  PCRE
-       library's default (usually the "C" locale) is used.
-
 
-NEWLINES
+       -V        Write the version number of the PCRE library  being  used  to
+                 the standard error stream.
 
-       The  -N (--newline) option allows pcregrep to scan files with different
-       newline conventions from the default.  However,  the  setting  of  this
-       option  does not affect the way in which pcregrep writes information to
-       the standard error and output streams. It uses the  string  "\n"  in  C
-       printf()  calls  to  indicate newlines, relying on the C I/O library to
-       convert this to an appropriate sequence if the  output  is  sent  to  a
-       file.
-
-
-OPTIONS COMPATIBILITY
+       -c        Do  not print individual lines; instead just print a count of
+                 the number of lines that would otherwise have  been  printed.
+                 If  several  files  are given, a count is printed for each of
+                 them.
+
+       -ffilename
+                 Read a number of patterns from the file, one  per  line,  and
+                 match  all of them against each line of input. A line is out-
+                 put if any of the patterns match it.  When  -f  is  used,  no
+                 pattern  is  taken  from  the command line; all arguments are
+                 treated as file names. There is a maximum  of  100  patterns.
+                 Trailing white space is removed, and blank lines are ignored.
+                 An empty file contains  no  patterns  and  therefore  matches
+                 nothing.
 
-       The majority of short and long forms of pcregrep's options are the same
-       as in the GNU grep program. Any long option of  the  form  --xxx-regexp
-       (GNU  terminology) is also available as --xxx-regex (PCRE terminology).
-       However, the --locale, -M, --multiline, -u,  and  --utf-8  options  are
-       specific to pcregrep.
+       -h        Suppress printing of filenames when searching multiple files.
 
+       -i        Ignore upper/lower case distinctions during comparisons.
 
-OPTIONS WITH DATA
+       -l        Instead of printing lines from  the  files,  just  print  the
+                 names  of  the  files  containing  lines that would have been
+                 printed. Each file name is printed once, on a separate  line.
 
-       There are four different ways in which an option with data can be spec-
-       ified.  If a short form option is used, the  data  may  follow  immedi-
-       ately, or in the next command line item. For example:
+       -n        Precede each line by its line number in the file.
 
-         -f/some/file
-         -f /some/file
+       -r        If  any  file  is  a directory, recursively scan the files it
+                 contains. Without -r a directory is scanned as a normal file.
 
-       If  a long form option is used, the data may appear in the same command
-       line item, separated by an equals character, or (with one exception) it
-       may appear in the next command line item. For example:
+       -s        Work  silently,  that  is,  display nothing except error mes-
+                 sages.  The exit status indicates whether  any  matches  were
+                 found.
 
-         --file=/some/file
-         --file /some/file
+       -u        Operate  in UTF-8 mode. This option is available only if PCRE
+                 has been compiled with UTF-8 support. Both  the  pattern  and
+                 each  subject line must be valid strings of UTF-8 characters.
+
+       -v        Invert the sense of the match, so that  lines  which  do  not
+                 match the pattern are now the ones that are found.
+
+       -x        Force  the  pattern to be anchored (it must start matching at
+                 the beginning of the line) and in  addition,  require  it  to
+                 match  the  entire line. This is equivalent to having ^ and $
+                 characters at the start and end of each alternative branch in
+                 the regular expression.
 
-       Note,  however, that if you want to supply a file name beginning with ~
-       as data in a shell command, and have the  shell  expand  ~  to  a  home
-       directory, you must separate the file name from the option, because the
-       shell does not treat ~ specially unless it is at the start of an  item.
 
-       The  exception  to  the  above is the --colour (or --color) option, for
-       which the data is optional. If this option does have data, it  must  be
-       given  in  the first form, using an equals character. Otherwise it will
-       be assumed that it has no data.
+LONG OPTIONS
 
+       Long  forms  of all the options are available, as in GNU grep. They are
+       shown in the following table:
 
-MATCHING ERRORS
+         -c   --count
+         -h   --no-filename
+         -i   --ignore-case
+         -l   --files-with-matches
+         -n   --line-number
+         -r   --recursive
+         -s   --no-messages
+         -u   --utf-8
+         -V   --version
+         -v   --invert-match
+         -x   --line-regex
+         -x   --line-regexp
 
-       It is possible to supply a regular expression that takes  a  very  long
-       time  to  fail  to  match certain lines. Such patterns normally involve
-       nested indefinite repeats, for example: (a+)*\d when matched against  a
-       line  of  a's  with  no  final  digit. The PCRE matching function has a
-       resource limit that causes it to abort in these circumstances. If  this
-       happens, pcregrep outputs an error message and the line that caused the
-       problem to the standard error stream. If there are more  than  20  such
-       errors, pcregrep gives up.
+       In addition, --file=filename is equivalent to  -ffilename,  and  --help
+       shows the list of options and then exits.
 
 
 DIAGNOSTICS
 
        Exit status is 0 if any matches were found, 1 if no matches were found,
-       and 2 for syntax errors and non-existent or inaccessible files (even if
-       matches  were  found in other files) or too many matching errors. Using
-       the -s option to suppress error messages about inaccessible files  does
-       not affect the return code.
-
-
-SEE ALSO
-
-       pcrepattern(3), pcretest(1).
+       and 2 for syntax errors or inaccessible files  (even  if  matches  were
+       found).
 
 
 AUTHOR
 
-       Philip Hazel
+       Philip Hazel <ph10@cam.ac.uk>
        University Computing Service
-       Cambridge CB2 3QH, England.
-
-
-REVISION
+       Cambridge CB2 3QG, England.
 
-       Last updated: 16 April 2007
-       Copyright (c) 1997-2007 University of Cambridge.
+Last updated: 09 September 2004
+Copyright (c) 1997-2004 University of Cambridge.

Modified: httpd/httpd/vendor/pcre/current/doc/pcrepartial.3
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/doc/pcrepartial.3?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/doc/pcrepartial.3 (original)
+++ httpd/httpd/vendor/pcre/current/doc/pcrepartial.3 Mon Nov 26 09:04:19 2007
@@ -1,14 +1,14 @@
-.TH PCREPARTIAL 3
+.TH PCRE 3
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .SH "PARTIAL MATCHING IN PCRE"
 .rs
 .sp
 In normal use of PCRE, if the subject string that is passed to
-\fBpcre_exec()\fP or \fBpcre_dfa_exec()\fP matches as far as it goes, but is
-too short to match the entire pattern, PCRE_ERROR_NOMATCH is returned. There
-are circumstances where it might be helpful to distinguish this case from other
-cases in which there is no match.
+\fBpcre_exec()\fP matches as far as it goes, but is too short to match the
+entire pattern, PCRE_ERROR_NOMATCH is returned. There are circumstances where
+it might be helpful to distinguish this case from other cases in which there is
+no match.
 .P
 Consider, for example, an application where a human is required to type in data
 for a field with specific formatting requirements. An example might be a date
@@ -24,19 +24,10 @@
 entered.
 .P
 PCRE supports the concept of partial matching by means of the PCRE_PARTIAL
-option, which can be set when calling \fBpcre_exec()\fP or
-\fBpcre_dfa_exec()\fP. When this flag is set for \fBpcre_exec()\fP, the return
-code PCRE_ERROR_NOMATCH is converted into PCRE_ERROR_PARTIAL if at any time
-during the matching process the last part of the subject string matched part of
-the pattern. Unfortunately, for non-anchored matching, it is not possible to
-obtain the position of the start of the partial match. No captured data is set
-when PCRE_ERROR_PARTIAL is returned.
-.P
-When PCRE_PARTIAL is set for \fBpcre_dfa_exec()\fP, the return code
-PCRE_ERROR_NOMATCH is converted into PCRE_ERROR_PARTIAL if the end of the
-subject is reached, there have been no complete matches, but there is still at
-least one matching possibility. The portion of the string that provided the
-partial match is set as the first matching string.
+option, which can be set when calling \fBpcre_exec()\fP. When this is done, the
+return code PCRE_ERROR_NOMATCH is converted into PCRE_ERROR_PARTIAL if at any
+time during the matching process the entire subject string matched part of the
+pattern. No captured data is set when this occurs.
 .P
 Using PCRE_PARTIAL disables one of PCRE's optimizations. PCRE remembers the
 last literal byte in a pattern, and abandons matching immediately if such a
@@ -47,10 +38,9 @@
 .SH "RESTRICTED PATTERNS FOR PCRE_PARTIAL"
 .rs
 .sp
-Because of the way certain internal optimizations are implemented in the
-\fBpcre_exec()\fP function, the PCRE_PARTIAL option cannot be used with all
-patterns. These restrictions do not apply when \fBpcre_dfa_exec()\fP is used.
-For \fBpcre_exec()\fP, repeated single characters such as
+Because of the way certain internal optimizations are implemented in PCRE, the
+PCRE_PARTIAL option cannot be used with all patterns. Repeated single
+characters such as
 .sp
   a{2,4}
 .sp
@@ -71,8 +61,6 @@
 .P
 If PCRE_PARTIAL is set for a pattern that does not conform to the restrictions,
 \fBpcre_exec()\fP returns the error code PCRE_ERROR_BADPARTIAL (-13).
-You can use the PCRE_INFO_OKPARTIAL call to \fBpcre_fullinfo()\fP to find out
-if a compiled pattern can be used for partial matching.
 .
 .
 .SH "EXAMPLE OF PARTIAL MATCHING USING PCRETEST"
@@ -83,137 +71,25 @@
 uses the date example quoted above:
 .sp
     re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
-  data> 25jun04\eP
+  data> 25jun04\P
    0: 25jun04
    1: jun
-  data> 25dec3\eP
+  data> 25dec3\P
   Partial match
-  data> 3ju\eP
+  data> 3ju\P
   Partial match
-  data> 3juj\eP
+  data> 3juj\P
   No match
-  data> j\eP
+  data> j\P
   No match
 .sp
 The first data string is matched completely, so \fBpcretest\fP shows the
 matched substrings. The remaining four strings do not match the complete
-pattern, but the first two are partial matches. The same test, using
-\fBpcre_dfa_exec()\fP matching (by means of the \eD escape sequence), produces
-the following output:
-.sp
-    re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
-  data> 25jun04\eP\eD
-   0: 25jun04
-  data> 23dec3\eP\eD
-  Partial match: 23dec3
-  data> 3ju\eP\eD
-  Partial match: 3ju
-  data> 3juj\eP\eD
-  No match
-  data> j\eP\eD
-  No match
-.sp
-Notice that in this case the portion of the string that was matched is made
-available.
+pattern, but the first two are partial matches.
 .
 .
-.SH "MULTI-SEGMENT MATCHING WITH pcre_dfa_exec()"
-.rs
-.sp
-When a partial match has been found using \fBpcre_dfa_exec()\fP, it is possible
-to continue the match by providing additional subject data and calling
-\fBpcre_dfa_exec()\fP again with the same compiled regular expression, this
-time setting the PCRE_DFA_RESTART option. You must also pass the same working
-space as before, because this is where details of the previous partial match
-are stored. Here is an example using \fBpcretest\fP, using the \eR escape
-sequence to set the PCRE_DFA_RESTART option (\eP and \eD are as above):
-.sp
-    re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
-  data> 23ja\eP\eD
-  Partial match: 23ja
-  data> n05\eR\eD
-   0: n05
-.sp
-The first call has "23ja" as the subject, and requests partial matching; the
-second call has "n05" as the subject for the continued (restarted) match.
-Notice that when the match is complete, only the last part is shown; PCRE does
-not retain the previously partially-matched string. It is up to the calling
-program to do that if it needs to.
-.P
-You can set PCRE_PARTIAL with PCRE_DFA_RESTART to continue partial matching
-over multiple segments. This facility can be used to pass very long subject
-strings to \fBpcre_dfa_exec()\fP. However, some care is needed for certain
-types of pattern.
-.P
-1. If the pattern contains tests for the beginning or end of a line, you need
-to pass the PCRE_NOTBOL or PCRE_NOTEOL options, as appropriate, when the
-subject string for any call does not contain the beginning or end of a line.
-.P
-2. If the pattern contains backward assertions (including \eb or \eB), you need
-to arrange for some overlap in the subject strings to allow for this. For
-example, you could pass the subject in chunks that are 500 bytes long, but in
-a buffer of 700 bytes, with the starting offset set to 200 and the previous 200
-bytes at the start of the buffer.
 .P
-3. Matching a subject string that is split into multiple segments does not
-always produce exactly the same result as matching over one single long string.
-The difference arises when there are multiple matching possibilities, because a
-partial match result is given only when there are no completed matches in a
-call to \fBpcre_dfa_exec()\fP. This means that as soon as the shortest match has
-been found, continuation to a new subject segment is no longer possible.
-Consider this \fBpcretest\fP example:
-.sp
-    re> /dog(sbody)?/
-  data> do\eP\eD
-  Partial match: do
-  data> gsb\eR\eP\eD
-   0: g
-  data> dogsbody\eD
-   0: dogsbody
-   1: dog
-.sp
-The pattern matches the words "dog" or "dogsbody". When the subject is
-presented in several parts ("do" and "gsb" being the first two) the match stops
-when "dog" has been found, and it is not possible to continue. On the other
-hand, if "dogsbody" is presented as a single string, both matches are found.
-.P
-Because of this phenomenon, it does not usually make sense to end a pattern
-that is going to be matched in this way with a variable repeat.
-.P
-4. Patterns that contain alternatives at the top level which do not all
-start with the same pattern item may not work as expected. For example,
-consider this pattern:
-.sp
-  1234|3789
-.sp
-If the first part of the subject is "ABC123", a partial match of the first
-alternative is found at offset 3. There is no partial match for the second
-alternative, because such a match does not start at the same point in the
-subject string. Attempting to continue with the string "789" does not yield a
-match because only those alternatives that match at one point in the subject
-are remembered. The problem arises because the start of the second alternative
-matches within the first alternative. There is no problem with anchored
-patterns or patterns such as:
-.sp
-  1234|ABCD
-.sp
-where no string can be a partial match for both alternatives.
-.
-.
-.SH AUTHOR
-.rs
-.sp
-.nf
-Philip Hazel
-University Computing Service
-Cambridge CB2 3QH, England.
-.fi
-.
-.
-.SH REVISION
-.rs
-.sp
-.nf
-Last updated: 04 June 2007
-Copyright (c) 1997-2007 University of Cambridge.
-.fi
+.in 0
+Last updated: 08 September 2004
+.br
+Copyright (c) 1997-2004 University of Cambridge.


