Return-Path: Delivered-To: apmail-httpd-cvs-archive@www.apache.org Received: (qmail 39025 invoked from network); 26 Nov 2007 17:05:58 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 26 Nov 2007 17:05:58 -0000 Received: (qmail 3572 invoked by uid 500); 26 Nov 2007 17:05:44 -0000 Delivered-To: apmail-httpd-cvs-archive@httpd.apache.org Received: (qmail 3500 invoked by uid 500); 26 Nov 2007 17:05:44 -0000 Mailing-List: contact cvs-help@httpd.apache.org; run by ezmlm Precedence: bulk Reply-To: dev@httpd.apache.org list-help: list-unsubscribe: List-Post: List-Id: Delivered-To: mailing list cvs@httpd.apache.org Received: (qmail 3286 invoked by uid 99); 26 Nov 2007 17:05:43 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 26 Nov 2007 09:05:43 -0800 X-ASF-Spam-Status: No, hits=-100.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.3] (HELO eris.apache.org) (140.211.11.3) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 26 Nov 2007 17:05:19 +0000 Received: by eris.apache.org (Postfix, from userid 65534) id 6F24B1A984A; Mon, 26 Nov 2007 09:05:22 -0800 (PST) Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: svn commit: r598343 [3/22] - in /httpd/httpd/vendor/pcre/current: ./ doc/ doc/html/ testdata/ Date: Mon, 26 Nov 2007 17:04:37 -0000 To: cvs@httpd.apache.org From: pgollucci@apache.org X-Mailer: svnmailer-1.0.8 Message-Id: <20071126170522.6F24B1A984A@eris.apache.org> X-Virus-Checked: Checked by ClamAV on apache.org Modified: httpd/httpd/vendor/pcre/current/NEWS URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/NEWS?rev=598343&r1=598342&r2=598343&view=diff ============================================================================== --- httpd/httpd/vendor/pcre/current/NEWS (original) +++ httpd/httpd/vendor/pcre/current/NEWS Mon Nov 26 09:04:19 2007 @@ -1,197 +1,6 @@ News about PCRE releases ------------------------ - -Release 7.4 21-Sep-07 ---------------------- - -The only change of specification is the addition of options to control whether -\R matches any Unicode line ending (the default) or just CR, LF, and CRLF. -Otherwise, the changes are bug fixes and a refactoring to reduce the number of -relocations needed in a shared library. There have also been some documentation -updates, in particular, some more information about using CMake to build PCRE -has been added to the NON-UNIX-USE file. - - -Release 7.3 28-Aug-07 ---------------------- - -Most changes are bug fixes. Some that are not: - -1. There is some support for Perl 5.10's experimental "backtracking control - verbs" such as (*PRUNE). - -2. UTF-8 checking is now as per RFC 3629 instead of RFC 2279; this is more - restrictive in the strings it accepts. - -3. Checking for potential integer overflow has been made more dynamic, and as a - consequence there is no longer a hard limit on the size of a subpattern that - has a limited repeat count. - -4. When CRLF is a valid line-ending sequence, pcre_exec() and pcre_dfa_exec() - no longer advance by two characters instead of one when an unanchored match - fails at CRLF if there are explicit CR or LF matches within the pattern. - This gets rid of some anomalous effects that previously occurred. - -5. Some PCRE-specific settings for varying the newline options at the start of - a pattern have been added. - - -Release 7.2 19-Jun-07 ---------------------- - -WARNING: saved patterns that were compiled by earlier versions of PCRE must be -recompiled for use with 7.2 (necessitated by the addition of \K, \h, \H, \v, -and \V). - -Correction to the notes for 7.1: the note about shared libraries for Windows is -wrong. Previously, three libraries were built, but each could function -independently. For example, the pcreposix library also included all the -functions from the basic pcre library. The change is that the three libraries -are no longer independent. They are like the Unix libraries. To use the -pcreposix functions, for example, you need to link with both the pcreposix and -the basic pcre library. - -Some more features from Perl 5.10 have been added: - - (?-n) and (?+n) relative references for recursion and subroutines. - - (?(-n) and (?(+n) relative references as conditions. - - \k{name} and \g{name} are synonyms for \k. - - \K to reset the start of the matched string; for example, (foo)\Kbar - matches bar preceded by foo, but only sets bar as the matched string. - - (?| introduces a group where the capturing parentheses in each alternative - start from the same number; for example, (?|(abc)|(xyz)) sets capturing - parentheses number 1 in both cases. - - \h, \H, \v, \V match horizontal and vertical whitespace, respectively. - - -Release 7.1 24-Apr-07 ---------------------- - -There is only one new feature in this release: a linebreak setting of -PCRE_NEWLINE_ANYCRLF. It is a cut-down version of PCRE_NEWLINE_ANY, which -recognizes only CRLF, CR, and LF as linebreaks. - -A few bugs are fixed (see ChangeLog for details), but the major change is a -complete re-implementation of the build system. This now has full Autotools -support and so is now "standard" in some sense. It should help with compiling -PCRE in a wide variety of environments. - -NOTE: when building shared libraries for Windows, three dlls are now built, -called libpcre, libpcreposix, and libpcrecpp. Previously, everything was -included in a single dll. - -Another important change is that the dftables auxiliary program is no longer -compiled and run at "make" time by default. Instead, a default set of character -tables (assuming ASCII coding) is used. If you want to use dftables to generate -the character tables as previously, add --enable-rebuild-chartables to the -"configure" command. You must do this if you are compiling PCRE to run on a -system that uses EBCDIC code. - -There is a discussion about character tables in the README file. The default is -not to use dftables so that that there is no problem when cross-compiling. - - -Release 7.0 19-Dec-06 ---------------------- - -This release has a new major number because there have been some internal -upheavals to facilitate the addition of new optimizations and other facilities, -and to make subsequent maintenance and extension easier. Compilation is likely -to be a bit slower, but there should be no major effect on runtime performance. -Previously compiled patterns are NOT upwards compatible with this release. If -you have saved compiled patterns from a previous release, you will have to -re-compile them. Important changes that are visible to users are: - -1. The Unicode property tables have been updated to Unicode 5.0.0, which adds - some more scripts. - -2. The option PCRE_NEWLINE_ANY causes PCRE to recognize any Unicode newline - sequence as a newline. - -3. The \R escape matches a single Unicode newline sequence as a single unit. - -4. New features that will appear in Perl 5.10 are now in PCRE. These include - alternative Perl syntax for named parentheses, and Perl syntax for - recursion. - -5. The C++ wrapper interface has been extended by the addition of a - QuoteMeta function and the ability to allow copy construction and - assignment. - -For a complete list of changes, see the ChangeLog file. - - -Release 6.7 04-Jul-06 ---------------------- - -The main additions to this release are the ability to use the same name for -multiple sets of parentheses, and support for CRLF line endings in both the -library and pcregrep (and in pcretest for testing). - -Thanks to Ian Taylor, the stack usage for many kinds of pattern has been -significantly reduced for certain subject strings. - - -Release 6.5 01-Feb-06 ---------------------- - -Important changes in this release: - -1. A number of new features have been added to pcregrep. - -2. The Unicode property tables have been updated to Unicode 4.1.0, and the - supported properties have been extended with script names such as "Arabic", - and the derived properties "Any" and "L&". This has necessitated a change to - the interal format of compiled patterns. Any saved compiled patterns that - use \p or \P must be recompiled. - -3. The specification of recursion in patterns has been changed so that all - recursive subpatterns are automatically treated as atomic groups. Thus, for - example, (?R) is treated as if it were (?>(?R)). This is necessary because - otherwise there are situations where recursion does not work. - -See the ChangeLog for a complete list of changes, which include a number of bug -fixes and tidies. - - -Release 6.0 07-Jun-05 ---------------------- - -The release number has been increased to 6.0 because of the addition of several -major new pieces of functionality. - -A new function, pcre_dfa_exec(), which implements pattern matching using a DFA -algorithm, has been added. This has a number of advantages for certain cases, -though it does run more slowly, and lacks the ability to capture substrings. On -the other hand, it does find all matches, not just the first, and it works -better for partial matching. The pcrematching man page discusses the -differences. - -The pcretest program has been enhanced so that it can make use of the new -pcre_dfa_exec() matching function and the extra features it provides. - -The distribution now includes a C++ wrapper library. This is built -automatically if a C++ compiler is found. The pcrecpp man page discusses this -interface. - -The code itself has been re-organized into many more files, one for each -function, so it no longer requires everything to be linked in when static -linkage is used. As a consequence, some internal functions have had to have -their names exposed. These functions all have names starting with _pcre_. They -are undocumented, and are not intended for use by outside callers. - -The pcregrep program has been enhanced with new functionality such as -multiline-matching and options for output more matching context. See the -ChangeLog for a complete list of changes to the library and the utility -programs. - - Release 5.0 13-Sep-04 --------------------- Modified: httpd/httpd/vendor/pcre/current/NON-UNIX-USE URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/NON-UNIX-USE?rev=598343&r1=598342&r2=598343&view=diff ============================================================================== --- httpd/httpd/vendor/pcre/current/NON-UNIX-USE (original) +++ httpd/httpd/vendor/pcre/current/NON-UNIX-USE Mon Nov 26 09:04:19 2007 @@ -1,330 +1,187 @@ Compiling PCRE on non-Unix systems ---------------------------------- -This document contains the following sections: - - General - Generic instructions for the PCRE C library - The C++ wrapper functions - Building for virtual Pascal - Stack size in Windows environments - Comments about Win32 builds - Building PCRE with CMake - Building under Windows with BCC5.5 - Building PCRE on OpenVMS - - -GENERAL - -I (Philip Hazel) have no experience of Windows or VMS sytems and how their -libraries work. The items in the PCRE distribution and Makefile that relate to -anything other than Unix-like systems are untested by me. - -There are some other comments and files in the Contrib directory on the ftp -site that you may find useful. See +See below for comments on Cygwin or MinGW and OpenVMS usage. I (Philip Hazel) +have no knowledge of Windows or VMS sytems and how their libraries work. The +items in the PCRE Makefile that relate to anything other than Unix-like systems +have been contributed by PCRE users. There are some other comments and files in +the Contrib directory on the ftp site that you may find useful. See ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Contrib -If you want to compile PCRE for a non-Unix system (especially for a system that -does not support "configure" and "make" files), note that the basic PCRE -library consists entirely of code written in Standard C, and so should compile -successfully on any system that has a Standard C compiler and library. The C++ -wrapper functions are a separate issue (see below). - -The PCRE distribution includes support for CMake. This support is relatively -new, but has already been used successfully to build PCRE in multiple build -environments on Windows. There are some instructions in the section entitled -"Building PCRE with CMake" below. - - -GENERIC INSTRUCTIONS FOR THE PCRE C LIBRARY - -The following are generic comments about building the PCRE C library "by hand". - - (1) Copy or rename the file config.h.generic as config.h, and edit the macro - settings that it contains to whatever is appropriate for your environment. - In particular, if you want to force a specific value for newline, you can - define the NEWLINE macro. When you compile any of the PCRE modules, you - must specify -DHAVE_CONFIG_H to your compiler so that config.h is included - in the sources. - - An alternative approach is not to edit config.h, but to use -D on the - compiler command line to make any changes that you need to the - configuration options. In this case -DHAVE_CONFIG_H must not be set. - - NOTE: There have been occasions when the way in which certain parameters - in config.h are used has changed between releases. (In the configure/make - world, this is handled automatically.) When upgrading to a new release, - you are strongly advised to review config.h.generic before re-using what - you had previously. - - (2) Copy or rename the file pcre.h.generic as pcre.h. - - (3) EITHER: - Copy or rename file pcre_chartables.c.dist as pcre_chartables.c. - - OR: - Compile dftables.c as a stand-alone program (using -DHAVE_CONFIG_H if - you have set up config.h), and then run it with the single argument - "pcre_chartables.c". This generates a set of standard character tables - and writes them to that file. The tables are generated using the default - C locale for your system. If you want to use a locale that is specified - by LC_xxx environment variables, add the -L option to the dftables - command. You must use this method if you are building on a system that - uses EBCDIC code. - - The tables in pcre_chartables.c are defaults. The caller of PCRE can - specify alternative tables at run time. - - (4) Ensure that you have the following header files: - - pcre_internal.h - ucp.h - ucpinternal.h - ucptable.h - - (5) Also ensure that you have the following file, which is #included as source - when building a debugging version of PCRE and is also used by pcretest. - - pcre_printint.src - - (6) Compile the following source files, setting -DHAVE_CONFIG_H as a compiler - option if you have set up config.h with your configuration, or else use - other -D settings to change the configuration as required. - - pcre_chartables.c - pcre_compile.c - pcre_config.c - pcre_dfa_exec.c - pcre_exec.c - pcre_fullinfo.c - pcre_get.c - pcre_globals.c - pcre_info.c - pcre_maketables.c - pcre_newline.c - pcre_ord2utf8.c - pcre_refcount.c - pcre_study.c - pcre_tables.c - pcre_try_flipped.c - pcre_ucp_searchfuncs.c - pcre_valid_utf8.c - pcre_version.c - pcre_xclass.c - - Make sure that you include -I. in the compiler command (or equivalent for - an unusual compiler) so that all included PCRE header files are first - sought in the current directory. Otherwise you run the risk of picking up - a previously-installed file from somewhere else. - - (7) Now link all the compiled code into an object library in whichever form - your system keeps such libraries. This is the basic PCRE C library. If - your system has static and shared libraries, you may have to do this once - for each type. - - (8) Similarly, compile pcreposix.c (remembering -DHAVE_CONFIG_H if necessary) - and link the result (on its own) as the pcreposix library. - - (9) Compile the test program pcretest.c (again, don't forget -DHAVE_CONFIG_H). - This needs the functions in the pcre and pcreposix libraries when linking. - It also needs the pcre_printint.src source file, which it #includes. - -(10) Run pcretest on the testinput files in the testdata directory, and check - that the output matches the corresponding testoutput files. Note that the - supplied files are in Unix format, with just LF characters as line - terminators. You may need to edit them to change this if your system uses - a different convention. If you are using Windows, you probably should use - the wintestinput3 file instead of testinput3 (and the corresponding output - file). This is a locale test; wintestinput3 sets the locale to "french" - rather than "fr_FR", and there some minor output differences. - -(11) If you want to use the pcregrep command, compile and link pcregrep.c; it - uses only the basic PCRE library (it does not need the pcreposix library). - - -THE C++ WRAPPER FUNCTIONS - -The PCRE distribution also contains some C++ wrapper functions and tests, -contributed by Google Inc. On a system that can use "configure" and "make", -the functions are automatically built into a library called pcrecpp. It should -be straightforward to compile the .cc files manually on other systems. The -files called xxx_unittest.cc are test programs for each of the corresponding -xxx.cc files. - - -BUILDING FOR VIRTUAL PASCAL +If you want to compile PCRE for a non-Unix system (or perhaps, more strictly, +for a system that does not support "configure" and "make" files), note that +PCRE consists entirely of code written in Standard C, and so should compile +successfully on any system that has a Standard C compiler and library. + + +GENERIC INSTRUCTIONS + +The following are generic comments about building PCRE. The interspersed +indented commands are suggestions from Mark Tetrode as to which commands you +might use on a Windows system to build a static library. + +(1) Copy or rename the file config.in as config.h, and change the macros that +define HAVE_STRERROR and HAVE_MEMMOVE to define them as 1 rather than 0. +Unfortunately, because of the way Unix autoconf works, the default setting has +to be 0. You may also want to make changes to other macros in config.h. In +particular, if you want to force a specific value for newline, you can define +the NEWLINE macro. The default is to use '\n', thereby using whatever value +your compiler gives to '\n'. + + rem Mark Tetrode's commands + copy config.in config.h + rem Use write, because notepad cannot handle UNIX files. Change values. + write config.h + +(2) Copy or rename the file pcre.in as pcre.h, and change the macro definitions +for PCRE_MAJOR, PCRE_MINOR, and PCRE_DATE near its start to the values set in +configure.in. + + rem Mark Tetrode's commands + copy pcre.in pcre.h + rem Read values from configure.in + write configure.in + rem Change values + write pcre.h + +(3) Compile dftables.c as a stand-alone program, and then run it with +the single argument "chartables.c". This generates a set of standard +character tables and writes them to that file. + + rem Mark Tetrode's commands + rem Compile & run + cl -DSUPPORT_UTF8 dftables.c + dftables.exe > chartables.c + +(4) Compile maketables.c, get.c, study.c and pcre.c and link them all +together into an object library in whichever form your system keeps such +libraries. This is the pcre library (chartables.c is included by means of an +#include directive). If your system has static and shared libraries, you may +have to do this once for each type. + + rem Mark Tetrode's commands, for a static library + rem Compile & lib + cl -DSUPPORT_UTF8 -DPOSIX_MALLOC_THRESHOLD=10 /c maketables.c get.c study.c pcre.c + lib /OUT:pcre.lib maketables.obj get.obj study.obj pcre.obj + +(5) Similarly, compile pcreposix.c and link it (on its own) as the pcreposix +library. + + rem Mark Tetrode's commands, for a static library + rem Compile & lib + cl -DSUPPORT_UTF8 -DPOSIX_MALLOC_THRESHOLD=10 /c pcreposix.c + lib /OUT:pcreposix.lib pcreposix.obj + +(6) Compile the test program pcretest.c. This needs the functions in the +pcre and pcreposix libraries when linking. + + rem Mark Tetrode's commands + rem compile & link + cl pcretest.c pcre.lib pcreposix.lib + +(7) Run pcretest on the testinput files in the testdata directory, and check +that the output matches the corresponding testoutput files. You must use the +-i option when checking testinput2. Note that the supplied files are in Unix +format, with just LF characters as line terminators. You may need to edit them +to change this if your system uses a different convention. + + rem Mark Tetrode's commands + rem Make a change, i.e. space, backspace, and save again - do this for all + rem to change UNIX to Win, \n to \n\r + write testoutput1 + write testoutput2 + write testoutput3 + write testoutput4 + write testoutput5 + pcretest testdata\testinput1 testdata\myoutput1 + windiff testdata\testoutput1 testdata\myoutput1 + pcretest -i testdata\testinput2 testdata\myoutput2 + windiff testdata\testoutput2 testdata\myoutput2 + pcretest testdata\testinput3 testdata\myoutput3 + windiff testdata\testoutput3 testdata\myoutput3 + pcretest testdata\testinput4 testdata\myoutput4 + windiff testdata\testoutput4 testdata\myoutput4 + pcretest testdata\testinput5 testdata\myoutput5 + windiff testdata\testoutput5 testdata\myoutput5 + + +FURTHER REMARKS + +If you have a system without "configure" but where you can use a Makefile, edit +Makefile.in to create Makefile, substituting suitable values for the variables +at the head of the file. + +Some help in building a Win32 DLL of PCRE in GnuWin32 environments was +contributed by Paul Sokolovsky. These environments are Mingw32 +(http://www.xraylith.wisc.edu/~khan/software/gnu-win32/) and CygWin +(http://sourceware.cygnus.com/cygwin/). Paul comments: + + For CygWin, set CFLAGS=-mno-cygwin, and do 'make dll'. You'll get + pcre.dll (containing pcreposix also), libpcre.dll.a, and dynamically + linked pgrep and pcretest. If you have /bin/sh, run RunTest (three + main test go ok, locale not supported). + +Changes to do MinGW with autoconf 2.50 were supplied by Fred Cox +, who comments as follows: + + If you are using the PCRE DLL, the normal Unix style configure && make && + make check && make install should just work[*]. If you want to statically + link against the .a file, you must define PCRE_STATIC before including + pcre.h, otherwise the pcre_malloc and pcre_free exported functions will be + declared __declspec(dllimport), with hilarious results. See the configure.in + and pcretest.c for how it is done for the static test. + + Also, there will only be a libpcre.la, not a libpcreposix.la, as you + would expect from the Unix version. The single DLL includes the pcreposix + interface. + +[*] But note that the supplied test files are in Unix format, with just LF +characters as line terminators. You will have to edit them to change to CR LF +terminators. A script for building PCRE using Borland's C++ compiler for use with VPASCAL -was contributed by Alexander Tokarev. Stefan Weber updated the script and added -additional files. The following files in the distribution are for building PCRE -for use with VP/Borland: makevp_c.txt, makevp_l.txt, makevp.bat, pcregexp.pas. - - -STACK SIZE IN WINDOWS ENVIRONMENTS - -The default processor stack size of 1Mb in some Windows environments is too -small for matching patterns that need much recursion. In particular, test 2 may -fail because of this. Normally, running out of stack causes a crash, but there -have been cases where the test program has just died silently. See your linker -documentation for how to increase stack size if you experience problems. The -Linux default of 8Mb is a reasonable choice for the stack, though even that can -be too small for some pattern/subject combinations. - -PCRE has a compile configuration option to disable the use of stack for -recursion so that heap is used instead. However, pattern matching is -significantly slower when this is done. There is more about stack usage in the -"pcrestack" documentation. - - -COMMENTS ABOUT WIN32 BUILDS (see also "BUILDING PCRE WITH CMAKE" below) - -There are two ways of building PCRE using the "configure, make, make install" -paradigm on Windows systems: using MinGW or using Cygwin. These are not at all -the same thing; they are completely different from each other. There is also -some experimental, undocumented support for building using "cmake", which you -might like to try if you are familiar with "cmake". However, at the present -time, the "cmake" process builds only a static library (not a dll), and the -tests are not automatically run. - -The MinGW home page (http://www.mingw.org/) says this: - - MinGW: A collection of freely available and freely distributable Windows - specific header files and import libraries combined with GNU toolsets that - allow one to produce native Windows programs that do not rely on any - 3rd-party C runtime DLLs. - -The Cygwin home page (http://www.cygwin.com/) says this: - - Cygwin is a Linux-like environment for Windows. It consists of two parts: - - . A DLL (cygwin1.dll) which acts as a Linux API emulation layer providing - substantial Linux API functionality - - . A collection of tools which provide Linux look and feel. - - The Cygwin DLL currently works with all recent, commercially released x86 32 - bit and 64 bit versions of Windows, with the exception of Windows CE. - -On both MinGW and Cygwin, PCRE should build correctly using: - - ./configure && make && make install - -This should create two libraries called libpcre and libpcreposix, and, if you -have enabled building the C++ wrapper, a third one called libpcrecpp. These are -independent libraries: when you like with libpcreposix or libpcrecpp you must -also link with libpcre, which contains the basic functions. (Some earlier -releases of PCRE included the basic libpcre functions in libpcreposix. This no -longer happens.) - -If you want to statically link your program against a non-dll .a file, you must -define PCRE_STATIC before including pcre.h, otherwise the pcre_malloc() and -pcre_free() exported functions will be declared __declspec(dllimport), with -unwanted results. - -Using Cygwin's compiler generates libraries and executables that depend on -cygwin1.dll. If a library that is generated this way is distributed, -cygwin1.dll has to be distributed as well. Since cygwin1.dll is under the GPL -licence, this forces not only PCRE to be under the GPL, but also the entire -application. A distributor who wants to keep their own code proprietary must -purchase an appropriate Cygwin licence. - -MinGW has no such restrictions. The MinGW compiler generates a library or -executable that can run standalone on Windows without any third party dll or -licensing issues. - -But there is more complication: +was contributed by Alexander Tokarev. It is called makevp.bat. -If a Cygwin user uses the -mno-cygwin Cygwin gcc flag, what that really does is -to tell Cygwin's gcc to use the MinGW gcc. Cygwin's gcc is only acting as a -front end to MinGW's gcc (if you install Cygwin's gcc, you get both Cygwin's -gcc and MinGW's gcc). So, a user can: - -. Build native binaries by using MinGW or by getting Cygwin and using - -mno-cygwin. - -. Build binaries that depend on cygwin1.dll by using Cygwin with the normal - compiler flags. - -The test files that are supplied with PCRE are in Unix format, with LF -characters as line terminators. It may be necessary to change the line -terminators in order to get some of the tests to work. We hope to improve -things in this area in future. - - -BUILDING PCRE WITH CMAKE - -CMake is an alternative build facility that can be used instead of the -traditional Unix "configure". CMake version 2.4.7 supports Borland makefiles, -MinGW makefiles, MSYS makefiles, NMake makefiles, UNIX makefiles, Visual Studio -6, Visual Studio 7, Visual Studio 8, and Watcom W8. The following instructions -were contributed by a PCRE user. - -1. Download CMake 2.4.7 or above from http://www.cmake.org/, install and ensure - that cmake\bin is on your path. - -2. Unzip (retaining folder structure) the PCRE source tree into a source - directory such as C:\pcre. - -3. Create a new, empty build directory: C:\pcre\build\ - -4. Run CMakeSetup from the Shell envirornment of your build tool, e.g., Msys - for Msys/MinGW or Visual Studio Command Prompt for VC/VC++ - -5. Enter C:\pcre\pcre-xx and C:\pcre\build for the source and build - directories, respectively - -6. Hit the "Configure" button. - -7. Select the particular IDE / build tool that you are using (Visual Studio, - MSYS makefiles, MinGW makefiles, etc.) - -8. The GUI will then list several configuration options. This is where you can - enable UTF-8 support, etc. - -9. Hit "Configure" again. The adjacent "OK" button should now be active. - -10. Hit "OK". - -11. The build directory should now contain a usable build system, be it a - solution file for Visual Studio, makefiles for MinGW, etc. - -Testing with RunTest.bat - -1. Copy RunTest.bat into the directory where pcretest.exe has been created. - -2. Edit RunTest.bat and insert a line that indentifies the relative location of - the pcre source, e.g.: - - set srcdir=..\pcre-7.4-RC3 - -3. Run RunTest.bat from a command shell environment. Test outputs will - automatically be compared to expected results, and discrepancies will - identified in the console output. - -4. To test pcrecpp, run pcrecpp_unittest.exe, pcre_stringpiece_unittest.exe and - pcre_scanner_unittest.exe. +These are some further comments about Win32 builds from Mark Evans. They +were contributed before Fred Cox's changes were made, so it is possible that +they may no longer be relevant. + +"The documentation for Win32 builds is a bit shy. Under MSVC6 I +followed their instructions to the letter, but there were still +some things missing. + +(1) Must #define STATIC for entire project if linking statically. + (I see no reason to use DLLs for code this compact.) This of + course is a project setting in MSVC under Preprocessor. + +(2) Missing some #ifdefs relating to the function pointers + pcre_malloc and pcre_free. See my solution below. (The stubs + may not be mandatory but they made me feel better.)" +========================= +#ifdef _WIN32 +#include -BUILDING UNDER WINDOWS WITH BCC5.5 +void* malloc_stub(size_t N) +{ return malloc(N); } +void free_stub(void* p) +{ free(p); } +void *(*pcre_malloc)(size_t) = &malloc_stub; +void (*pcre_free)(void *) = &free_stub; -Michael Roy sent these comments about building PCRE under Windows with BCC5.5: +#else - Some of the core BCC libraries have a version of PCRE from 1998 built in, - which can lead to pcre_exec() giving an erroneous PCRE_ERROR_NULL from a - version mismatch. I'm including an easy workaround below, if you'd like to - include it in the non-unix instructions: +void *(*pcre_malloc)(size_t) = malloc; +void (*pcre_free)(void *) = free; - When linking a project with BCC5.5, pcre.lib must be included before any of - the libraries cw32.lib, cw32i.lib, cw32mt.lib, and cw32mti.lib on the command - line. +#endif +========================= BUILDING PCRE ON OPENVMS -Dan Mooney sent the following comments about building PCRE on OpenVMS. They -relate to an older version of PCRE that used fewer source files, so the exact -commands will need changing. See the current list of source files above. +Dan Mooney sent the following comments about building PCRE on OpenVMS: "It was quite easy to compile and link the library. I don't have a formal make file but the attached file [reproduced below] contains the OpenVMS DCL @@ -384,5 +241,4 @@ $! ========================= -Last Updated: 21 September 2007 **** Modified: httpd/httpd/vendor/pcre/current/README URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/README?rev=598343&r1=598342&r2=598343&view=diff ============================================================================== --- httpd/httpd/vendor/pcre/current/README (original) +++ httpd/httpd/vendor/pcre/current/README Mon Nov 26 09:04:19 2007 @@ -5,82 +5,43 @@ ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz -There is a mailing list for discussion about the development of PCRE at - - pcre-dev@exim.org - Please read the NEWS file if you are upgrading from a previous release. -The contents of this README file are: - The PCRE APIs - Documentation for PCRE - Contributions by users of PCRE - Building PCRE on non-Unix systems - Building PCRE on Unix-like systems - Retrieving configuration information on Unix-like systems - Shared libraries on Unix-like systems - Cross-compiling on Unix-like systems - Using HP's ANSI C++ compiler (aCC) - Making new tarballs - Testing PCRE - Character tables - File manifest - - -The PCRE APIs -------------- - -PCRE is written in C, and it has its own API. The distribution also includes a -set of C++ wrapper functions (see the pcrecpp man page for details), courtesy -of Google Inc. - -In addition, there is a set of C wrapper functions that are based on the POSIX -regular expression API (see the pcreposix man page). These end up in the -library called libpcreposix. Note that this just provides a POSIX calling -interface to PCRE; the regular expressions themselves still follow Perl syntax -and semantics. The POSIX API is restricted, and does not give full access to -all of PCRE's facilities. - -The header file for the POSIX-style functions is called pcreposix.h. The -official POSIX name is regex.h, but I did not want to risk possible problems -with existing files of that name by distributing it that way. To use PCRE with -an existing program that uses the POSIX API, pcreposix.h will have to be -renamed or pointed at by a link. +PCRE has its own native API, but a set of "wrapper" functions that are based on +the POSIX API are also supplied in the library libpcreposix. Note that this +just provides a POSIX calling interface to PCRE: the regular expressions +themselves still follow Perl syntax and semantics. The header file +for the POSIX-style functions is called pcreposix.h. The official POSIX name is +regex.h, but I didn't want to risk possible problems with existing files of +that name by distributing it that way. To use it with an existing program that +uses the POSIX API, it will have to be renamed or pointed at by a link. If you are using the POSIX interface to PCRE and there is already a POSIX regex -library installed on your system, as well as worrying about the regex.h header -file (as mentioned above), you must also take care when linking programs to +library installed on your system, you must take care when linking programs to ensure that they link with PCRE's libpcreposix library. Otherwise they may pick -up the POSIX functions of the same name from the other library. - -One way of avoiding this confusion is to compile PCRE with the addition of --Dregcomp=PCREregcomp (and similarly for the other POSIX functions) to the -compiler flags (CFLAGS if you are using "configure" -- see below). This has the -effect of renaming the functions so that the names no longer clash. Of course, -you have to do the same thing for your applications, or write them using the -new names. +up the "real" POSIX functions of the same name. Documentation for PCRE ---------------------- -If you install PCRE in the normal way on a Unix-like system, you will end up -with a set of man pages whose names all start with "pcre". The one that is just -called "pcre" lists all the others. In addition to these man pages, the PCRE -documentation is supplied in two other forms: - - 1. There are files called doc/pcre.txt, doc/pcregrep.txt, and - doc/pcretest.txt in the source distribution. The first of these is a - concatenation of the text forms of all the section 3 man pages except - those that summarize individual functions. The other two are the text - forms of the section 1 man pages for the pcregrep and pcretest commands. - These text forms are provided for ease of scanning with text editors or - similar tools. They are installed in /share/doc/pcre, where - is the installation prefix (defaulting to /usr/local). - - 2. A set of files containing all the documentation in HTML form, hyperlinked - in various ways, and rooted in a file called index.html, is distributed in - doc/html and installed in /share/doc/pcre/html. +If you install PCRE in the normal way, you will end up with an installed set of +man pages whose names all start with "pcre". The one that is called "pcre" +lists all the others. In addition to these man pages, the PCRE documentation is +supplied in two other forms; however, as there is no standard place to install +them, they are left in the doc directory of the unpacked source distribution. +These forms are: + + 1. Files called doc/pcre.txt, doc/pcregrep.txt, and doc/pcretest.txt. The + first of these is a concatenation of the text forms of all the section 3 + man pages except those that summarize individual functions. The other two + are the text forms of the section 1 man pages for the pcregrep and + pcretest commands. Text forms are provided for ease of scanning with text + editors or similar tools. + + 2. A subdirectory called doc/html contains all the documentation in HTML + form, hyperlinked in various ways, and rooted in a file called + doc/index.html. Contributions by users of PCRE @@ -90,48 +51,24 @@ ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Contrib -There is a README file giving brief descriptions of what they are. Some are -complete in themselves; others are pointers to URLs containing relevant files. -Some of this material is likely to be well out-of-date. Several of the earlier -contributions provided support for compiling PCRE on various flavours of -Windows (I myself do not use Windows). Nowadays there is more Windows support -in the standard distribution, so these contibutions have been archived. - - -Building PCRE on non-Unix systems ---------------------------------- - -For a non-Unix system, please read the comments in the file NON-UNIX-USE, -though if your system supports the use of "configure" and "make" you may be -able to build PCRE in the same way as for Unix-like systems. PCRE can also be -configured in many platform environments using the GUI facility of CMake's -CMakeSetup. It creates Makefiles, solution files, etc. - -PCRE has been compiled on many different operating systems. It should be -straightforward to build PCRE on any system that has a Standard C compiler and -library, because it uses only Standard C functions. - - -Building PCRE on Unix-like systems ----------------------------------- - -If you are using HP's ANSI C++ compiler (aCC), please see the special note -in the section entitled "Using HP's ANSI C++ compiler (aCC)" below. - -The following instructions assume the use of the widely used "configure, make, -make install" process. There is also some experimental support for "cmake" in -the PCRE distribution, but it is incomplete and not documented. However, if you -are a "cmake" user, you might want to try it. +where there is also a README file giving brief descriptions of what they are. +Several of them provide support for compiling PCRE on various flavours of +Windows systems (I myself do not use Windows). Some are complete in themselves; +others are pointers to URLs containing relevant files. + + +Building PCRE on a Unix-like system +----------------------------------- To build PCRE on a Unix-like system, first run the "configure" command from the PCRE distribution directory, with your current directory set to the directory where you want the files to be created. This command is a standard GNU "autoconf" configuration script, for which generic instructions are supplied in -the file INSTALL. +INSTALL. Most commonly, people build PCRE within its own distribution directory, and in -this case, on many systems, just running "./configure" is sufficient. However, -the usual methods of changing standard defaults are available. For example: +this case, on many systems, just running "./configure" is sufficient, but the +usual methods of changing standard defaults are available. For example: CFLAGS='-O2 -Wall' ./configure --prefix=/opt/local @@ -146,18 +83,9 @@ cd /build/pcre/pcre-xxx /source/pcre/pcre-xxx/configure -PCRE is written in C and is normally compiled as a C library. However, it is -possible to build it as a C++ library, though the provided building apparatus -does not have any features to support this. - There are some optional features that can be included or omitted from the PCRE library. You can read more about them in the pcrebuild man page. -. If you want to suppress the building of the C++ wrapper library, you can add - --disable-cpp to the "configure" command. Otherwise, when "configure" is run, - it will try to find a C++ compiler and C++ header files, and if it succeeds, - it will try to build the C++ wrapper. - . If you want to make use of the support for UTF-8 character strings in PCRE, you must add --enable-utf8 to the "configure" command. Without it, the code for handling UTF-8 is not included in the library. (Even when included, it @@ -166,195 +94,75 @@ . If, in addition to support for UTF-8 character strings, you want to include support for the \P, \p, and \X sequences that recognize Unicode character properties, you must add --enable-unicode-properties to the "configure" - command. This adds about 30K to the size of the library (in the form of a + command. This adds about 90K to the size of the library (in the form of a property table); only the basic two-letter properties such as Lu are supported. -. You can build PCRE to recognize either CR or LF or the sequence CRLF or any - of the preceding, or any of the Unicode newline sequences as indicating the - end of a line. Whatever you specify at build time is the default; the caller - of PCRE can change the selection at run time. The default newline indicator - is a single LF character (the Unix standard). You can specify the default - newline indicator by adding --enable-newline-is-cr or --enable-newline-is-lf - or --enable-newline-is-crlf or --enable-newline-is-anycrlf or - --enable-newline-is-any to the "configure" command, respectively. - - If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of - the standard tests will fail, because the lines in the test files end with - LF. Even if the files are edited to change the line endings, there are likely - to be some failures. With --enable-newline-is-anycrlf or - --enable-newline-is-any, many tests should succeed, but there may be some - failures. - -. By default, the sequence \R in a pattern matches any Unicode line ending - sequence. This is independent of the option specifying what PCRE considers to - be the end of a line (see above). However, the caller of PCRE can restrict \R - to match only CR, LF, or CRLF. You can make this the default by adding - --enable-bsr-anycrlf to the "configure" command (bsr = "backslash R"). +. You can build PCRE to recognized CR or NL as the newline character, instead + of whatever your compiler uses for "\n", by adding --newline-is-cr or + --newline-is-nl to the "configure" command, respectively. Only do this if you + really understand what you are doing. On traditional Unix-like systems, the + newline character is NL. . When called via the POSIX interface, PCRE uses malloc() to get additional storage for processing capturing parentheses if there are more than 10 of - them in a pattern. You can increase this threshold by setting, for example, + them. You can increase this threshold by setting, for example, --with-posix-malloc-threshold=20 on the "configure" command. -. PCRE has a counter that can be set to limit the amount of resources it uses. +. PCRE has a counter which can be set to limit the amount of resources it uses. If the limit is exceeded during a match, the match fails. The default is ten million. You can change the default by setting, for example, --with-match-limit=500000 on the "configure" command. This is just the default; individual calls to - pcre_exec() can supply their own value. There is more discussion on the - pcreapi man page. - -. There is a separate counter that limits the depth of recursive function calls - during a matching process. This also has a default of ten million, which is - essentially "unlimited". You can change the default by setting, for example, - - --with-match-limit-recursion=500000 - - Recursive function calls use up the runtime stack; running out of stack can - cause programs to crash in strange ways. There is a discussion about stack - sizes in the pcrestack man page. + pcre_exec() can supply their own value. There is discussion on the pcreapi + man page. . The default maximum compiled pattern size is around 64K. You can increase this by adding --with-link-size=3 to the "configure" command. You can increase it even more by setting --with-link-size=4, but this is unlikely - ever to be necessary. Increasing the internal link size will reduce - performance. - -. You can build PCRE so that its internal match() function that is called from - pcre_exec() does not call itself recursively. Instead, it uses memory blocks - obtained from the heap via the special functions pcre_stack_malloc() and - pcre_stack_free() to save data that would otherwise be saved on the stack. To - build PCRE like this, use + ever to be necessary. If you build PCRE with an increased link size, test 2 + (and 5 if you are using UTF-8) will fail. Part of the output of these tests + is a representation of the compiled pattern, and this changes with the link + size. + +. You can build PCRE so that its match() function does not call itself + recursively. Instead, it uses blocks of data from the heap via special + functions pcre_stack_malloc() and pcre_stack_free() to save data that would + otherwise be saved on the stack. To build PCRE like this, use --disable-stack-for-recursion on the "configure" command. PCRE runs more slowly in this mode, but it may be - necessary in environments with limited stack sizes. This applies only to the - pcre_exec() function; it does not apply to pcre_dfa_exec(), which does not - use deeply nested recursion. There is a discussion about stack sizes in the - pcrestack man page. - -. For speed, PCRE uses four tables for manipulating and identifying characters - whose code point values are less than 256. By default, it uses a set of - tables for ASCII encoding that is part of the distribution. If you specify - - --enable-rebuild-chartables - - a program called dftables is compiled and run in the default C locale when - you obey "make". It builds a source file called pcre_chartables.c. If you do - not specify this option, pcre_chartables.c is created as a copy of - pcre_chartables.c.dist. See "Character tables" below for further information. + necessary in environments with limited stack sizes. -. It is possible to compile PCRE for use on systems that use EBCDIC as their - default character code (as opposed to ASCII) by specifying +The "configure" script builds seven files: - --enable-ebcdic - - This automatically implies --enable-rebuild-chartables (see above). - -The "configure" script builds the following files for the basic C library: - -. Makefile is the makefile that builds the library -. config.h contains build-time configuration options for the library -. pcre.h is the public PCRE header file -. pcre-config is a script that shows the settings of "configure" options -. libpcre.pc is data for the pkg-config command +. pcre.h is build by copying pcre.in and making substitutions +. Makefile is built by copying Makefile.in and making substitutions. +. config.h is built by copying config.in and making substitutions. +. pcre-config is built by copying pcre-config.in and making substitutions. +. libpcre.pc is data for the pkg-config command, built from libpcre.pc.in . libtool is a script that builds shared and/or static libraries -. RunTest is a script for running tests on the basic C library -. RunGrepTest is a script for running tests on the pcregrep command +. RunTest is a script for running tests -Versions of config.h and pcre.h are distributed in the PCRE tarballs under -the names config.h.generic and pcre.h.generic. These are provided for the -benefit of those who have to built PCRE without the benefit of "configure". If -you use "configure", the .generic versions are not used. - -If a C++ compiler is found, the following files are also built: - -. libpcrecpp.pc is data for the pkg-config command -. pcrecpparg.h is a header file for programs that call PCRE via the C++ wrapper -. pcre_stringpiece.h is the header for the C++ "stringpiece" functions - -The "configure" script also creates config.status, which is an executable -script that can be run to recreate the configuration, and config.log, which -contains compiler output from tests that "configure" runs. - -Once "configure" has run, you can run "make". It builds two libraries, called -libpcre and libpcreposix, a test program called pcretest, a demonstration -program called pcredemo, and the pcregrep command. If a C++ compiler was found -on your system, "make" also builds the C++ wrapper library, which is called -libpcrecpp, and some test programs called pcrecpp_unittest, -pcre_scanner_unittest, and pcre_stringpiece_unittest. Building the C++ wrapper -can be disabled by adding --disable-cpp to the "configure" command. - -The command "make check" runs all the appropriate tests. Details of the PCRE -tests are given below in a separate section of this document. - -You can use "make install" to install PCRE into live directories on your -system. The following are installed (file names are all relative to the - that is set when "configure" is run): - - Commands (bin): - pcretest - pcregrep - pcre-config - - Libraries (lib): - libpcre - libpcreposix - libpcrecpp (if C++ support is enabled) - - Configuration information (lib/pkgconfig): - libpcre.pc - libpcrecpp.pc (if C++ support is enabled) - - Header files (include): - pcre.h - pcreposix.h - pcre_scanner.h ) - pcre_stringpiece.h ) if C++ support is enabled - pcrecpp.h ) - pcrecpparg.h ) - - Man pages (share/man/man{1,3}): - pcregrep.1 - pcretest.1 - pcre.3 - pcre*.3 (lots more pages, all starting "pcre") - - HTML documentation (share/doc/pcre/html): - index.html - *.html (lots more pages, hyperlinked from index.html) - - Text file documentation (share/doc/pcre): - AUTHORS - COPYING - ChangeLog - LICENCE - NEWS - README - pcre.txt (a concatenation of the man(3) pages) - pcretest.txt the pcretest man page - pcregrep.txt the pcregrep man page - -Note that the pcredemo program that is built by "configure" is *not* installed -anywhere. It is a demonstration for programmers wanting to use PCRE. - -If you want to remove PCRE from your system, you can run "make uninstall". -This removes all the files that "make install" installed. However, it does not -remove any directories, because these are often shared with other programs. +Once "configure" has run, you can run "make". It builds two libraries called +libpcre and libpcreposix, a test program called pcretest, and the pcregrep +command. You can use "make install" to copy these, the public header files +pcre.h and pcreposix.h, and the man pages to appropriate live directories on +your system, in the normal way. Retrieving configuration information on Unix-like systems --------------------------------------------------------- -Running "make install" installs the command pcre-config, which can be used to -recall information about the PCRE configuration and installation. For example: +Running "make install" also installs the command pcre-config, which can be used +to recall information about the PCRE configuration and installation. For +example: pcre-config --version @@ -373,15 +181,15 @@ pkg-config --cflags pcre The data is held in *.pc files that are installed in a directory called -/lib/pkgconfig. +pkgconfig. Shared libraries on Unix-like systems ------------------------------------- -The default distribution builds PCRE as shared libraries and static libraries, -as long as the operating system supports shared libraries. Shared library -support relies on the "libtool" script which is built as part of the +The default distribution builds PCRE as two shared libraries and two static +libraries, as long as the operating system supports shared libraries. Shared +library support relies on the "libtool" script which is built as part of the "configure" process. The libtool script is used to compile and link both shared and static @@ -390,7 +198,7 @@ libraries (by means of wrapper scripts in the case of shared libraries). When you use "make install" to install shared libraries, pcregrep and pcretest are automatically re-built to use the newly installed shared libraries before being -installed themselves. However, the versions left in the build directory still +installed themselves. However, the versions left in the source directory still use the uninstalled libraries. To build PCRE using static libraries only you must use --disable-shared when @@ -402,86 +210,56 @@ build only shared libraries. -Cross-compiling on Unix-like systems ------------------------------------- +Cross-compiling on a Unix-like system +------------------------------------- You can specify CC and CFLAGS in the normal way to the "configure" command, in -order to cross-compile PCRE for some other host. However, you should NOT -specify --enable-rebuild-chartables, because if you do, the dftables.c source -file is compiled and run on the local host, in order to generate the inbuilt -character tables (the pcre_chartables.c file). This will probably not work, -because dftables.c needs to be compiled with the local compiler, not the cross -compiler. - -When --enable-rebuild-chartables is not specified, pcre_chartables.c is created -by making a copy of pcre_chartables.c.dist, which is a default set of tables -that assumes ASCII code. Cross-compiling with the default tables should not be -a problem. - -If you need to modify the character tables when cross-compiling, you should -move pcre_chartables.c.dist out of the way, then compile dftables.c by hand and -run it on the local host to make a new version of pcre_chartables.c.dist. -Then when you cross-compile PCRE this new version of the tables will be used. - - -Using HP's ANSI C++ compiler (aCC) ----------------------------------- - -Unless C++ support is disabled by specifying the "--disable-cpp" option of the -"configure" script, you must include the "-AA" option in the CXXFLAGS -environment variable in order for the C++ components to compile correctly. - -Also, note that the aCC compiler on PA-RISC platforms may have a defect whereby -needed libraries fail to get included when specifying the "-AA" compiler -option. If you experience unresolved symbols when linking the C++ programs, -use the workaround of specifying the following environment variable prior to -running the "configure" script: - - CXXLDFLAGS="-lstd_v2 -lCsup_v2" - - -Making new tarballs -------------------- - -The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and -zip formats. The command "make distcheck" does the same, but then does a trial -build of the new distribution to ensure that it works. - -If you have modified any of the man page sources in the doc directory, you -should first run the PrepareRelease script before making a distribution. This -script creates the .txt and HTML forms of the documentation from the man pages. +order to cross-compile PCRE for some other host. However, during the building +process, the dftables.c source file is compiled *and run* on the local host, in +order to generate the default character tables (the chartables.c file). It +therefore needs to be compiled with the local compiler, not the cross compiler. +You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD) +when calling the "configure" command. If they are not specified, they default +to the values of CC and CFLAGS. + + +Building on non-Unix systems +---------------------------- + +For a non-Unix system, read the comments in the file NON-UNIX-USE, though if +the system supports the use of "configure" and "make" you may be able to build +PCRE in the same way as for Unix systems. + +PCRE has been compiled on Windows systems and on Macintoshes, but I don't know +the details because I don't use those systems. It should be straightforward to +build PCRE on any system that has a Standard C compiler, because it uses only +Standard C functions. Testing PCRE ------------ -To test the basic PCRE library on a Unix system, run the RunTest script that is -created by the configuring process. There is also a script called RunGrepTest -that tests the options of the pcregrep command. If the C++ wrapper library is -built, three test programs called pcrecpp_unittest, pcre_scanner_unittest, and -pcre_stringpiece_unittest are also built. - -Both the scripts and all the program tests are run if you obey "make check" or -"make test". For other systems, see the instructions in NON-UNIX-USE. - -The RunTest script runs the pcretest test program (which is documented in its -own man page) on each of the testinput files in the testdata directory in -turn, and compares the output with the contents of the corresponding testoutput -files. A file called testtry is used to hold the main output from pcretest +To test PCRE on a Unix system, run the RunTest script that is created by the +configuring process. (This can also be run by "make runtest", "make check", or +"make test".) For other systems, see the instructions in NON-UNIX-USE. + +The script runs the pcretest test program (which is documented in its own man +page) on each of the testinput files (in the testdata directory) in turn, +and compares the output with the contents of the corresponding testoutput file. +A file called testtry is used to hold the main output from pcretest (testsavedregex is also used as a working file). To run pcretest on just one of the test files, give its number as an argument to RunTest, for example: RunTest 2 -The first test file can also be fed directly into the perltest.pl script to -check that Perl gives the same results. The only difference you should see is -in the first few lines, where the Perl version is given instead of the PCRE -version. +The first file can also be fed directly into the perltest script to check that +Perl gives the same results. The only difference you should see is in the first +few lines, where the Perl version is given instead of the PCRE version. The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(), pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error detection, and run-time flags that are specific to PCRE, as well as the POSIX -wrapper API. It also uses the debugging flags to check some of the internals of +wrapper API. It also uses the debugging flag to check some of the internals of pcre_compile(). If you build PCRE with a locale setting that is not the standard C locale, the @@ -507,12 +285,6 @@ in the comparison output, it means that locale is not available on your system, despite being listed by "locale". This does not mean that PCRE is broken. -[If you are trying to run this test on Windows, you may be able to get it to -work by changing "fr_FR" to "french" everywhere it occurs. Alternatively, use -RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses -Windows versions of test 2. More info on using RunTest.bat is included in the -document entitled NON-UNIX-USE.] - The fourth test checks the UTF-8 support. It is not run automatically unless PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when running "configure". This file can be also fed directly to the perltest script, @@ -522,55 +294,35 @@ The fifth test checks error handling with UTF-8 encoding, and internal UTF-8 features of PCRE that are not relevant to Perl. -The sixth test checks the support for Unicode character properties. It it not -run automatically unless PCRE is built with Unicode property support. To to -this you must set --enable-unicode-properties when running "configure". - -The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative -matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode -property support, respectively. The eighth and ninth tests are not run -automatically unless PCRE is build with the relevant support. +The sixth and final test checks the support for Unicode character properties. +It it not run automatically unless PCRE is built with Unicode property support. +To to this you must set --enable-unicode-properties when running "configure". Character tables ---------------- -For speed, PCRE uses four tables for manipulating and identifying characters -whose code point values are less than 256. The final argument of the -pcre_compile() function is a pointer to a block of memory containing the -concatenated tables. A call to pcre_maketables() can be used to generate a set -of tables in the current locale. If the final argument for pcre_compile() is -passed as NULL, a set of default tables that is built into the binary is used. - -The source file called pcre_chartables.c contains the default set of tables. By -default, this is created as a copy of pcre_chartables.c.dist, which contains -tables for ASCII coding. However, if --enable-rebuild-chartables is specified -for ./configure, a different version of pcre_chartables.c is built by the -program dftables (compiled from dftables.c), which uses the ANSI C character -handling functions such as isalnum(), isalpha(), isupper(), islower(), etc. to -build the table sources. This means that the default C locale which is set for -your system will control the contents of these default tables. You can change -the default tables by editing pcre_chartables.c and then re-building PCRE. If -you do this, you should take care to ensure that the file does not get -automatically re-generated. The best way to do this is to move -pcre_chartables.c.dist out of the way and replace it with your customized -tables. - -When the dftables program is run as a result of --enable-rebuild-chartables, -it uses the default C locale that is set on your system. It does not pay -attention to the LC_xxx environment variables. In other words, it uses the -system's default locale rather than whatever the compiling user happens to have -set. If you really do want to build a source set of character tables in a -locale that is specified by the LC_xxx variables, you can run the dftables -program by hand with the -L option. For example: - - ./dftables -L pcre_chartables.c.special +PCRE uses four tables for manipulating and identifying characters whose values +are less than 256. The final argument of the pcre_compile() function is a +pointer to a block of memory containing the concatenated tables. A call to +pcre_maketables() can be used to generate a set of tables in the current +locale. If the final argument for pcre_compile() is passed as NULL, a set of +default tables that is built into the binary is used. + +The source file called chartables.c contains the default set of tables. This is +not supplied in the distribution, but is built by the program dftables +(compiled from dftables.c), which uses the ANSI C character handling functions +such as isalnum(), isalpha(), isupper(), islower(), etc. to build the table +sources. This means that the default C locale which is set for your system will +control the contents of these default tables. You can change the default tables +by editing chartables.c and then re-building PCRE. If you do this, you should +probably also edit Makefile to ensure that the file doesn't ever get +re-generated. The first two 256-byte tables provide lower casing and case flipping functions, respectively. The next table consists of three 32-byte bit maps which identify digits, "word" characters, and white space, respectively. These are used when -building 32-byte bit maps that represent character classes for code points less -than 256. +building 32-byte bit maps that represent character classes. The final 256-byte table has bits indicating various character types, as follows: @@ -586,143 +338,90 @@ will cause PCRE to malfunction. -File manifest -------------- +Manifest +-------- The distribution should contain the following files: -(A) Source files of the PCRE library functions and their headers: +(A) The actual source files of the PCRE library functions and their + headers: - dftables.c auxiliary program for building pcre_chartables.c - when --enable-rebuild-chartables is specified + dftables.c auxiliary program for building chartables.c - pcre_chartables.c.dist a default set of character tables that assume ASCII - coding; used, unless --enable-rebuild-chartables is - specified, by copying to pcre_chartables.c - - pcreposix.c ) - pcre_compile.c ) - pcre_config.c ) - pcre_dfa_exec.c ) - pcre_exec.c ) - pcre_fullinfo.c ) - pcre_get.c ) sources for the functions in the library, - pcre_globals.c ) and some internal functions that they use - pcre_info.c ) - pcre_maketables.c ) - pcre_newline.c ) - pcre_ord2utf8.c ) - pcre_refcount.c ) - pcre_study.c ) - pcre_tables.c ) - pcre_try_flipped.c ) - pcre_ucp_searchfuncs.c ) - pcre_valid_utf8.c ) - pcre_version.c ) - pcre_xclass.c ) - pcre_printint.src ) debugging function that is #included in pcretest, - ) and can also be #included in pcre_compile() - pcre.h.in template for pcre.h when built by "configure" - pcreposix.h header for the external POSIX wrapper API - pcre_internal.h header for internal use - ucp.h ) headers concerned with - ucpinternal.h ) Unicode property handling - ucptable.h ) (this one is the data table) - - config.h.in template for config.h, which is built by "configure" - - pcrecpp.h public header file for the C++ wrapper - pcrecpparg.h.in template for another C++ header file - pcre_scanner.h public header file for C++ scanner functions - pcrecpp.cc ) - pcre_scanner.cc ) source for the C++ wrapper library - - pcre_stringpiece.h.in template for pcre_stringpiece.h, the header for the - C++ stringpiece functions - pcre_stringpiece.cc source for the C++ stringpiece functions - -(B) Source files for programs that use PCRE: - - pcredemo.c simple demonstration of coding calls to PCRE - pcregrep.c source of a grep utility that uses PCRE - pcretest.c comprehensive test program - -(C) Auxiliary files: - - 132html script to turn "man" pages into HTML - AUTHORS information about the author of PCRE - ChangeLog log of changes to the code - CleanTxt script to clean nroff output for txt man pages - Detrail script to remove trailing spaces - HACKING some notes about the internals of PCRE - INSTALL generic installation instructions - LICENCE conditions for the use of PCRE - COPYING the same, using GNU's standard name - Makefile.in ) template for Unix Makefile, which is built by - ) "configure" - Makefile.am ) the automake input that was used to create - ) Makefile.in - NEWS important changes in this release - NON-UNIX-USE notes on building PCRE on non-Unix systems - PrepareRelease script to make preparations for "make dist" - README this file - RunTest a Unix shell script for running tests - RunGrepTest a Unix shell script for pcregrep tests - aclocal.m4 m4 macros (generated by "aclocal") - config.guess ) files used by libtool, - config.sub ) used only when building a shared library - configure a configuring shell script (built by autoconf) - configure.ac ) the autoconf input that was used to build - ) "configure" and config.h - depcomp ) script to find program dependencies, generated by - ) automake - doc/*.3 man page sources for the PCRE functions - doc/*.1 man page sources for pcregrep and pcretest - doc/index.html.src the base HTML page - doc/html/* HTML documentation - doc/pcre.txt plain text version of the man pages - doc/pcretest.txt plain text documentation of test program - doc/perltest.txt plain text documentation of Perl test program - install-sh a shell script for installing files - libpcre.pc.in template for libpcre.pc for pkg-config - libpcrecpp.pc.in template for libpcrecpp.pc for pkg-config - ltmain.sh file used to build a libtool script - missing ) common stub for a few missing GNU programs while - ) installing, generated by automake - mkinstalldirs script for making install directories - perltest.pl Perl test program - pcre-config.in source of script which retains PCRE information - pcrecpp_unittest.cc ) - pcre_scanner_unittest.cc ) test programs for the C++ wrapper - pcre_stringpiece_unittest.cc ) - testdata/testinput* test data for main library tests - testdata/testoutput* expected test results - testdata/grep* input and output for pcregrep tests + get.c ) + maketables.c ) + study.c ) source of the functions + pcre.c ) in the library + pcreposix.c ) + printint.c ) + + ucp.c ) + ucp.h ) source for the code that is used for + ucpinternal.h ) Unicode property handling + ucptable.c ) + ucptypetable.c ) + + pcre.in "source" for the header for the external API; pcre.h + is built from this by "configure" + pcreposix.h header for the external POSIX wrapper API + internal.h header for internal use + config.in template for config.h, which is built by configure + +(B) Auxiliary files: + + AUTHORS information about the author of PCRE + ChangeLog log of changes to the code + INSTALL generic installation instructions + LICENCE conditions for the use of PCRE + COPYING the same, using GNU's standard name + Makefile.in template for Unix Makefile, which is built by configure + NEWS important changes in this release + NON-UNIX-USE notes on building PCRE on non-Unix systems + README this file + RunTest.in template for a Unix shell script for running tests + config.guess ) files used by libtool, + config.sub ) used only when building a shared library + configure a configuring shell script (built by autoconf) + configure.in the autoconf input used to build configure + doc/Tech.Notes notes on the encoding + doc/*.3 man page sources for the PCRE functions + doc/*.1 man page sources for pcregrep and pcretest + doc/html/* HTML documentation + doc/pcre.txt plain text version of the man pages + doc/pcretest.txt plain text documentation of test program + doc/perltest.txt plain text documentation of Perl test program + install-sh a shell script for installing files + libpcre.pc.in "source" for libpcre.pc for pkg-config + ltmain.sh file used to build a libtool script + mkinstalldirs script for making install directories + pcretest.c comprehensive test program + pcredemo.c simple demonstration of coding calls to PCRE + perltest Perl test program + pcregrep.c source of a grep utility that uses PCRE + pcre-config.in source of script which retains PCRE information + testdata/testinput1 test data, compatible with Perl + testdata/testinput2 test data for error messages and non-Perl things + testdata/testinput3 test data for locale-specific tests + testdata/testinput4 test data for UTF-8 tests compatible with Perl + testdata/testinput5 test data for other UTF-8 tests + testdata/testinput6 test data for Unicode property support tests + testdata/testoutput1 test results corresponding to testinput1 + testdata/testoutput2 test results corresponding to testinput2 + testdata/testoutput3 test results corresponding to testinput3 + testdata/testoutput4 test results corresponding to testinput4 + testdata/testoutput5 test results corresponding to testinput5 + testdata/testoutput6 test results corresponding to testinput6 + +(C) Auxiliary files for Win32 DLL + + dll.mk + libpcre.def + libpcreposix.def + pcre.def -(D) Auxiliary files for cmake support - - CMakeLists.txt - config-cmake.h.in - -(E) Auxiliary files for VPASCAL +(D) Auxiliary file for VPASCAL makevp.bat - makevp_c.txt - makevp_l.txt - pcregexp.pas - -(F) Auxiliary files for building PCRE "by hand" - - pcre.h.generic ) a version of the public PCRE header file - ) for use in non-"configure" environments - config.h.generic ) a version of config.h for use in non-"configure" - ) environments - -(F) Miscellaneous - - RunTest.bat a script for running tests under Windows -Philip Hazel -Email local part: ph10 -Email domain: cam.ac.uk -Last updated: 21 September 2007 +Philip Hazel +September 2004