jakarta-regexp-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 34548] New: - RE constructor is not thread safe
Date Thu, 21 Apr 2005 05:53:57 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=34548>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=34548

           Summary: RE constructor is not thread safe
           Product: Regexp
           Version: unspecified
          Platform: Other
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Other
        AssignedTo: regexp-dev@jakarta.apache.org
        ReportedBy: james.cherryh@defence.gov.au


We discovered that the RE constructor "new RE(pattern)" is not thread safe.

I've attached a class which creates 1600 RE instances, each in its own thread
started after a random delay, and which illustrates the problem. The sample
class includes a synchronisation wrapper which, if invoked, synchronises the RE
constructor calls and makes the problem go away.

We may submit a patch for this; please contact me before starting any work to
fix this bug.

To invoke the test class use the following command line.

Usage : 
RegexSynchronizer -nosynch -nowrapper <delay>

Example showing errors :
RegexSynchronizer -nosynch -nowrapper 10

Example showing fewer errors because of fewer collisions :
RegexSynchronizer -nosynch -nowrapper 30000

Example showing no errors because we wrap the RE usage :
RegexSynchronizer -nosynch -wrapper 10

Options :
-nosynch = runs tests without internal synchronisation
-synch = runs tests with internal synchronisation
-nowrapper = runs tests without internal synch wrapper
-wrapper = runs tests with internal wrapper
<delay> = internal random delay for each test; lower means more collisions


Code follows :
===============
package utility;

import org.apache.regexp.*;
import java.util.*;

public class RegexSynchronizer {

    private static final String NO_SYNCH = "-nosynch";
    private static final String SYNCH = "-synch";
    private static final String NO_WRAPPER = "-nowrapper";
    private static final String WRAPPER = "-wrapper";


    public static synchronized RE getRegex(String thePattern)
            throws RESyntaxException {

        RE r = new RE(thePattern);

        return(r);
    }




    /**
     * Run some regexs in parallel to prove that the regex RE class is not
     * thread safe.
     *
     * @param args command-line options: -synch or -nosynch, -wrapper or
     *             -nowrapper, and an optional delay
     */
    public static void main(String args[]) {

        boolean showUsage = false;

        if (args.length == 0)
            showUsage = true;

        boolean synch = false;
        boolean wrap = false;
        int delay = 10;

        for (int i = 0; i < args.length; i++) {

            String arg = args[i];

            if (arg.equals(NO_SYNCH))
                synch = false;
            else if (arg.equals(SYNCH))
                synch = true;
            else if (arg.equals(NO_WRAPPER))
                wrap = false;
            else if (arg.equals(WRAPPER))
                wrap = true;
            else {
                try {
                    int d = Integer.parseInt(arg);
                    delay = d;
                }
                catch (NumberFormatException nfe) {
                    showUsage = true;
                }
            }
        }


        if (showUsage) {
            System.out.println("Usage : \n" +
                               "RegexSynchronizer " + NO_SYNCH + " " + NO_WRAPPER + " <delay>\n" +
                               "\n" +
                               "Example showing errors :\n" +
                               "RegexSynchronizer " + NO_SYNCH + " " + NO_WRAPPER + " 10\n" +
                               "\n" +
                               "Example showing fewer errors because of fewer collisions :\n" +
                               "RegexSynchronizer " + NO_SYNCH + " " + NO_WRAPPER + " 30000\n" +
                               "\n" +
                               "Example showing no errors because we wrap the RE usage :\n" +
                               "RegexSynchronizer " + NO_SYNCH + " " + WRAPPER + " 10\n" +
                               "\n" +
                               "Options :\n" +
                               NO_SYNCH + " = runs tests without internal synchronisation\n" +
                               SYNCH + " = runs tests with internal synchronisation\n" +
                               NO_WRAPPER + " = runs tests without internal synch wrapper\n" +
                               WRAPPER + " = runs tests with internal wrapper\n" +
                               "<delay> = internal random delay for each test; lower means more collisions"
                               );
            System.exit(-1);
        }

        System.out.println("Synch=" + synch + ", wrap=" + wrap + ", delay=" + delay);

        // To demonstrate the threading bug use the following settings :
        //   doSynchronised = false
        //   useOurSynchWrapper = false
        //   sleepDelay = 10



        // True serialises the test threads on a shared lock inside the test
        // code; false lets them run unsynchronised
        final boolean doSynchronised = synch;

        // True constructs each RE through our synchronized wrapper; false
        // calls the RE constructor directly
        final boolean useOurSynchWrapper = wrap;


        // Upper bound on the random time each test thread sleeps before
        // starting. Increasing this reduces the likelihood of a threading
        // problem and hence reduces the number of errors we find.
        final int sleepDelay = delay;



        // Some sample regexs to try compiling
        String[] regexs = {"[A-Z]{3}:", "<TAG[^>]*>(.*?)</TAG>", "^[ \t]+",
                          "\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b",
                          "(A:\\d{1,3}\\.){3}\\d{1,3}\\b",
                          "(m/[0-9]{2}[\\/|-][0-9]{2}[\\/|-][0-9]{4}/)",
                          "<a +href=\"http://([\\w\\.-]+)",
                          "a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(<[b^>]*>)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c",
                          "a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(<[b^>]*>)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c",
                          "a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(<[b^>]*>)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c",
                          "a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(<[b^>]*>)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c",
                          "a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(a(<[b^>]*>)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c)c",
                          "(A{1,5}){4}B{1,5}",
                          "(A{1,5}){4}B{1,5}",
                          "(A{1,5}){4}B{1,5}",
                          "(A{1,5}){4}B{1,5}",
                          "(A{1,5}){4}B{1,5}",
                          "(A{1,5}){4}B{1,5}"
        };


        // The number of test loops we do
        int numberOfTestLoops = 100;


        // ArrayList of test threads waiting to start
        ArrayList threads = new ArrayList();


        // Random instance for thread sleep time
        final Random rand = new Random();

        // Say we have started
        System.out.println("Start");

        // Loop for the tests
        for (int i = 0; i < numberOfTestLoops; i++) {

            // For each test we compile each of our regexs
            for (int j = 0; j < regexs.length; j++) {

                final String pattern = regexs[j];

                // Create a new thread to compile this regex
                Thread t = new Thread() {

                    // Test method to compile the RE
                    private void getRe() {
                        int sleep = rand.nextInt(sleepDelay);

                        try {
                            // Randomise the start time for this thread so we
                            // get a different mix of tests each time through
                            Thread.sleep(sleep);

                            // Either use our wrapper or use the raw RE compiler
                            if (useOurSynchWrapper) {
                                RE r = RegexSynchronizer.getRegex(pattern);
                            } else {
                                RE r = new RE(pattern);
                            }

                        }
                        catch (Throwable e) {
                            // We should ideally never get here...
                            System.out.println(e.toString() + ", " + pattern);
                        }

                    }

                    // Run method for the thread
                    public void run() {

                        // If we're doing a synchronised test then synch here
                        if (doSynchronised) {
                            synchronized (Thread.class) {
                                getRe();
                            }
                        } else {
                            // Otherwise do the unsynch test
                            getRe();
                        }

                    }

                };

                // Store the thread ready to start them all together
                threads.add(t);
            }
        }


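        // Start all the threads
        for (Iterator i = threads.iterator(); i.hasNext(); ) {
            Thread t = (Thread)i.next();
            t.start();
        }


        // Wait for every thread to finish
        for (Iterator i = threads.iterator(); i.hasNext(); ) {
            Thread t = (Thread)i.next();
            try {
                t.join();
            }
            catch (InterruptedException ex) {
                // Restore the interrupt flag and stop waiting
                Thread.currentThread().interrupt();
                break;
            }
        }

        // Say we're done
        System.out.println("\nEnd");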

    }
}
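
The test above relies on random sleeps to make the threads collide and only
prints each failure. A possible variation, sketched below, releases all
threads from a single latch so the calls overlap as much as possible, and
counts the failures instead of printing them. The class name
ConstructionHarness and the method countFailures are hypothetical names for
this sketch and are not part of Jakarta Regexp. So that the sketch runs
without the regexp jar, main uses java.util.regex.Pattern.compile as a
stand-in task, which is not expected to fail; to exercise this bug one would
pass a task that calls new RE(pattern) instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jakarta Regexp.
class ConstructionHarness {

    /**
     * Runs the task once in each of threadCount threads, releasing them
     * together, and returns how many of the calls threw.
     */
    public static int countFailures(int threadCount, Callable<?> task) {
        CountDownLatch startGate = new CountDownLatch(1);
        AtomicInteger failures = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    startGate.await();   // block until the gate is opened
                    task.call();
                } catch (Throwable e) {
                    failures.incrementAndGet();
                }
            });
            threads.add(t);
            t.start();
        }

        startGate.countDown();           // release every waiting thread
        for (Thread t : threads) {
            try {
                t.join();                // wait for each thread to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return failures.get();
    }

    public static void main(String[] args) {
        // Stand-in task (assumption): replace with a call to new RE(pattern)
        // when Jakarta Regexp is on the classpath.
        int failures = countFailures(50, () -> Pattern.compile("(A{1,5}){4}B{1,5}"));
        System.out.println("failures=" + failures);
    }
}
```

Because the harness joins every thread before returning, the failure count is
complete when it is read, without guessing how long to sleep.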
