From: "ravi"
To: "'ORO Users List'"
Subject: RE: Performance
Date: Fri, 9 May 2003 03:07:35 -0500

I'm attaching the code that I used. I don't know why it's taking so much time. There must be something wrong with my regular expressions or with my code.
Can somebody please look at it and let me know what's wrong? You can try any piece of text as input. I would really appreciate it. Thanks in advance.

    import java.util.StringTokenizer;
    import org.apache.oro.text.regex.*;

    // Class name is a placeholder; the posted code did not include the class declaration.
    public class TokenizeText {

        private static Perl5Compiler compiler;
        private static PatternMatcher matcher;
        private static Perl5Substitution substitution;

        // compile() throws the checked MalformedPatternException, so it is declared here.
        public static void main(String args[]) throws MalformedPatternException {
            compiler = new Perl5Compiler();
            matcher = new Perl5Matcher();
            substitution = new Perl5Substitution();

            String text = args[0];

            // Replace tabs with spaces.
            Pattern pattern = getPattern("\t");
            text = replaceText(pattern," ",text);

            // Put spaces around punctuation characters.
            pattern = getPattern("[\\[\\]\\{\\}\\^\\~?!()\";/\\|,<>`]");
            text = replaceText(pattern," $& ",text);

            pattern = getPattern("^('|&)");
            text = replaceText(pattern,"$& ",text);

            pattern = getPattern("([^A-Za-z0-9])('|&|@|%|\\*)");
            text = replaceText(pattern,"$& ",text);

            // Split off trailing symbols and clitics ('s, 'll, n't, ...).
            pattern = getPattern("('|:|-|#|\\*|\\+|\\$|&|@|'S|'D|'M|'LL|'RE|'VE|N'T|'s|'d|'m|'ll|'re|'ve|n't)$");
            text = replaceText(pattern," $&",text);

            pattern = getPattern("('|:|-|#|\\*|\\+|\\$|&|@|'S|'D|'M|'LL|'RE|'VE|N'T|'s|'d|'m|'ll|'re|'ve|n't)([^A-Za-z0-9])");
            text = replaceText(pattern," $&",text);

            StringTokenizer strTok = new StringTokenizer(text);
            while(strTok.hasMoreTokens())
            {
                String token = strTok.nextToken();
                token = token.trim();

                pattern = getPattern("([A-Za-z0-9][.])$");
                if(contains(pattern,token))
                {
                    pattern = getPattern("^([A-Za-z]\\.([A-Za-z]\\.)+|[A-Za-z]\\.|[A-Z][bcdfghj-np-tvxz]+\\.)$");
                    if(contains(pattern,token))
                    {
                        // code to process token which does not use any regex stuff
                    }
                }
                else
                {
                    pattern = getPattern("^([A-Za-z0-9])");
                    if(contains(pattern,token))
                    {
                        pattern = getPattern("([A-Za-z0-9]+\\.[A-Za-z]+|[0-9]+\\.[A-Za-z])");
                        if(contains(pattern,token))
                        {
                            // code to process token which does not use any regex stuff
                        }
                        else
                        {
                            if(contains(getPattern("^([A-Za-z])"),token))
                            {
                                // code to process token which does not use any regex stuff
                            }
                        }
                    }
                    else
                    {
                        pattern = getPattern("([.!?])$");
                        if(contains(pattern,token))
                        {
                            // code to process token which does not use any regex stuff
                        }
                    }
                }
            }
        }

        public static boolean contains(Pattern pattern,String str)
        {
            return matcher.contains(str,pattern);
        }

        public static String replaceText(Pattern pattern,String replacement,String str)
        {
            substitution.setSubstitution(replacement);
            return Util.substitute(matcher,pattern,substitution,str,Util.SUBSTITUTE_ALL);
        }

        public static Pattern getPattern(String pattern) throws MalformedPatternException
        {
            return compiler.compile(pattern);
        }
    }

Thanks,
Ravi.
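
One thing that may be worth checking, offered as a guess rather than a diagnosis: the loop in the code above calls getPattern (and therefore compiler.compile) several times for every token, always with the same pattern strings. Below is a minimal sketch of compiling each pattern once, outside the loop. To stay self-contained it uses java.util.regex from the JDK as a stand-in for ORO, so it is not the ORO API; the class name PrecompiledPatterns, the field names, and the method countBareTerminators are all hypothetical and only illustrate the idea.

```java
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical example class; java.util.regex stands in for ORO here.
public class PrecompiledPatterns {

    // Compiled once when the class is loaded, not once per token.
    private static final Pattern ENDS_WITH_ALNUM_DOT =
            Pattern.compile("[A-Za-z0-9][.]$");
    private static final Pattern ENDS_WITH_TERMINATOR =
            Pattern.compile("[.!?]$");

    // Counts tokens that end in '.', '!' or '?' without an
    // alphanumeric character directly before a final dot.
    public static int countBareTerminators(String text) {
        int count = 0;
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            String token = tokens.nextToken();
            // find() searches anywhere in the token, similar in spirit
            // to the contains() helper in the posted code.
            if (ENDS_WITH_ALNUM_DOT.matcher(token).find()) {
                continue;
            }
            if (ENDS_WITH_TERMINATOR.matcher(token).find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countBareTerminators("Hello world . Is it fast ? Yes."));
    }
}
```

With ORO, the same idea would presumably mean keeping each Pattern returned by compiler.compile in a static field and passing that field to contains, rather than calling getPattern inside the loop.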