Message-ID: <63E3C2F6A684D311B8200090277BFA8E78A9AE@TETLEY>
From: Jordan Naftolin
To: "'lucene-user@jakarta.apache.org'"
Subject: TokenMgnError
Date: Mon, 19 Nov 2001 14:17:23 -0500
List-Id: "Lucene Users List"

Hi,

I am receiving a TokenMgrError when a query contains certain characters, such as commas, '[', and '<'. I read a message in the archive from someone with a similar problem: apparently certain characters have special meanings in the query syntax, and the TokenMgrError is thrown when they are used incorrectly. I am currently taking the query string directly from an input field on a web site, so I can't ensure that users will write the query correctly. Since I think it is common for web users to enter a comma in a search query, I am wondering how other people are handling this problem.
Has anyone written a tokenizer that can safely read any query from a web user without throwing the error? Or is what I am experiencing potentially a bug?

In case it is helpful, the stack trace generated when a comma is entered in the search query is:

org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 2. Encountered: after : ""
    at org.apache.lucene.queryParser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:523)
    at org.apache.lucene.queryParser.QueryParser.jj_ntk(QueryParser.java:583)
    at org.apache.lucene.queryParser.QueryParser.Modifiers(QueryParser.java:216)
    at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:251)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:72)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:49)

I tried tracking the error down in QueryParserTokenManager to determine the proper usage, but I had trouble understanding exactly what the class does, since it contains a lot of hard-coded hex values and oddly named methods.

Any suggestions are greatly appreciated.

Thanks,
Jordan
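
One possible workaround for the situation described above, sketched under assumptions: clean the raw web input before it reaches the query parser, so that characters the lexer may reject never arrive. The class name QuerySanitizer and the method sanitize are hypothetical and are not part of Lucene. The whitelist (letters, digits, whitespace) is a deliberately conservative assumption rather than the parser's actual grammar, and it turns off all query syntax for the user.

```java
// Hypothetical helper, not part of Lucene: strips everything except
// letters, digits, and whitespace from user-supplied search text.
final class QuerySanitizer {

    private QuerySanitizer() {
        // static utility, not meant to be instantiated
    }

    /**
     * Replaces every character that is not a letter, digit, or whitespace
     * with a space, then trims and collapses runs of whitespace.
     * Returns an empty string for null input.
     */
    static String sanitize(String raw) {
        if (raw == null) {
            return "";
        }
        StringBuilder cleaned = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            // Assumption: only these character classes are safe to pass through.
            boolean keep = Character.isLetterOrDigit(c) || Character.isWhitespace(c);
            cleaned.append(keep ? c : ' ');
        }
        return cleaned.toString().trim().replaceAll("\\s+", " ");
    }
}
```

The sanitized string would then be handed to the parser in place of the raw input. Two caveats: the result can be empty (for example when the user types only a comma), so the caller would probably want to skip the search in that case, and words the parser treats as operators would still pass through unchanged. Separately, in JavaCC-generated parsers TokenMgrError extends java.lang.Error rather than Exception, so a catch (Exception e) block around the parse call will not intercept it; catching the error type explicitly is another option if query syntax should remain available to users.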