Delivered-To: mailing list java-user@lucene.apache.org
From: "Adam Rauch"
Subject: Assert / NPE using MultiFieldQueryParser
Date: Sun, 24 Mar 2013 07:57:03 -0700
Message-ID: <02cd01ce289f$d8ac6190$8a0524b0$@labkey.com>

I'm using MultiFieldQueryParser to parse search queries. I find that certain query strings (e.g., "/study/" without the quotes) cause MultiFieldQueryParser.parse() to throw an AssertionError if asserts are enabled. In production, where asserts are off, parse() returns a Query, but it seems to be corrupt: using it to search my index results in an NPE.

This seems related to regular expressions. That query string is probably invalid regex syntax, but shouldn't MultiFieldQueryParser throw a ParseException in this case?

Here's a simple example that reproduces the assertion:

    // Turn on asserts
    ClassLoader loader = ClassLoader.getSystemClassLoader();
    loader.setDefaultAssertionStatus(true);

    try
    {
        Analyzer analyzer = new WhitespaceAnalyzer(Version.LUCENE_41);
        QueryParser parser = new MultiFieldQueryParser(Version.LUCENE_41, new String[]{"title", "body"}, analyzer);
        Query query = parser.parse("/study/");
    }
    catch (ParseException e)
    {
        System.out.println("Syntax error, please rephrase your query");
    }

This produces:

    Exception in thread "main" java.lang.AssertionError
        at org.apache.lucene.search.MultiTermQuery.<init>(MultiTermQuery.java:252)
        at org.apache.lucene.search.AutomatonQuery.<init>(AutomatonQuery.java:65)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:90)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:79)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:69)
        at org.apache.lucene.queryparser.classic.QueryParserBase.newRegexpQuery(QueryParserBase.java:790)
        at org.apache.lucene.queryparser.classic.QueryParserBase.getRegexpQuery(QueryParserBase.java:1005)
        at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:1075)
        at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:359)
        at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:258)
        at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:182)
        at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:171)
        at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:120)
        at QueryParserException.main(QueryParserException.java:21)

Turn off the asserts and parse() returns "successfully".
But subsequent use of that Query instance results in NPEs such as:

    java.lang.NullPointerException
        at java.util.TreeMap.getEntry(TreeMap.java:342)
        at java.util.TreeMap.get(TreeMap.java:273)
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.terms(PerFieldPostingsFormat.java:215)
        at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:58)
        at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
        at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
        at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:286)
        at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:429)
        at org.apache.lucene.search.FilteredQuery.rewrite(FilteredQuery.java:334)
        at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:616)
        at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:663)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
        at org.labkey.search.model.LuceneSearchServiceImpl.search(LuceneSearchServiceImpl.java:1160)

This is appearing on production deployments with search queries that are reasonable from a user's perspective (e.g., "http://labkey.org/study/xml" without the quotes). I'd like to either turn off regex parsing altogether or detect the syntax error at parse time so I can provide my standard syntax guidance back to the user.

Thanks,
Adam
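
One possible way to "turn off regex parsing altogether" for user-entered text is to backslash-escape every forward slash before handing the string to parse(), since the classic query parser treats an unescaped pair of "/" characters as regex delimiters. The sketch below is only that, a sketch: SlashEscaper and escapeForwardSlashes are hypothetical names, not part of Lucene, and the helper assumes backslash is the parser's escape character. It touches only "/" and leaves all other query syntax (such as ":" in a URL) alone.

```java
// Hypothetical helper, not part of Lucene: neutralizes regex delimiters
// in user input before the string reaches the classic query parser.
final class SlashEscaper {
    private SlashEscaper() {}

    /** Returns the query with every unescaped '/' rewritten as "\/". */
    static String escapeForwardSlashes(String query) {
        StringBuilder out = new StringBuilder(query.length() + 8);
        // True when the previous character was a backslash that itself
        // was not escaped, i.e. the current character is already escaped.
        boolean escaped = false;
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '/' && !escaped) {
                out.append('\\'); // add the missing escape
            }
            out.append(c);
            escaped = (c == '\\') && !escaped;
        }
        return out.toString();
    }
}
```

With this in place the call in the example above would become parser.parse(SlashEscaper.escapeForwardSlashes(userInput)). Slashes the user has already escaped are left unchanged, so the helper can be applied more than once without doubling the backslashes.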