Message-ID: <1772628802.1252312437461.JavaMail.jira@brutus>
Date: Mon, 7 Sep 2009 01:33:57 -0700 (PDT)
From: "Anders Melchiorsen (JIRA)"
To: solr-dev@lucene.apache.org
Subject: [jira] Commented: (SOLR-1404) Random failures with highlighting
In-Reply-To: <699206647.1251906452797.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/SOLR-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752044#action_12752044 ]

Anders Melchiorsen commented on SOLR-1404:
------------------------------------------

Hi Igor, thanks for the patch. It does seem to work for me. I will leave it for others to decide whether it is the best fix.

If the issue is not fixed at a lower layer, note that the HTMLStripStandardTokenizerFactory seems to have a similar problem.

I reported that this problem exists with other tokenizers as well, including the HTMLStripCharFilterFactory+WhitespaceTokenizerFactory combo that you recommend. Today, however, I cannot reproduce that behaviour. As I have been reporting several issues, I find it likely that I have been confused by having multiple configurations running at the same time.

> Random failures with highlighting
> ---------------------------------
>
>                 Key: SOLR-1404
>                 URL: https://issues.apache.org/jira/browse/SOLR-1404
>             Project: Solr
>          Issue Type: Bug
>          Components: Analysis, highlighter
>    Affects Versions: 1.4
>            Reporter: Anders Melchiorsen
>             Fix For: 1.4
>
>         Attachments: SOLR-1404.patch
>
>
> With a recent Solr nightly, we started getting errors when highlighting.
> I have not been able to reduce our real setup to a minimal one that is failing, but the same error seems to pop up with the configuration below. Note that the QUERY will mostly fail, but it will work sometimes. Notably, after running "java -jar start.jar", the QUERY will work the first time, but then start failing for a while. Seems that something is not being reset properly.
> The example uses the deprecated HTMLStripWhitespaceTokenizerFactory but the problem apparently also exists with other tokenizers; I was just unable to create a minimal example with other configurations.
> SCHEMA
>
> id
>
> INDEX
> URL=http://localhost:8983/solr/update
> curl $URL --data-binary '1test' -H 'Content-type:text/xml; charset=utf-8'
> curl $URL --data-binary '' -H 'Content-type:text/xml; charset=utf-8'
> QUERY
> curl 'http://localhost:8983/solr/select/?hl.fl=test&hl=true&q=id:1'
> ERROR
> org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
> org.apache.solr.common.SolrException: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
> 	at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:328)
> 	at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
> 	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
> 	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> 	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
> 	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
> 	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
> 	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> 	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> 	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
> 	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
> 	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
> 	at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> 	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
> 	at org.mortbay.jetty.Server.handle(Server.java:285)
> 	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
> 	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
> 	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
> 	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
> 	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
> 	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
> 	at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
> 	at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:254)
> 	at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:321)
> 	... 23 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
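
For readers following the thread, the reported symptom (the first query works, later ones fail with "exceeds length of provided text") can be modelled with a small sketch. This is not Solr or Lucene code and not a diagnosis of SOLR-1404: `StatefulWhitespaceTokenizer`, `check_offsets` and `InvalidTokenOffsetsError` are hypothetical names invented for illustration, and the assumption is simply that a reused tokenizer carries an offset over from its previous input, as the reporter's "something is not being reset properly" suggests.

```python
class InvalidTokenOffsetsError(Exception):
    """Stand-in for Lucene's InvalidTokenOffsetsException (illustrative only)."""


class StatefulWhitespaceTokenizer:
    """Hypothetical tokenizer that keeps a running character offset.

    Modelled bug: if reset() is skipped when the instance is reused,
    offsets continue from where the previous text ended.
    """

    def __init__(self):
        self._base = 0

    def reset(self):
        # Forget the offset accumulated from earlier inputs.
        self._base = 0

    def tokenize(self, text):
        tokens = []
        pos = 0
        for word in text.split():
            start = text.index(word, pos)
            end = start + len(word)
            tokens.append((word, self._base + start, self._base + end))
            pos = end
        # State that leaks into the next call unless reset() is invoked.
        self._base += len(text)
        return tokens


def check_offsets(tokens, text):
    """Reject any token whose offsets point past the end of the text."""
    for term, start, end in tokens:
        if start > len(text) or end > len(text):
            raise InvalidTokenOffsetsError(
                f"Token {term} exceeds length of provided text sized {len(text)}"
            )


tokenizer = StatefulWhitespaceTokenizer()
check_offsets(tokenizer.tokenize("test"), "test")  # first use: offsets (0, 4), passes

try:
    # Reused without reset(): offsets become (4, 8) for a 4-character text.
    check_offsets(tokenizer.tokenize("test"), "test")
except InvalidTokenOffsetsError as exc:
    print(exc)  # Token test exceeds length of provided text sized 4

tokenizer.reset()
check_offsets(tokenizer.tokenize("test"), "test")  # passes again after reset()
```

In this model the failure depends only on whether stale state survives between uses, which would be consistent with a query that succeeds right after startup and fails afterwards; whether that is what actually happens inside the HTML-stripping tokenizers is for the patch review to establish.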