lucene-dev mailing list archives

From "Markus Jelsma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-1632) Distributed IDF
Date Wed, 20 Feb 2013 12:35:15 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582142#comment-13582142 ]

Markus Jelsma commented on SOLR-1632:
-------------------------------------

It doesn't really seem to work: we're seeing lots of NPEs, and if a response does come through,
IDF is not consistent for all terms. Most requests return one of the NPEs below. Sometimes
a request works, and then the second request just fails.

{code}
java.lang.NullPointerException
	at org.apache.solr.search.stats.ExactStatsCache.sendGlobalStats(LRUStatsCache.java:202)
	at org.apache.solr.handler.component.QueryComponent.createMainQuery(QueryComponent.java:783)
	at org.apache.solr.handler.component.QueryComponent.regularDistributedProcess(QueryComponent.java:618)
	at...
{code}

{code}
java.lang.NullPointerException
	at org.apache.solr.search.stats.LRUStatsCache.sendGlobalStats(LRUStatsCache.java:228)
	at org.apache.solr.handler.component.QueryComponent.createMainQuery(QueryComponent.java:783)
	at org.apache.solr.handler.component.QueryComponent.regularDistributedProcess(QueryComponent.java:618)
	at...
{code}

We also see this one from time to time; it looks like it is thrown if there are `no servers
hosting shard`:
{code}
java.lang.NullPointerException
	at org.apache.solr.search.stats.LRUStatsCache.mergeToGlobalStats(LRUStatsCache.java:112)
	at org.apache.solr.handler.component.QueryComponent.updateStats(QueryComponent.java:743)
	at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:659)
	at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:634)
	at ..
{code}

It also imposes a huge performance penalty with both LRUStatsCache and ExactStatsCache:
if you're used to 40ms response times, you'll see the average jump to 2 seconds with very frequent
5 second spikes. Performance stays poor if logging is disabled.

The logs are also swamped with lines like:
{code}
2013-02-20 11:54:48,091 WARN [search.stats.LRUStatsCache] - [http-8080-exec-5] - : ## Missing global colStats info: <FIELD>, using local
2013-02-20 11:54:48,091 WARN [search.stats.LRUStatsCache] - [http-8080-exec-5] - : ## Missing global termStats info: <FIELD>:<TERM>, using local
{code}

Both StatsCacheImpls behave like this. Each query logs lines like the ones above. Maybe performance
is poor because it tries to look up terms every time, but I'm not sure yet.


Finally, something crazy I'd like to share :)
{code}
-Infinity = (MATCH) sum of:
  -Infinity = (MATCH) max plus 0.35 times others of:
    -Infinity = (MATCH) weight(content_nl:amsterdam^1.6 in 449) [], result of:
      -Infinity = score(doc=449,freq=1.0 = termFreq=1.0
), product of:
        1.6 = boost
        -Infinity = idf(docFreq=29800090, docCount=-1)
        1.0 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.0 = parameter b (norms omitted for field)
{code}
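The -Infinity above looks consistent with the reported statistics being fed straight into the BM25 idf formula. A minimal sketch, assuming idf is computed in the standard BM25 form log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)), which is the form Lucene's BM25Similarity uses as far as I know; the class and method names below are hypothetical and not taken from the patch. With docCount=-1 the fraction is exactly -1, so the argument of the logarithm is 0:

```java
// Hypothetical illustration only; not code from the SOLR-1632 patch.
public class Bm25IdfSketch {

    // Standard BM25 idf form (assumed to match what the similarity computes).
    static double idf(long docFreq, long docCount) {
        return Math.log(1 + (docCount - docFreq + 0.5D) / (docFreq + 0.5D));
    }

    public static void main(String[] args) {
        // Values copied from the explain output above.
        // (-1 - 29800090 + 0.5) / (29800090 + 0.5) == -1, so this is log(0).
        System.out.println(idf(29800090L, -1L)); // prints -Infinity

        // With a docCount larger than docFreq (made-up value) the idf is finite.
        System.out.println(idf(29800090L, 60000000L));
    }
}
```

If that assumption holds, the interesting question is why docCount arrives as -1 while docFreq carries a plausible global value; one guess is that the global collection statistics are missing, which would fit the "Missing global colStats info" warnings.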

If someone happens to recognize the issues above, I'm all ears :)
                
> Distributed IDF
> ---------------
>
>                 Key: SOLR-1632
>                 URL: https://issues.apache.org/jira/browse/SOLR-1632
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.5
>            Reporter: Andrzej Bialecki 
>             Fix For: 5.0
>
>         Attachments: 3x_SOLR-1632_doesntwork.patch, distrib-2.patch, distrib.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch
>
>
> Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

