lucene-dev mailing list archives

From "Erick Erickson (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-1931) Schema Browser does not scale with large indexes
Date Wed, 28 Dec 2011 20:22:32 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176796#comment-13176796 ]

Erick Erickson commented on SOLR-1931:
--------------------------------------

For the trunk (4.x) version, see the suggestion (from Muir) quoted below. I haven't looked
at this yet, but being able to get some approximation back from Luke quickly would be a big
help. Maybe we can make this happen on trunk?

The use-case I'm interested in is the one in which we're really only looking for outrageous
numbers of unique terms. Having unique terms per segment would go a long way towards that
use-case.

*******
Is it really necessary to see the 'top level' number of distinct terms
summed across all segments?
Maybe it's good enough to list the information on a per-segment basis.
Then it would always be instant-fast:

You would just use the FieldsEnum API to list all the fields, and for each
field call .terms() and then Terms.getUniqueTermCount().

Note: getUniqueTermCount won't work (returns -1) for any 3.x segments
that haven't yet been upgraded to the 4.0 format.
The old 3.x format only allows you to get the uniqueTermCount across
all fields in the segment (Fields.getUniqueTermCount), because fields
are not clearly separated.
******** 
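
For the "outrageous numbers" use-case, per-segment counts are enough to bracket the
top-level number: a term can occur in several segments, so the counts cannot simply be
added, but the largest per-segment count is a lower bound and the sum is an upper bound on
the distinct terms for a field. A minimal sketch in plain Java, assuming the per-segment
counts for one field have already been collected (for example via
Terms.getUniqueTermCount() as described above). SegmentTermBounds and bounds are
hypothetical names for illustration, not Lucene APIs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: turns per-segment unique-term
// counts for ONE field into bounds on the index-wide distinct-term count.
final class SegmentTermBounds {

    /**
     * Returns {lowerBound, upperBound}, or null if any segment reports a
     * negative count (e.g. -1 from a segment that cannot report per-field counts).
     */
    static long[] bounds(List<Long> perSegmentCounts) {
        long max = 0;
        long sum = 0;
        for (long count : perSegmentCounts) {
            if (count < 0) {
                return null; // count unknown for this segment, so no bounds
            }
            max = Math.max(max, count); // every term of the largest segment is distinct
            sum += count;               // worst case: no term is shared between segments
        }
        return new long[] {max, sum};
    }

    public static void main(String[] args) {
        long[] b = bounds(Arrays.asList(1_200_000L, 900_000L, 50_000L));
        System.out.println("distinct terms between " + b[0] + " and " + b[1]);
    }
}
```

If even the lower bound is already outrageous, there is no need to compute the exact
top-level count.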
                
> Schema Browser does not scale with large indexes
> ------------------------------------------------
>
>                 Key: SOLR-1931
>                 URL: https://issues.apache.org/jira/browse/SOLR-1931
>             Project: Solr
>          Issue Type: Improvement
>          Components: web gui
>    Affects Versions: 1.4
>            Reporter: Lance Norskog
>            Priority: Minor
>
> The Schema Browser JSP by default causes the Luke handler to "scan the world". In large
> indexes this makes the UI useless.
> On an index with 64M documents and 8 GB of disk space, the Schema Browser took 6 minutes
> to open and hogged all disk I/O, making Solr useless.


