lucene-dev mailing list archives

From "Robert Muir (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-1931) Schema Browser does not scale with large indexes
Date Tue, 03 Jan 2012 02:40:21 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178602#comment-13178602 ]

Robert Muir commented on SOLR-1931:
-----------------------------------

Why is it still 39 seconds? Shouldn't tools like this just use statistics, and not enumerate
terms or anything else by default, so that they return instantly?

It's 4.0, so why not just break backwards compatibility and make it fast?

Instead of enumerating terms, you could display all of the Terms-level statistics
per segment per field:
* uniqueTermCount (# of terms)
* sumDocFreq (# of postings/term-doc mappings)
* sumTotalTermFreq (# of positions/tokens)
* docCount (# of documents with at least one posting for the field)

This would all be basically instantaneous and would give a more thorough picture of the performance
characteristics of the index (e.g. how many positions it holds).
You could also compute derived stats, such as average field length.
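
As a rough illustration of the derived stats mentioned above, here is a minimal sketch of the arithmetic. It is not Lucene or Solr code: the `FieldStats` class and the function names are hypothetical, and the only inputs assumed are the four per-segment, per-field statistics listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldStats:
    """Hypothetical holder for the four statistics of one field in one segment."""
    unique_term_count: int    # number of distinct terms
    sum_doc_freq: int         # number of postings (term-doc mappings)
    sum_total_term_freq: int  # number of positions (tokens)
    doc_count: int            # documents with at least one posting for the field


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 for an empty field instead of dividing by zero.
    return numerator / denominator if denominator else 0.0


def avg_field_length(s: FieldStats) -> float:
    """Average number of tokens per document that has the field."""
    return _ratio(s.sum_total_term_freq, s.doc_count)


def avg_unique_terms_per_doc(s: FieldStats) -> float:
    """Average number of distinct terms per document that has the field."""
    return _ratio(s.sum_doc_freq, s.doc_count)


def avg_docs_per_term(s: FieldStats) -> float:
    """Average postings-list length, i.e. documents per term."""
    return _ratio(s.sum_doc_freq, s.unique_term_count)


def avg_term_freq_in_doc(s: FieldStats) -> float:
    """Average number of occurrences of a term within a document that contains it."""
    return _ratio(s.sum_total_term_freq, s.sum_doc_freq)


# Made-up numbers, purely for illustration.
example = FieldStats(unique_term_count=1000, sum_doc_freq=5000,
                     sum_total_term_freq=20000, doc_count=400)
print(avg_field_length(example))          # 50.0
print(avg_unique_terms_per_doc(example))  # 12.5
print(avg_docs_per_term(example))         # 5.0
print(avg_term_freq_in_doc(example))      # 4.0
```

Each of these is a single division over numbers that are already stored, so none of them requires enumerating terms.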
                
> Schema Browser does not scale with large indexes
> ------------------------------------------------
>
>                 Key: SOLR-1931
>                 URL: https://issues.apache.org/jira/browse/SOLR-1931
>             Project: Solr
>          Issue Type: Improvement
>          Components: web gui
>    Affects Versions: 3.6, 4.0
>            Reporter: Lance Norskog
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-1931-3x.patch, SOLR-1931-3x.patch, SOLR-1931-trunk.patch, SOLR-1931-trunk.patch
>
>
> The Schema Browser JSP by default causes the Luke handler to "scan the world". In large
> indexes this makes the UI useless.
> On an index with 64M documents and 8 GB of disk space, the Schema Browser took 6 minutes
> to open and hogged all disk I/O, making Solr useless.
