lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-3094) The statistics entry on the new admin UI is very slow
Date Sat, 04 Feb 2012 00:15:53 GMT

     [ https://issues.apache.org/jira/browse/SOLR-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Erick Erickson updated SOLR-3094:
---------------------------------

    Attachment: SOLR-3094.patch

OK, anyone with good javascript skills, this would be a good time to chime in...

This is a variant of SOLR-1931. The new UI calls Luke at the top level in such a way that
it enumerates all the terms in all the fields to gather the histogram data, which takes a
long time. Note, this is what the old admin UI/Luke handler did when you clicked "schema browser"
link.

Once that data is accumulated, then clicking on the individual fields and showing that data
is very fast since the data is local. But this data is accumulated *before* any field is selected
from the "schema browser" drop-down and stored away.

I think this design is too costly, especially the "get all the data for all the fields up-front"
bit. The users pay a penalty (many minutes demonstrated) even when they may only care about
one field. So here's what I propose.

1> Tweak the LukeRequestHandler so it *requires* the fieldName parameter to gather the
historgram data. That fixes the initial display of the stats issue that sparked this JIRA.
I can do that in a few minutes, patch attached (do not commit yet, though). Problem is there
is then no way at all to get the stats data.

2> Tweak the javascript to call the luke request handler to collect the data for individual
fields only when the user selects them from the drop-down, stowing them away at that point
so they can be revisited if desired. Here's where I could use some help, my javascript skills
are rudimentary at best. If anyone could work the javascript I'd be happy to field test. Or
even just put some comments in the code pointing me to them. Any trunk code from after 6-Jan
will have the right Luke handler in it (see SOLR-1931).

There's also something wrong with the display of the histogram, the "bucket" and count in
each bucket are mashed together on the bottom. With non-trivial indexes, this is largely unreadable
since they're side-by-side...

Anyway, the attached patch makes it so you can get into the admin page without paying the
above penalties, but you *never* get histogram data when you go into "schema browser". If
someone applies this to work on the admin UI bit, attaching "&fl=field1 field2" to the
luke URL will cause the histogram data to be returned for the field(s) specified.

If anyone has some spare cycles to help out here it would be outstanding.

I think something similar could be done for the old admin UI as well in terms of only getting
the fields when requested, otherwise the histogram data won't be returned either...
                
> The statistics entry on the new admin UI is very slow
> -----------------------------------------------------
>
>                 Key: SOLR-3094
>                 URL: https://issues.apache.org/jira/browse/SOLR-3094
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 4.0
>         Environment: trunk only, all environments
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>         Attachments: SOLR-3094.patch
>
>
> Prompted by Robert Reynolds (SOLR-2667), the entry point in the new Admin UI core drill
down (e.g. clicking "singlecore" takes a long time. 28-46 *minutes* on a 13M-23M doc set.
> On an example Wikipedia index (11M) docs, I see 21 seconds, compared to less than 2 seconds
in the old admin UI (I'm using the old admin UI linked to from the new UI page on trunk).
I have a very simple index layout compared to a commercial site. Clearly something is not
right. I suspect that all the terms are being walked.
> This is particularly an issue because this behavior happens when I click "singlecore",
so getting to the really neat parts of the new UI is hard.
> Robert reports on a separate thread that the same behavior happens just hitting admin/luke
in the URL which is also slow in the 3.x world, which hints at where the problem lies.
> I'm going to guess that the terms are being walked and we can use the tricks used in
SOLR-1931 to deal with the fact that admin/luke takes a long time, and just change the call
to the entry ("singlecore") for this issue.
> Robert: Thanks for pointing this out!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message