lucene-solr-user mailing list archives

From Tomás Fernández Löbbe <tomasflo...@gmail.com>
Subject Re: Slow faceting performance on a docValues field
Date Tue, 13 Jan 2015 19:12:17 GMT
No, you are not misreading. Right now there is no automatic way to generate
the intervals on the server side, the way range faceting does from its gap
parameter, so I guess it won't work in your case. Maybe you should create a
Jira issue to add this feature to interval faceting.
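
Until something like that exists, the intervals have to be generated on the
client. Below is a minimal sketch in Python, not a tested recipe: the helper
name daily_interval_params is made up for illustration, the field name
eventDate is taken from this thread, and it uses half-open intervals
([start,end)) instead of the 23:59:59.999 end bound shown in the quoted
message, on the assumption that interval faceting accepts the usual
inclusive/exclusive bracket syntax for date fields.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def daily_interval_params(field, start, end):
    """Build interval-facet request params, one interval per day in [start, end)."""
    params = [("facet", "true"), ("facet.interval", field)]
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        # '[' is an inclusive start and ')' an exclusive end, so adjacent
        # days never count the same document twice.
        interval = f"[{day.isoformat()}T00:00:00Z,{next_day.isoformat()}T00:00:00Z)"
        params.append((f"f.{field}.facet.interval.set", interval))
        day = next_day
    return params


# Ten years of daily intervals for the eventDate field from this thread.
params = daily_interval_params("eventDate", date(2005, 1, 1), date(2015, 1, 1))

# URL-encoded form body; with thousands of intervals it is probably safer
# to send this as a POST body than in a GET query string.
body = urlencode(params)
```

The main query parameters (q, rows, and so on) would still need to be added to
the same list before encoding.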

Tomás

On Tue, Jan 13, 2015 at 10:44 AM, David Smith <dsmithsolr@yahoo.com.invalid>
wrote:

> Tomás,
>
>
> Thanks for the response -- the performance of my query makes perfect sense
> in light of your information.
> I looked at interval faceting.  My required interval is 1 day.  I cannot
> change that requirement.  Unless I am misreading the doc, that means that to
> facet a 10-year range, the query needs to specify over 3,600 intervals?
>
>
> f.eventDate.facet.interval.set=[2005-01-01T00:00:00.000Z,2005-01-01T23:59:59.999Z]&f.eventDate.facet.interval.set=[2005-01-02T00:00:00.000Z,2005-01-02T23:59:59.999Z]&etc,etc
>
>
> Each query would be 185MB in size if I structure it this way.
>
> I assume I must be misunderstanding how to use interval faceting with
> dates.  Are there any concrete examples you know of?  A Google search did
> not come up with much.
>
> Kind regards,
> Dave
>
>      On Tuesday, January 13, 2015 12:16 PM, Tomás Fernández Löbbe <
> tomasflobbe@gmail.com> wrote:
>
>
>  Range faceting won't use DocValues even if they are set; it translates
> each gap to a filter. This means that it will end up using the
> FilterCache, which should make follow-up queries faster if you repeat the
> same gaps (and don't commit).
> You may also want to try interval faceting, which uses DocValues instead
> of filters. The API is different: you'll have to provide the intervals
> yourself.
>
> Tomás
>
> On Tue, Jan 13, 2015 at 10:01 AM, Shawn Heisey <apache@elyograg.org>
> wrote:
>
> > On 1/13/2015 10:35 AM, David Smith wrote:
> > > I have a query against a single 50M doc index (175GB) using Solr 4.10.2,
> > > that exhibits the following response times (via the debugQuery option in
> > > Solr Admin):
> > > "process": {
> > >  "time": 24709,
> > >  "query": { "time": 54 }, "facet": { "time": 24574 },
> > >
> > >
> > > The query time of 54ms is great and exactly as expected -- this example
> > > was a single-term search that returned 3 hits.
> > > I am trying to get the facet time (24.5 seconds) to be sub-second, and
> > > am having no luck.  The facet part of the query is as follows:
> > >
> > > "params": {
> > >  "facet.range": "eventDate",
> > >  "f.eventDate.facet.range.end": "2015-05-13T16:37:18.000Z",
> > >  "f.eventDate.facet.range.gap": "+1DAY",
> > >  "start": "0",
> > >  "rows": "10",
> > >  "f.eventDate.facet.range.start": "2005-03-13T16:37:18.000Z",
> > >  "f.eventDate.facet.mincount": "1",
> > >  "facet": "true",
> > >  "debugQuery": "true",
> > >  "_": "1421169383802"
> > >  }
> > >
> > > And, the relevant schema definition is as follows:
> > >
> > >    <field name="eventDate" type="tdate" indexed="true" stored="true"
> > >           multiValued="false" docValues="true"/>
> > >
> > >    <!-- A Trie based date field for faster date range queries and date
> > >         faceting. -->
> > >    <fieldType name="tdate" class="solr.TrieDateField" precisionStep="6"
> > >               positionIncrementGap="0"/>
> > >
> > >
> > > During the 25-second query, the Solr JVM pegs one CPU, with little or no
> > > I/O activity detected on the drive that holds the 175GB index.  I have 48GB
> > > of RAM, 1/2 of that dedicated to the OS and the other to the Solr JVM.
> > >
> > > I do NOT have any fieldValue caches configured as yet, because my
> > > (perhaps too simplistic?) reading of the documentation was that DocValues
> > > eliminates the need for a field-level cache on this facet field.
> >
> > 24GB of RAM to cache 175GB is probably not enough in the general case,
> > but if you're seeing very little disk I/O activity for this query, then
> > we'll leave that alone and you can worry about it later.
> >
> > What I would try immediately is setting the facet.method parameter to
> > enum and seeing what that does to the facet time.  I've had good luck
> > generally with that, even in situations where the docs indicated that
> > the default (fc) was supposed to work better.  I have never explored the
> > relationship between facet.method and docValues, though.
> >
> > I'm out of ideas after this.  I don't have enough experience with
> > faceting to help much.
> >
> > Thanks,
> > Shawn
> >
> >
>
>
>
