lucene-solr-user mailing list archives

From Alessandro Benedetti <benedetti.ale...@gmail.com>
Subject Re: [Solr 6] Migration from Solr 4.10.2
Date Tue, 24 May 2016 16:43:46 GMT
Update: it seems clear that I ran into the bad faceting regression
https://issues.apache.org/jira/browse/SOLR-8096 :

Just adding some additional information, as I just ran into the issue
with Solr 6.0:
Static index, around 50 * 10^6 docs, 20 fields to facet on, 1 of them with high
cardinality, on top of grouping.
Grouping was not affecting the timings at all.

All the symptoms are there: Solr 4.10.2 takes around 150 ms and Solr 6.0 around
550 ms.
The 'fieldValueCache' seems to be unused (no inserts nor lookups) in Solr
6.0.
In Solr 4.10 the 'fieldValueCache' is in heavy use, with a
cumulative_hitratio of 0.96.
Switching facet.method from enum to fc to fcs to uif did not change much.
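
For illustration, a minimal Python sketch of how the classic (request-parameter) facet requests for each facet.method value could be generated for such a comparison. The field names and the helper functions are hypothetical; only the parameter names (facet, facet.field, facet.method) and the four method values come from the thread.

```python
from urllib.parse import urlencode

# The four facet.method values compared in this thread.
FACET_METHODS = ("enum", "fc", "fcs", "uif")


def facet_params(fields, method, query="*:*"):
    """Build classic facet request parameters as (name, value) pairs.

    `fields` is a list of field names to facet on (hypothetical names here).
    """
    if method not in FACET_METHODS:
        raise ValueError(f"unknown facet.method: {method}")
    params = [
        ("q", query),
        ("rows", "0"),            # only facet counts are of interest
        ("facet", "true"),
        ("facet.method", method),
    ]
    # One facet.field parameter per field.
    params += [("facet.field", name) for name in fields]
    return params


def facet_query_string(fields, method, query="*:*"):
    """URL-encode the facet parameters into a query string."""
    return urlencode(facet_params(fields, method, query))


if __name__ == "__main__":
    # Hypothetical field names, one request per facet method.
    for method in FACET_METHODS:
        print(facet_query_string(["category", "brand"], method))
```

Each resulting query string can be appended to the select handler URL of the instance under test, so that the same request is replayed with only facet.method changing.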

Moving to DocValues didn't improve the situation much (but I was on
an optimized index, so I need to try a multi-segment one, according
to Mikhail Khludnev's
<https://issues.apache.org/jira/secure/ViewProfile.jspa?name=mkhludnev>
contribution in Solr 5.4.0).

Moving to field collapsing brought the query time down to 110-120 ms (but this is
expected: we were faceting on 260 / 1 million original docs).
Adding facet.threads=NCores brought the query time down to 100 ms; in
combination with field collapsing we reached 80-90 ms when warmed.
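
As a rough sketch of the combination described above, the following Python builds a collapse filter query together with facet.threads. The collapse syntax follows the fq quoted further down in this thread, and hint=top_fc is the hint Joel suggests there; the helper names, the facet field name and the thread count are assumptions for illustration only.

```python
from urllib.parse import urlencode


def collapse_fq(field, sort=None, hint=None):
    """Build a {!collapse} filter query for the given field.

    `sort` selects the group head (e.g. "TrieDoubleField asc");
    `hint` can be "top_fc" to use a top-level field cache.
    """
    parts = [f"field={field}"]
    if sort:
        parts.append(f"sort='{sort}'")
    if hint:
        parts.append(f"hint={hint}")
    return "{!collapse " + " ".join(parts) + "}"


def collapsed_facet_params(collapse_field, facet_fields, threads,
                           sort=None, hint=None):
    """Combine collapsing with multi-threaded classic faceting.

    `threads` should reflect the number of cores on the Solr server,
    which the client cannot detect; it is passed in explicitly.
    """
    params = [
        ("q", "*:*"),
        ("fq", collapse_fq(collapse_field, sort=sort, hint=hint)),
        ("facet", "true"),
        ("facet.threads", str(threads)),
    ]
    params += [("facet.field", name) for name in facet_fields]
    return params


if __name__ == "__main__":
    # Hypothetical facet field and thread count.
    params = collapsed_facet_params(
        "string_field", ["category"], threads=8, hint="top_fc")
    print(urlencode(params))
```

Timings from such requests are only comparable when taken on warmed instances, as noted above.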

What are the plans for the future related to this?
Do we want to deprecate the legacy facets implementation and move
everything to JSON facets (as happened with the UIF)?
So backward compatible, but with a different implementation?

I think migrations should be a transparent process.


Cheers

On Mon, May 23, 2016 at 6:49 PM, Alessandro Benedetti <
benedetti.alex85@gmail.com> wrote:

> Furthermore, I was checking the internals of the old facet implementation (
> which is used with the classic request-parameter-based API, instead of the
> JSON facet API). It seems that if you enable docValues, even with the enum
> method passed as a parameter, fc with docValues will actually be used.
> I will report on the performance we get with docValues.
>
> Cheers
> On 23 May 2016 16:29, "Joel Bernstein" <joelsolr@gmail.com> wrote:
>
>> If you can make min/max work for you instead of sort then it should be
>> faster, but I haven't spent time comparing the performance.
>>
>> But if you're using the top_fc with the min/max param the performance
>> between Solr 4 & Solr 6 should be very close as the data structures behind
>> them are the same.
>>
>>
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Mon, May 23, 2016 at 3:34 PM, Alessandro Benedetti <
>> abenedetti@apache.org> wrote:
>>
>> > Hi Joel,
>> > thanks for the reply. Actually, we were not using field collapsing before;
>> > we basically want to replace grouping with it.
>> > The grouping performance of Solr 4 and Solr 6 is basically comparable.
>> > It's surprising that I got such a big degradation with field collapsing.
>> >
>> > So the comparisons we did were based on the Solr 4 queries,
>> > extracted from logs and modified slightly to include the field collapsing
>> > parameter.
>> >
>> > To build the tests to compare Solr 4.10.2 to Solr 6 we basically proceeded
>> > in this way:
>> >
>> > 1) install Solr 4.10.2 and Solr 6.0.0
>> > 2) migrate the index with the related Lucene tool ( 4.10.2 -> 5.5.0 -> Solr 6.0 )
>> > 3) switch the 2 instances on/off and repeat the tests with both cold
>> > and warm instances.
>> >
>> > This means that the queries look the same.
>> > I have not double-checked the results, only the timings.
>> > I will provide additional feedback on whether the queries are producing
>> > comparable results as well.
>> >
>> > Regarding your suggestion about top_fc: thanks, I will try that.
>> > I actually discovered it a little after I posted to the mailing list (I
>> > think from another post of yours :) )
>> >
>> > Not sure if setting up docValues for the field we use to collapse could
>> > give some benefit as well.
>> >
>> > I will keep you updated,
>> >
>> > Cheers
>> >
>> > On Mon, May 23, 2016 at 2:48 PM, Joel Bernstein <joelsolr@gmail.com>
>> > wrote:
>> >
>> > > Were you using the sort param or min/max param in Solr 4 to select the
>> > > group head? The sort work came later and I'm not sure how it compares in
>> > > performance to the min/max param.
>> > >
>> > > Since you are collapsing on a string field you can use the top_fc hint,
>> > > which will use a top-level field cache for the collapse. This is faster at
>> > > query time than the default, which uses a MultiDocValues ordinal map.
>> > >
>> > > The docs cover the top_fc hint.
>> > >
>> > >
>> > > https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results
>> > >
>> > >
>> > >
>> > > Joel Bernstein
>> > > http://joelsolr.blogspot.com/
>> > >
>> > > On Mon, May 23, 2016 at 12:14 PM, Alessandro Benedetti <
>> > > abenedetti@apache.org> wrote:
>> > >
>> > > > Let's add some additional details, guys:
>> > > >
>> > > > 1) *Faceting*
>> > > > Currently the facet method used is "enum" and it runs over 20 fields, more
>> > > > or less.
>> > > > We mainly use it on low-cardinality fields, except one which has a
>> > > > cardinality of 1000 terms.
>> > > > I am aware of the well-known Jira issue about the faceting regression:
>> > > > https://issues.apache.org/jira/browse/SOLR-8096 .
>> > > >
>> > > > Our index is indeed quite static (we index once per day) and the fields we
>> > > > facet on are multi-valued (by schema definition, but not in practice).
>> > > > But we use Term Enum as the method, so I was not expecting to hit the
>> > > > regression.
>> > > > We currently see query times which are 30% worse than Solr 4.10.2.
>> > > > Our next experiment will be to enable docValues for all the fields and
>> > > > verify whether we get any benefit (switching the facet method to fc).
>> > > > At the moment, switching to JSON faceting is not an option, as we would
>> > > > like first to proceed with a transparent migration and then possibly add
>> > > > improvements and refactor in the future.
>> > > > The following step will be to fix the schema so that only what is
>> > > > really multi-valued is declared as multi-valued (do you know if this can
>> > > > have an effect? Is the wrong schema definition enough to mess up the
>> > > > facet performance, even if the fields are actually single-valued?)
>> > > >
>> > > >
>> > > > 2) *Field Collapsing*
>> > > > Field collapsing performance seems much, much worse: something like 200 ms
>> > > > (Solr 4) vs 1800 ms (Solr 6).
>> > > > This is surprising, as I have never heard about any regression in field
>> > > > collapsing.
>> > > > I will investigate the internals of field collapsing in more detail to
>> > > > understand why the performance could be so degraded.
>> > > > I will also check whether I can find any info in the mailing list or Jira.
>> > > >
>> > > > &fq={!collapse field=string_field sort='TrieDoubleField asc'}
>> > > >
>> > > > Let me know if you have faced something similar.
>> > > >
>> > > > Cheers
>> > > >
>> > > > On Fri, May 13, 2016 at 10:41 PM, Alessandro Benedetti <
>> > > > abenedetti@apache.org> wrote:
>> > > >
>> > > > > I'm planning a migration from 4.10.2 to 6.0.
>> > > > > Because we generate the index from scratch on a daily basis, we don't
>> > > > > need to migrate the index, only the server instances.
>> > > > > With my team we were doing some experiments on some dev machines,
>> > > > > basically comparing Solr 4.10.2 and Solr 6.0 to check for any functional
>> > > > > and performance regressions in our use cases.
>> > > > >
>> > > > > After setting up two installations on the same machine (switching each
>> > > > > version on and off for comparisons and experiments) we are observing a
>> > > > > degradation in performance with Solr 6.
>> > > > >
>> > > > > Basically, from a queryTime and throughput perspective, Solr 6 is not
>> > > > > performing as well as Solr 4.10.2.
>> > > > > We still need to start the proper investigation, but this appears weird
>> > > > > to me.
>> > > > > We will proceed with a full analysis of the case and a deep study of our
>> > > > > queries (which anyway are mainly fq, faceting and grouping).
>> > > > >
>> > > > > Any suggestion in particular to start with? Has anyone experienced a
>> > > > > similar migration with similar results?
>> > > > > I will also explore the mailing list in search of similar cases.
>> > > > >
>> > > > > Cheers
>> > > > >
>> > > > > --
>> > > > > --------------------------
>> > > > >
>> > > > > Benedetti Alessandro
>> > > > > Visiting card : http://about.me/alessandro_benedetti
>> > > > >
>> > > > > "Tyger, tyger burning bright
>> > > > > In the forests of the night,
>> > > > > What immortal hand or eye
>> > > > > Could frame thy fearful symmetry?"
>> > > > >
>> > > > > William Blake - Songs of Experience -1794 England
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > >
>> >
>> >
>> >
>> >
>>
>


-- 
--------------------------

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
