lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Phrase Fields performance
Date Tue, 04 Apr 2017 18:28:14 GMT
bq: ...and reducing the boost values to much smaller numbers...

Not sure why that would matter for performance; multiplying is
multiplying, although reducing the boost on the default field might
have added up to a _lot_ of math ops.

Or is the boosting just a way to change the ranking to something you
can live with and not really a comment on performance?

I suspect the big difference is reducing the number of fields in "qf"
and my guess would be that the two fields omitted are larger text
fields.

FWIW,
Erick
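
To make the field-count point concrete, here is a rough sketch, not a description of Solr's actual internals. It builds edismax request parameters and estimates how many per-field clauses a query fans out to, under the simplifying assumption that each query term is matched against every `qf` field and that each `pf` field adds one phrase clause for multi-term queries. The field names and boosts are hypothetical; the real parsed query also depends on analysis, `mm`, stopwords, and other settings.

```python
def edismax_params(q, qf, pf=None):
    """Build edismax request parameters.

    qf and pf map field names to boosts (hypothetical fields below).
    """
    params = {
        "defType": "edismax",
        "q": q,
        "qf": " ".join(f"{field}^{boost}" for field, boost in qf.items()),
    }
    if pf:
        params["pf"] = " ".join(f"{field}^{boost}" for field, boost in pf.items())
    return params


def rough_clause_count(q, qf, pf=None):
    """Crude estimate of per-field clauses, NOT Solr's real query plan.

    Assumes whitespace-separated terms, one term clause per (term, qf field)
    pair, and one phrase clause per pf field when there are 2+ terms.
    """
    terms = q.split()
    term_clauses = len(terms) * len(qf)
    phrase_clauses = len(pf) if pf and len(terms) > 1 else 0
    return term_clauses + phrase_clauses


# Hypothetical field sets: six fields versus four, with small boosts.
six = {"title": 4, "author": 3, "subject": 3, "series": 2, "notes": 2, "body": 1}
four = {"title": 4, "author": 3, "subject": 3, "body": 1}

query = "solr phrase fields"
print(rough_clause_count(query, six))        # qf only, six fields
print(rough_clause_count(query, six, six))   # qf + pf, six fields
print(rough_clause_count(query, four, four)) # qf + pf, four fields
print(edismax_params(query, four, four))
```

Under this model, dropping two fields from both `qf` and `pf` removes a third of the clauses for any query, while changing a boost value leaves the clause count untouched, which is consistent with the guess above that the field reduction did most of the work.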

On Tue, Apr 4, 2017 at 8:36 AM, David Hastings
<hastings.recursive@gmail.com> wrote:
> FYI, I think I managed to get the results and the speeds that I wanted
> back by reducing the number of fields in the qf/pf values from 6 to 4,
> making sure not to boost the default field, and reducing the boost values
> to much smaller numbers that are still significant enough to boost properly.
> Query times went from around .3 seconds pre qf/pf, to above 1 sec with the
> aggressive settings, and are now back down to around half a second with the
> modified values, which I can live with. Also, if anyone else, like myself,
> stores qtimes in a table, this is a good 15-minute rolling average SQL query
> you may or may not find useful:
>
>
> SELECT when_done AS timestamp, AVG( qtime ), count(id)
> FROM qtimes
> WHERE `when_done` >= '2017-03-23 09:00:00'
>   AND `when_done` <= '2017-03-23 13:00:00'
> GROUP BY year(when_done), month(when_done), day(when_done),
>          ( 4 * HOUR( when_done ) + FLOOR( MINUTE( when_done ) / 15 ))
> ORDER BY `qtimes`.`when_done` ASC;
>
>
>
>
>
> pre qf/pf values:
> | timestamp           | AVG( qtime ) | count(id) |
> +---------------------+--------------+-----------+
> | 2017-03-23 09:00:00 |     322.0585 |       581 |
> | 2017-03-23 09:15:01 |     243.9634 |       628 |
> | 2017-03-23 09:30:00 |     347.1856 |       652 |
> | 2017-03-23 09:45:03 |     407.3195 |       673 |
> | 2017-03-23 10:00:02 |     307.1313 |       678 |
> | 2017-03-23 10:15:00 |     266.9802 |       759 |
> | 2017-03-23 10:30:01 |     288.1789 |       833 |
> | 2017-03-23 10:45:01 |     275.0880 |       852 |
> | 2017-03-23 11:00:02 |     417.0151 |       861 |
> | 2017-03-23 11:15:01 |     267.1153 |       945 |
> | 2017-03-23 11:30:00 |     387.1656 |       803 |
> | 2017-03-23 11:45:00 |     268.5137 |       837 |
> | 2017-03-23 12:00:00 |     294.5911 |       807 |
> | 2017-03-23 12:15:00 |     411.8617 |       752 |
> | 2017-03-23 12:30:00 |     478.3566 |       788 |
> | 2017-03-23 12:45:01 |     262.2294 |       680 |
>
>
>
> after pf/qf values but too aggressive:
>
> | timestamp           | AVG( qtime ) | count(id) |
> +---------------------+--------------+-----------+
> | 2017-04-03 09:00:04 |    1002.1900 |       600 |
> | 2017-04-03 09:15:04 |     873.2367 |       659 |
> | 2017-04-03 09:30:00 |    1013.9041 |       563 |
> | 2017-04-03 09:45:01 |    1256.8596 |       591 |
> | 2017-04-03 10:00:08 |    1092.8582 |       663 |
> | 2017-04-03 10:15:00 |    1322.4262 |       671 |
> | 2017-04-03 10:30:06 |     848.1130 |       770 |
> | 2017-04-03 10:45:00 |    1039.3202 |       887 |
> | 2017-04-03 11:00:00 |    1144.9216 |       536 |
> | 2017-04-03 11:15:02 |     620.8999 |       719 |
> | 2017-04-03 11:30:03 |     999.7113 |       665 |
> | 2017-04-03 11:45:00 |    1144.1348 |       564 |
> | 2017-04-03 12:00:01 |    1317.7461 |       453 |
> | 2017-04-03 12:15:02 |    1413.5864 |       573 |
> | 2017-04-03 12:30:02 |     746.9422 |       623 |
> | 2017-04-03 12:45:00 |    1088.4789 |       568 |
>
>
> and finally modified pf/qf values changed at exactly 10:46 am today:
>
>
> +---------------------+--------------+-----------+
> | timestamp           | AVG( qtime ) | count(id) |
> +---------------------+--------------+-----------+
> | 2017-04-04 09:00:00 |    1079.3983 |       605 |
> | 2017-04-04 09:15:04 |    1190.4540 |       544 |
> | 2017-04-04 09:30:00 |    1459.6425 |       621 |
> | 2017-04-04 09:45:00 |    2074.2777 |       677 |
> | 2017-04-04 10:00:01 |    1555.0798 |       664 |
> | 2017-04-04 10:15:00 |    1313.1793 |       697 |
> | 2017-04-04 10:30:00 |    1042.4969 |       809 |
> | 2017-04-04 10:45:00 |     773.2043 |       695 |
> | 2017-04-04 11:00:00 |     526.7830 |       788 |
> | 2017-04-04 11:15:01 |     470.1969 |       711 |
> | 2017-04-04 11:30:02 |     642.1838 |       136 |
>
>
>
>
> On Sat, Apr 1, 2017 at 11:13 AM, Dave <hastings.recursive@gmail.com> wrote:
>
>> Maybe commongrams could help this but it boils down to
>> speed/quality/cheap. Choose two. Thanks
>>
>> > On Apr 1, 2017, at 10:28 AM, Shawn Heisey <apache@elyograg.org> wrote:
>> >
>> >> On 3/31/2017 1:55 PM, David Hastings wrote:
>> >> So I uncommented the line, to enable it to go against 6 important
>> >> fields. Afterwards through monitoring performance I noticed that my
>> >> searches were taking roughly 50% to 100% (2x!) longer, and it started
>> >> at the exact time I committed that change, 1:40 pm, qtimes below in a
>> >> 15 minute average cycle with the start time listed.
>> >
>> > That is fully expected.  Using both pf and qf basically has Solr doing
>> > the exact same queries twice, once as specified on fields in qf, then
>> > again as a phrase query on fields in pf.  If you add pf2 and/or pf3, you
>> > can expect further speed drops.
>> >
>> > If you're sorting by relevancy, using pf with higher boosts than qf
>> > generally will make your results better, but it comes at a cost in
>> > performance.
>> >
>> > Thanks,
>> > Shawn
>> >
>>
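
For readers who keep query times somewhere other than MySQL, the grouping used in the SQL query quoted above can be sketched in Python. This is a minimal illustration, assuming the rows are available as `(datetime, qtime)` pairs; the function names are made up for this example. Strictly speaking, the grouping produces fixed 15-minute buckets rather than a sliding window, and this sketch reports the earliest timestamp seen in each bucket.

```python
from collections import defaultdict
from datetime import datetime


def bucket_key(ts):
    # Same expression as the GROUP BY: calendar date plus a
    # quarter-hour index, 4 * hour + minute // 15 (0..95 per day).
    return (ts.year, ts.month, ts.day, 4 * ts.hour + ts.minute // 15)


def bucket_averages(rows):
    """Group (datetime, qtime) rows into 15-minute buckets.

    Returns a list of (first_timestamp, average_qtime, count),
    ordered by bucket.
    """
    buckets = defaultdict(list)
    for ts, qtime in rows:
        buckets[bucket_key(ts)].append((ts, qtime))

    result = []
    for key in sorted(buckets):
        entries = buckets[key]
        first_ts = min(ts for ts, _ in entries)
        average = sum(q for _, q in entries) / len(entries)
        result.append((first_ts, average, len(entries)))
    return result


# Made-up sample rows, only to show the shape of the input and output.
sample = [
    (datetime(2017, 3, 23, 9, 0, 0), 300),
    (datetime(2017, 3, 23, 9, 14, 59), 100),
    (datetime(2017, 3, 23, 9, 15, 1), 250),
]
for first_ts, average, count in bucket_averages(sample):
    print(first_ts, average, count)
```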
