lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From RAUNAK AGRAWAL <agrawal.rau...@gmail.com>
Subject Re: Solr Aggregation queries are way slower than Elastic Search
Date Tue, 12 Dec 2017 14:17:00 GMT
Hi Yonik,

So if the query is fine then I guess even using JSON Facet API will not
help me here. Can you suggest me some other idea or further tuning which
will help me in reducing the latency?

On Tue, Dec 12, 2017 at 7:27 PM, Yonik Seeley <yseeley@gmail.com> wrote:

> OK great, so it's definitely not the main query (which is just a
> single term query in this example!)
>
> > Also I have looked into the JSON Facet API. If I have to use facets, I
> will
> > have to then define 3600 facets in a single query and I guess that would
> be
> > also slow.
>
> You can ask for any number of stats for a given facet (even the root
> facet bucket w/o faceting on any fields):
>
> cutl 'http://localhost:8983/solr/collection1.query?q=variable1:
> 290&rows=0&json.facet={
>   s1:"sum(metric_1)",
>   s2:"sum(metric_2)",
>   s3:"sum(metric_3)"
> }'
>
> -Yonik
>
>
> On Tue, Dec 12, 2017 at 5:40 AM, RAUNAK AGRAWAL
> <agrawal.raunak@gmail.com> wrote:
> > Hi Yonik,
> >
> > As you asked here is the code snippet and the actual solr query. Please
> > have a look. I have included only 104 metrics but like this we can go
> upto
> > 3600 rollups.
> >
> > Also I have looked into the JSON Facet API. If I have to use facets, I
> will
> > have to then define 3600 facets in a single query and I guess that would
> be
> > also slow. Also is there any max limit on the number of facets we can
> > define in a single query?
> >
> > Code snippet:
> >
> > private SolrQuery buildQuery(Integer variable1, List<String> metrics) {
> >     SolrQuery query = new SolrQuery();
> >     query.set("q", "variable1:" + variable1);
> >     query.setRows(0);
> >     metrics.forEach(
> >             metric -> query.setGetFieldStatistics("{!sum=true }" +
> metric)
> >     );
> >     return query;
> > }
> >
> >
> > The generated query:
> >
> > {! q=variable1:290 rows=0 stats=true stats.field='{!sum=true
> > }metric_1' stats.field='{!sum=true }metric_2' stats.field='{!sum=true
> > }metric_3' stats.field='{!sum=true }metric_4' stats.field='{!sum=true
> > }metric_5' stats.field='{!sum=true }metric_6' stats.field='{!sum=true
> > }metric_7' stats.field='{!sum=true }metric_8' stats.field='{!sum=true
> > }metric_9' stats.field='{!sum=true }metric_10' stats.field='{!sum=true
> > }metric_11' stats.field='{!sum=true }metric_12'
> > stats.field='{!sum=true }metric_13' stats.field='{!sum=true
> > }metric_14' stats.field='{!sum=true }metric_15'
> > stats.field='{!sum=true }metric_16' stats.field='{!sum=true
> > }metric_17' stats.field='{!sum=true }metric_18'
> > stats.field='{!sum=true }metric_19' stats.field='{!sum=true
> > }metric_20' stats.field='{!sum=true }metric_21'
> > stats.field='{!sum=true }metric_22' stats.field='{!sum=true
> > }metric_23' stats.field='{!sum=true }metric_24'
> > stats.field='{!sum=true }metric_25' stats.field='{!sum=true
> > }metric_26' stats.field='{!sum=true }metric_27'
> > stats.field='{!sum=true }metric_28' stats.field='{!sum=true
> > }metric_29' stats.field='{!sum=true }metric_30'
> > stats.field='{!sum=true }metric_31' stats.field='{!sum=true
> > }metric_32' stats.field='{!sum=true }metric_33'
> > stats.field='{!sum=true }metric_34' stats.field='{!sum=true
> > }metric_35' stats.field='{!sum=true }metric_36'
> > stats.field='{!sum=true }metric_37' stats.field='{!sum=true
> > }metric_38' stats.field='{!sum=true }metric_39'
> > stats.field='{!sum=true }metric_40' stats.field='{!sum=true
> > }metric_41' stats.field='{!sum=true }metric_42'
> > stats.field='{!sum=true }metric_43' stats.field='{!sum=true
> > }metric_44' stats.field='{!sum=true }metric_45'
> > stats.field='{!sum=true }metric_46' stats.field='{!sum=true
> > }metric_47' stats.field='{!sum=true }metric_48'
> > stats.field='{!sum=true }metric_49' stats.field='{!sum=true
> > }metric_50' stats.field='{!sum=true }metric_51'
> > stats.field='{!sum=true }metric_52' stats.field='{!sum=true
> > }metric_53' stats.field='{!sum=true }metric_54'
> > stats.field='{!sum=true }metric_55' stats.field='{!sum=true
> > }metric_56' stats.field='{!sum=true }metric_57'
> > stats.field='{!sum=true }metric_58' stats.field='{!sum=true
> > }metric_59' stats.field='{!sum=true }metric_60'
> > stats.field='{!sum=true }metric_61' stats.field='{!sum=true
> > }metric_62' stats.field='{!sum=true }metric_63'
> > stats.field='{!sum=true }metric_64' stats.field='{!sum=true
> > }metric_65' stats.field='{!sum=true }metric_66'
> > stats.field='{!sum=true }metric_67' stats.field='{!sum=true
> > }metric_68' stats.field='{!sum=true }metric_69'
> > stats.field='{!sum=true }metric_70' stats.field='{!sum=true
> > }metric_71' stats.field='{!sum=true }metric_72'
> > stats.field='{!sum=true }metric_73' stats.field='{!sum=true
> > }metric_74' stats.field='{!sum=true }metric_75'
> > stats.field='{!sum=true }metric_76' stats.field='{!sum=true
> > }metric_77' stats.field='{!sum=true }metric_78'
> > stats.field='{!sum=true }metric_79' stats.field='{!sum=true
> > }metric_80' stats.field='{!sum=true }metric_81'
> > stats.field='{!sum=true }metric_82' stats.field='{!sum=true
> > }metric_83' stats.field='{!sum=true }metric_84'
> > stats.field='{!sum=true }metric_85' stats.field='{!sum=true
> > }metric_86' stats.field='{!sum=true }metric_87'
> > stats.field='{!sum=true }metric_88' stats.field='{!sum=true
> > }metric_89' stats.field='{!sum=true }metric_90'
> > stats.field='{!sum=true }metric_91' stats.field='{!sum=true
> > }metric_92' stats.field='{!sum=true }metric_93'
> > stats.field='{!sum=true }metric_94' stats.field='{!sum=true
> > }metric_95' stats.field='{!sum=true }metric_96'
> > stats.field='{!sum=true }metric_97' stats.field='{!sum=true
> > }metric_98' stats.field='{!sum=true }metric_99'
> > stats.field='{!sum=true }metric_100' stats.field='{!sum=true
> > }metric_101' stats.field='{!sum=true }metric_102'
> > stats.field='{!sum=true }metric_103' stats.field='{!sum=true
> > }metric_104'}
> >
> >
> >
> >
> > On Tue, Dec 12, 2017 at 10:21 AM, RAUNAK AGRAWAL <
> agrawal.raunak@gmail.com>
> > wrote:
> >
> >> Hi Yonik,
> >>
> >> I will try the JSON Facet API and update here but my hunch is that
> >> querying mechanism is not the problem. Rather the problem lies with the
> >> solr aggregations.
> >>
> >> Thanks
> >>
> >> On Tue, Dec 12, 2017 at 6:31 AM, Yonik Seeley <yseeley@gmail.com>
> wrote:
> >>
> >>> I think the SolrJ below uses the old stats component.
> >>> Hopefully the JSON Facet API would be faster for this, but it's not
> >>> completely clear what the main query here looks like, and if it's the
> >>> source of any bottleneck rather than the aggregations.
> >>> What does the generated query string actually look like (it may be
> >>> easiest just to pull this from the logs).
> >>>
> >>> -Yonik
> >>>
> >>>
> >>> On Mon, Dec 11, 2017 at 7:32 PM, RAUNAK AGRAWAL
> >>> <agrawal.raunak@gmail.com> wrote:
> >>> > Hi,
> >>> >
> >>> > We have a use case where there are 4-5 dimensions and around 3500
> >>> metrics
> >>> > in a single document. We have indexed only 2 dimensions and made all
> the
> >>> > metrics as doc_values so that we can run the aggregation queries.
> >>> >
> >>> > We have 6 million such documents and we are using solr cloud(6.6) on
> a 6
> >>> > node cluster with 8 Vcores and 24 GB RAM each.
> >>> >
> >>> > On the same set of hardware in elastic search we were getting the
> >>> response
> >>> > in about 10ms whereas in solr we are getting response in around
> 300-400
> >>> ms.
> >>> >
> >>> > This is how I am querying the data.
> >>> >
> >>> > private SolrQuery buildQuery(Integer variable1, List<Integer>
groups,
> >>> > List<String> metrics) {
> >>> >     SolrQuery query = new SolrQuery();
> >>> >     String groupQuery = " (" + groups.stream().map(g -> "group:"
+
> >>> g).collect
> >>> >             (Collectors.joining(" OR ")) + ")";
> >>> >     String finalQuery = "variable1:" + variable1 + " AND " +
> groupQuery;
> >>> >     query.set("q", finalQuery);
> >>> >     query.setRows(0);
> >>> >     metrics.forEach(
> >>> >             metric -> query.setGetFieldStatistics("{!sum=true }"
+
> >>> metric)
> >>> >     );
> >>> >     return query;
> >>> > }
> >>> >
> >>> > Any help will be appreciated regarding this.
> >>> >
> >>> >
> >>> > Thanks,
> >>> >
> >>> > Raunak
> >>>
> >>
> >>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message