lucene-issues mailing list archives

From "mosh (Jira)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-13759) Optimize Queries when query filtering by TRA router.field
Date Tue, 12 Nov 2019 12:18:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972329#comment-16972329 ]

mosh commented on SOLR-13759:
-----------------------------

Example:
 Given time-based data stored in a TRA (say, IoT signals): today, querying a specific *date
range* queries *all* TRA collections (rather than only the collections that may hold the
desired data), and each collection then filters by the specified field.
 If the date field in the fq range is the router.field, I propose optimizing today's behavior
by *filtering* out irrelevant collections before they are queried at all.
 In HttpSolrCall#init:279:
{code:java}
collectionsList = resolveCollectionListOrAlias(queryParams.get(COLLECTION_PROP, def));{code}
Filtering collectionsList would look roughly like this:
{code:java}
collectionsList = collectionsList.stream()
    .filter(collectionName -> isDateInRange(fqDateRange, collectionName))
    .collect(Collectors.toList());
{code}
 This way we avoid redundant queries to collections that we are 100% sure do not store the
relevant data.
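
 To make the idea more concrete, here is a minimal, self-contained sketch of how such a filter
could decide which collections overlap the fq range. Everything in it is hypothetical and not
part of Solr: the class TraCollectionFilter, its methods, and the assumed collection naming
"<alias>__TRA__yyyy-MM-dd" (the real naming depends on the Solr version and the alias
configuration). It also assumes that each collection holds data from its own start time up
to the start of the next collection, and that the list is sorted by start time.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an existing Solr class.
final class TraCollectionFilter {

    // Assumed separator between the alias name and the start date in a collection name.
    static final String SEPARATOR = "__TRA__";

    // Parses the start instant (UTC midnight) out of a name like "iot__TRA__2019-11-12".
    static Instant startOf(String collectionName) {
        int idx = collectionName.lastIndexOf(SEPARATOR);
        if (idx < 0) {
            throw new IllegalArgumentException("Not a TRA collection name: " + collectionName);
        }
        String day = collectionName.substring(idx + SEPARATOR.length());
        return LocalDate.parse(day).atStartOfDay(ZoneOffset.UTC).toInstant();
    }

    // Keeps only the collections whose time slice [start, nextStart) overlaps the
    // half-open query range [from, to). The input must be sorted by ascending start.
    static List<String> filter(List<String> sortedCollections, Instant from, Instant to) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < sortedCollections.size(); i++) {
            Instant start = startOf(sortedCollections.get(i));
            // The newest collection is treated as open-ended.
            Instant end = (i + 1 < sortedCollections.size())
                    ? startOf(sortedCollections.get(i + 1))
                    : Instant.MAX;
            if (start.isBefore(to) && end.isAfter(from)) {
                result.add(sortedCollections.get(i));
            }
        }
        return result;
    }
}
```

 With daily collections for 2019-11-10 through 2019-11-12 and an fq range covering a few hours
of 2019-11-11, only the 2019-11-11 collection would be kept, so the other two are never queried.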

> Optimize Queries when query filtering by TRA router.field
> ---------------------------------------------------------
>
>                 Key: SOLR-13759
>                 URL: https://issues.apache.org/jira/browse/SOLR-13759
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: mosh
>            Priority: Minor
>
> We are currently testing TRA using Solr 7.7, with >300 shards in the alias and much
> growth expected in the coming months.
> The "hot" data (in our case, the more recent data) will be stored on stronger nodes (SSD,
> more RAM, etc.).
> The proposal is to optimize queries that filter by date range: we would then query only
> the relevant TRA collections, taking advantage of the TRA mechanism of partitioning data
> by date.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


