lucene-java-user mailing list archives

From Atri Sharma <a...@apache.org>
Subject Re: Sampled Queries -- Use Cases and Feedback
Date Mon, 10 Jun 2019 06:53:38 GMT
Any thoughts on this? I am envisioning applications in machine
learning systems, where the training dataset might be a small sample
of the entire dataset and the user wants scoring to be done only on
that sample.

On Fri, Jun 7, 2019 at 5:45 PM Atri Sharma <atri@apache.org> wrote:
>
> Hi All,
>
> While working on a new Query type, I started thinking about a couple
> of use cases where the documents being scored need not be the entire
> data set, only a sample of it. This can be useful for very large
> datasets, where a query only needs to get a "feel" for the data, and
> for queries over data that is aggregated over time, where a wide
> enough sample is good enough for the user in exchange for improved
> performance. Faceting already has sampling mechanisms, so there are
> ideas to be borrowed from that part.
>
> I have some ideas on introducing a new query type and associated
> semantics to support this functionality from the ground up.
> Specifically, a query type which wraps another query and "feeds"
> offsets to the inner query, along with a limit on the number of hits
> collected. I can go into more detail, but wanted to get some thoughts
> and feedback before delving deeper.
>
> Atri
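
To make the "offsets plus a limit" idea above a bit more concrete, here
is a minimal plain-Java sketch. It is only an illustration under stated
assumptions and does not use any Lucene API: the class name
SampledQuerySketch, the method sample, and the parameters startOffset,
stride and limit are all hypothetical, and the int[] stands in for the
doc IDs the inner query would match. A real implementation would drive
the inner query's iterator rather than materialize every match first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT a Lucene API: models a wrapper that "feeds"
// offsets to an inner query and caps the number of hits collected.
final class SampledQuerySketch {

    /**
     * Picks every stride-th entry of matchingDocIds, starting at
     * startOffset, and stops once limit hits have been collected.
     * matchingDocIds stands in for the inner query's matches (assumption).
     */
    static List<Integer> sample(int[] matchingDocIds, int startOffset, int stride, int limit) {
        if (startOffset < 0 || stride < 1 || limit < 0) {
            throw new IllegalArgumentException("need startOffset >= 0, stride >= 1, limit >= 0");
        }
        List<Integer> sampled = new ArrayList<>();
        // long index so that a very large stride cannot overflow int.
        for (long i = startOffset; i < matchingDocIds.length && sampled.size() < limit; i += stride) {
            sampled.add(matchingDocIds[(int) i]);
        }
        return sampled;
    }

    public static void main(String[] args) {
        int[] matches = {3, 8, 15, 16, 23, 42, 57, 64, 71, 99};
        // Start at the second match, take every third one, at most three hits.
        System.out.println(sample(matches, 1, 3, 3)); // prints [8, 23, 64]
    }
}
```

A fixed stride is only one possible choice; a randomized offset per
segment, or reservoir sampling, would fit the same wrapper shape.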



-- 
Regards,

Atri
Apache Concerted


