lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Tf-Idf for a specific query
Date Sat, 08 Feb 2014 15:49:45 GMT
David:

If you're, say, faceting on fields with lots of unique values, this
will be quite expensive.
No idea whether you can tolerate slower queries or not, just sayin'....

Erick

On Fri, Feb 7, 2014 at 5:35 PM, David Miller <davthehacker@gmail.com> wrote:
> Thanks Mikhail,
>
> It seems that this was what I was looking for. Being new to this, I wasn't
> aware of such a use of facets.
>
> Now I can probably combine the term vectors and facets to fit my scenario.
>
> Regards,
> Dave
>
>
> On Fri, Feb 7, 2014 at 2:43 PM, Mikhail Khludnev <mkhludnev@griddynamics.com> wrote:
>
>> David,
>>
>> I can imagine that "DF for resultset" is facets!
>>
>>
>> On Fri, Feb 7, 2014 at 11:26 PM, David Miller <davthehacker@gmail.com> wrote:
>>
>> > Hi Mikhail,
>> >
>> > The DF seems to be based on the entire document set. What I require is
>> > a DF based on the results of a single query.
>> >
>> > Suppose my Solr query returns a set of 50K documents from a superset of
>> > 10 million documents. I need to calculate the DF based on just those 50K
>> > documents, but currently it seems to be calculated on the entire doc set.
>> >
>> > So, is there any way to get the DF or IDF based only on the docs
>> > returned by the query?
>> >
>> > Regards,
>> > Dave
>> >
>> > On Fri, Feb 7, 2014 at 5:15 AM, Mikhail Khludnev <mkhludnev@griddynamics.com> wrote:
>> >
>> > > Hello Dave,
>> > > you can get DF from http://wiki.apache.org/solr/TermsComponent (invert it
>> > > yourself).
>> > > Then, for a certain term, you can get the number of occurrences per
>> > > document with http://wiki.apache.org/solr/FunctionQuery#tf
>> > >
>> > >
>> > >
>> > > On Fri, Feb 7, 2014 at 3:58 AM, David Miller <davthehacker@gmail.com> wrote:
>> > >
>> > > > Hi Guys..
>> > > >
>> > > > I need to obtain a Tf-idf score from Solr for a certain set of documents.
>> > > > The catch is that I need the IDF (or DF) to be calculated on the
>> > > > documents returned by a specific query, not on the entire corpus.
>> > > >
>> > > > Please give me a hint on whether Solr has this feature, or whether I can
>> > > > use the Lucene API directly to achieve this.
>> > > >
>> > > >
>> > > > Thanks in advance,
>> > > > Dave
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Sincerely yours
>> > > Mikhail Khludnev
>> > > Principal Engineer,
>> > > Grid Dynamics
>> > >
>> > > <http://www.griddynamics.com>
>> > >  <mkhludnev@griddynamics.com>
>> > >
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Principal Engineer,
>> Grid Dynamics
>>
>> <http://www.griddynamics.com>
>>  <mkhludnev@griddynamics.com>
>>
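
For anyone finding this thread later, here is a rough sketch of the facet-based approach discussed above. It is not code from anyone in the thread. A facet count for a term is the number of documents in the result set that contain it, so it can stand in for the result-set DF, and numFound stands in for the result-set size. The helper names, the field name "text", and the plain log(N/df) formula are assumptions made for illustration; Lucene's own similarity uses a different IDF variant. The parser assumes Solr's default flat JSON facet layout, [term, count, term, count, ...].

```python
import math
from urllib.parse import urlencode


def facet_params(query, field, limit=100):
    """Build Solr request parameters that facet on `field` over the hits of `query`.

    `limit` caps the number of terms returned; as noted in the thread,
    faceting on a field with many unique values can be expensive.
    """
    return urlencode({
        "q": query,
        "rows": 0,            # only counts are needed, not the documents
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "facet.mincount": 1,  # skip terms that never occur in the result set
    })


def result_set_idf(facet_list, num_found):
    """Convert a flat [term, count, ...] facet list into {term: idf}.

    Uses idf = log(N / df), with N = num_found (size of the result set)
    and df = facet count (documents in the result set containing the term).
    """
    terms = facet_list[0::2]
    counts = facet_list[1::2]
    return {t: math.log(num_found / df) for t, df in zip(terms, counts) if df > 0}


def tf_idf(tf, idf):
    """Combine a per-document term frequency with the result-set IDF."""
    return tf * idf
```

The per-document term frequency passed to tf_idf would come from elsewhere, for example term vectors or the tf function query linked earlier in the thread. In a real response the facet list sits under facet_counts -> facet_fields -> <field name>, and numFound under response.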
