lucene-java-user mailing list archives

From prasenjit mukherjee <prasen....@gmail.com>
Subject Re: reusing the term-frequency count while indexing
Date Tue, 25 Oct 2011 09:19:53 GMT
On Tue, Oct 25, 2011 at 1:17 PM, Simon Willnauer
<simon.willnauer@googlemail.com> wrote:
> On Tue, Oct 25, 2011 at 5:08 AM, prasenjit mukherjee
> <prasen.bea@gmail.com> wrote:
>> That's exactly what I was trying to avoid :(
>>
>> I can afford to do that during indexing time, but it will be
>> time-consuming to do that at search time.
>
> Huh? I don't understand. If you provide the terms at indexing time,
> Lucene keeps track of the term frequency etc. Why would you want to do
> this at search time?

At search time I get input of the following form (for a single field):
"solr:3 rocks:2 apache:1". For this I have to build the Lucene query
as 'solr solr solr rocks rocks apache'. This approach becomes
cumbersome when the frequencies are large.

Is there a better approach than this?
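
To make the question concrete, here is a minimal sketch in plain Java (no
Lucene dependency) of the two query strings involved. The class and method
names (TermFreqQuery, parse, expand, toBoostedQuery) are hypothetical and
only for illustration. expand() builds the repeated-term query described
above; toBoostedQuery() uses the classic query parser's term^boost syntax
as a more compact form. That the boosted form scores the same as the
repeated form is an assumption, not something confirmed in this thread;
depending on the similarity in use the scores may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TermFreqQuery {

    // Parse "solr:3 rocks:2 apache:1" into an insertion-ordered term -> frequency map.
    // Frequencies of a term that appears more than once are summed.
    static Map<String, Integer> parse(String input) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (String pair : input.trim().split("\\s+")) {
            if (pair.isEmpty()) {
                continue; // empty input
            }
            int colon = pair.lastIndexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("expected term:freq, got: " + pair);
            }
            int freq = Integer.parseInt(pair.substring(colon + 1));
            if (freq < 1) {
                throw new IllegalArgumentException("frequency must be >= 1: " + pair);
            }
            freqs.merge(pair.substring(0, colon), freq, Integer::sum);
        }
        return freqs;
    }

    // The approach from the message: repeat each term freq times.
    static String expand(String input) {
        List<String> tokens = new ArrayList<>();
        for (Map.Entry<String, Integer> e : parse(input).entrySet()) {
            for (int i = 0; i < e.getValue(); i++) {
                tokens.add(e.getKey());
            }
        }
        return String.join(" ", tokens);
    }

    // Compact alternative: one clause per term, with the frequency as a boost.
    // Terms are NOT escaped here; query-parser special characters would need escaping.
    static String toBoostedQuery(String input) {
        List<String> clauses = new ArrayList<>();
        for (Map.Entry<String, Integer> e : parse(input).entrySet()) {
            clauses.add(e.getKey() + "^" + e.getValue());
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        String input = "solr:3 rocks:2 apache:1";
        System.out.println(expand(input));         // solr solr solr rocks rocks apache
        System.out.println(toBoostedQuery(input)); // solr^3 rocks^2 apache^1
    }
}
```

The boosted string stays the same length no matter how large the
frequencies get, which is the part that makes the expanded form cumbersome.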

>
> simon
>>
>> On Mon, Oct 24, 2011 at 1:05 PM, Simon Willnauer
>> <simon.willnauer@googlemail.com> wrote:
>>> So you are saying you have (uniqueTerm, freq) tuples and you want to
>>> make Lucene use them directly? I think the easiest way is to write a
>>> simple TokenFilter that emits each term X times, where X is the term
>>> frequency. There is no easy way to pass these tuples to Lucene
>>> directly.
>>>
>>> simon
>>>
>>> On Mon, Oct 24, 2011 at 3:28 AM, prasenjit mukherjee
>>> <prasen.bea@gmail.com> wrote:
>>>> Can you tell me how I can feed the Lucene index using the term
>>>> frequency directly?
>>>>
>>>> I am actually getting the documents along with their term frequencies
>>>> and don't want to write any additional code to expand them.
>>>>
>>>>
>>>> On 10/23/11, ppp c <peter.c.eric@gmail.com> wrote:
>>>>> Of course, it can be reused.
>>>>> But from my point of view, it's meaningless, since the analysis process
>>>>> has to be performed anyway to collect data such as prox, offsets,
>>>>> synonyms, payloads and so on.
>>>>>
>>>>> On Sun, Oct 23, 2011 at 11:22 PM, prasenjit mukherjee
>>>>> <prasen.bea@gmail.com> wrote:
>>>>>
>>>>>> I already have the term-frequency count for all the terms in a
>>>>>> document. Is there a way I can reuse that info while indexing? I
>>>>>> would like to use Solr for this.
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Sent from my mobile device
>>>>