lucene-java-user mailing list archives

From Neil Bacon <neil.ba...@nicta.com.au>
Subject Re: AW: Analyzing suggester for many fields
Date Thu, 12 Jun 2014 23:48:01 GMT
Hi Clemens,
Goutham's code is at: https://github.com/gtholpadi/MyLucene
I'm doing something similar, adding weighting as some function of doc 
freq (and using Scala).
Cheers,
Neil
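
For readers who cannot reach the links in this thread, here is a minimal sketch of the "wrap/delegate each field in turn" idea discussed below. It is not Goutham's code and it has no Lucene dependency: TermSource, ListTermSource and ChainedTermSource are hypothetical stand-ins that only mimic the null-terminated next() style of Lucene's term iterators, and the per-source weight is a placeholder for whatever function of doc freq you prefer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a null-terminated term iterator (not a Lucene type). */
interface TermSource {
    /** Returns the next term, or null when this source is exhausted. */
    String next();

    /** Weight of the term most recently returned by next(). */
    long weight();
}

/** Hypothetical per-field source backed by an in-memory list with one fixed weight. */
final class ListTermSource implements TermSource {
    private final Iterator<String> terms;
    private final long weight;

    ListTermSource(List<String> terms, long weight) {
        this.terms = terms.iterator();
        this.weight = weight;
    }

    public String next() {
        return terms.hasNext() ? terms.next() : null;
    }

    public long weight() {
        return weight;
    }
}

/** Delegates to each source in turn, so no field overwrites another. */
final class ChainedTermSource implements TermSource {
    private final Iterator<TermSource> sources;
    private TermSource current;

    ChainedTermSource(List<TermSource> sources) {
        this.sources = sources.iterator();
        this.current = this.sources.hasNext() ? this.sources.next() : null;
    }

    public String next() {
        // Move on to the next source whenever the current one runs dry.
        while (current != null) {
            String term = current.next();
            if (term != null) {
                return term;
            }
            current = sources.hasNext() ? sources.next() : null;
        }
        return null;
    }

    public long weight() {
        if (current == null) {
            throw new IllegalStateException("no current term");
        }
        return current.weight();
    }

    /** Collects "term:weight" strings until the source returns null. */
    static List<String> drain(TermSource source) {
        List<String> out = new ArrayList<>();
        for (String t = source.next(); t != null; t = source.next()) {
            out.add(t + ":" + source.weight());
        }
        return out;
    }

    public static void main(String[] args) {
        TermSource title = new ListTermSource(Arrays.asList("lucene", "suggest"), 2L);
        TermSource body = new ListTermSource(Arrays.asList("analyzer"), 1L);
        System.out.println(drain(new ChainedTermSource(Arrays.asList(title, body))));
    }
}
```

In the loop from the original question, the same shape would mean collecting one wrapper per field (and per leaf reader) into a list instead of assigning each to tfit, then handing a single chained iterator to build(). Note that this sketch does not merge duplicates: a term that occurs in two fields is returned twice.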

On 13/06/14 00:19, Clemens Wyss DEV wrote:
> enter InputIteratorWrapper ;) i.e. new InputIteratorWrapper(tfit )
>
> -----Original Message-----
> From: Clemens Wyss DEV [mailto:clemensdev@mysign.ch]
> Sent: Thursday, 12 June 2014 16:01
> To: java-user@lucene.apache.org
> Subject: AW: Analyzing suggester for many fields
>
> I am trying to rebuild the multi-field TermFreqIterator based on Goutham's initial code:
>
> 		TermFreqIteratorWrapper tfit = null;
> 		for (AtomicReaderContext readerc : readercs)
> 		{
> 		     Fields fields = readerc.reader().fields();
> 		     for (String field : fields)
> 		     {
> 		         TermsEnum termsEnum = fields.terms( field ).iterator(null);
> 		         tfit = new TermFreqIteratorWrapper(termsEnum);  // OVERWRITE!
> 		     }
> 		 }
> 		AnalyzingSuggester suggr = new AnalyzingSuggester( IndexManager.getIndexingAnalyzer( locale ) );
> 		suggr.build( tfit );
>
> But AnalyzingSuggester#build requires an InputIterator, not a TermFreqIteratorWrapper/BytesRefIterator. Did this change from 4.4 to 4.7.2?
>
> -----Original Message-----
> From: Clemens Wyss DEV [mailto:clemensdev@mysign.ch]
> Sent: Wednesday, 11 June 2014 12:57
> To: java-user@lucene.apache.org
> Subject: AW: Analyzing suggester for many fields
>
> Unfortunately, the link provided by Goutham is no longer valid. Does anybody still have the code?
>
> -----Original Message-----
> From: Goutham Tholpadi [mailto:gtholpadi@gmail.com]
> Sent: Thursday, 29 August 2013 06:21
> To: java-user@lucene.apache.org
> Subject: Re: Analyzing suggester for many fields
>
> I implemented a simple TermFreqIterator for wrapping iterators from multiple fields, or from multiple AtomicReaders under an IndexReader.
> It seems to work for me. In case anyone else wants to use a quick fix, here it is: http://pastebin.com/Hm2zW9xR
>
> Goutham Tholpadi
> https://sites.google.com/site/gtholpadi/
>
>
> On Tue, Aug 27, 2013 at 10:05 PM, Goutham Tholpadi <gtholpadi@gmail.com> wrote:
>> Field-specific suggestions are a good idea that I had not thought
>> about. Thanks, Mike, for the answer and the suggestion! :)
>>
>> Goutham Tholpadi
>> https://sites.google.com/site/gtholpadi/
>>
>>
>> On Tue, Aug 27, 2013 at 9:28 PM, Michael McCandless
>> <lucene@mikemccandless.com> wrote:
>>> I think you should just implement your own TermFreqIterator, that
>>> wraps/delegates each of the N fields in turn?
>>>
>>> But, at suggestion time, do you need per-field suggestions?  If so,
>>> maybe you should build a separate suggester for each field...
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>>
>>> On Tue, Aug 27, 2013 at 9:19 AM, Goutham Tholpadi <gtholpadi@gmail.com> wrote:
>>>> Lucene Version : 4.4.0
>>>>
>>>> SITUATION :
>>>>
>>>> I need to suggest terms to the user based on a query prefix typed in
>>>> a textbox. The terms suggested should exist in the index that will
>>>> be searched. I want to suggest terms from more than one field in the
>>>> index.
>>>>
>>>> I am trying to use
>>>> org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester for
>>>> this.
>>>>
>>>> PROBLEM :
>>>>
>>>> Given the path to the index directory, I proceed as follows.
>>>> ---------------------------------------------------------------
>>>> IndexReader ireader = DirectoryReader.open(FSDirectory.open(new File(indexPath)));
>>>> List<AtomicReaderContext> readercs = ireader.leaves();
>>>> for (AtomicReaderContext readerc : readercs) {
>>>>      Fields fields = readerc.reader().fields();
>>>>      for (String field : fields) {
>>>>          TermsEnum termsEnum = fields.terms(field).iterator(null);
>>>>          tfit = new TermFreqIteratorWrapper(termsEnum);  // OVERWRITE!
>>>>      }
>>>> }
>>>> AnalyzingSuggester suggr = new AnalyzingSuggester(analyzer);
>>>> suggr.build(tfit);
>>>> ---------------------------------------------------------------
>>>>
>>>> In the line marked "OVERWRITE!", I am overwriting the term list from
>>>> one field with the term list from the next field. I want to
>>>> aggregate term lists obtained from different fields.
>>>>
>>>> I could not find a way to instantiate TermFreqIterator without a
>>>> BytesRefIterator. The only way to get a BytesRefIterator seems to be
>>>> in the form of TermsEnum. Neither can be changed (i.e. appended to)
>>>> after instantiation.
>>>>
>>>> How can I aggregate the TermsEnum lists from different fields so
>>>> that I can pass them together in one shot to build()? Alternatively,
>>>> is there a way to add term lists to the suggester after calling
>>>> build() once?
>>>>
>>>> Thanks!
>>>>
>>>> --------------------------------------------------------------------
>>>> - To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>



