lucene-java-user mailing list archives

From Nicolas Roduit <nicolas.rod...@gmail.com>
Subject Re: Handle expression in the index
Date Fri, 08 Feb 2013 07:33:53 GMT
Thanks for your prompt reply. I've implemented the prospective search
and it works correctly now.

Nicolas

On 06.02.13 10:13, Nicolas Roduit wrote:
> Hi Nicolas,
> If I understand correctly, what you are describing is that your tag field
> will contain Lucene query syntax: one word = exact match, two words "xx
> yy" = phrase match, and so on.
>
> There is a search method called "prospective search" that fits this
> situation.
>
> You can try this procedure:
>
> 1. Run a query searching for a tag (e.g. "first time"). The ScoreDocs you
> get from the search will contain the "potential results" (potential, since
> they will also include tags with only one word from the phrase, "time" or
> "first", and not necessarily both).
>
> 2. Iterate over the ScoreDocs and:
> 2.1 Create a single-document in-memory index that holds your original
> search term ("first time"). Use the MemoryIndex object; it is very fast
> and well suited to this type of search.
> 2.2 Create a Lucene query out of the tag you are currently iterating on.
> 2.3 Run the query created in step 2.2 against the index created in
> step 2.1. If you get a hit, the tag matches your search
> term and you can collect the text from that doc.
>
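The two-phase procedure above can be illustrated with a minimal sketch. It is a plain-Java stand-in, not Lucene code and not the implementation either poster used: the class `ProspectiveSearchSketch`, its methods, and the whitespace-style tokenizer are all hypothetical, and a token-sequence check takes the place of building a MemoryIndex and running a parsed phrase query against it. Stemming and synonyms from the custom analyzer are not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java stand-in for the two-phase procedure; no Lucene classes are used.
class ProspectiveSearchSketch {

    // Hypothetical tokenizer: lowercase and split on non-alphanumeric characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Step 1: a tag is a "potential result" if it shares at least one token
    // with the search term.
    static boolean isCandidate(List<String> tagTokens, List<String> searchTokens) {
        return !Collections.disjoint(tagTokens, searchTokens);
    }

    // Steps 2.1 to 2.3: treat the tag as a phrase query and run it against the
    // search term, which plays the role of the single in-memory document.
    static boolean tagMatches(List<String> tagTokens, List<String> searchTokens) {
        return !tagTokens.isEmpty()
                && Collections.indexOfSubList(searchTokens, tagTokens) >= 0;
    }

    // Returns the IDs of documents that have at least one tag fully matched
    // by the search term.
    static List<String> search(Map<String, List<String>> tagsByDocId, String searchTerm) {
        List<String> searchTokens = tokenize(searchTerm);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, List<String>> doc : tagsByDocId.entrySet()) {
            for (String tag : doc.getValue()) {
                List<String> tagTokens = tokenize(tag);
                if (isCandidate(tagTokens, searchTokens)
                        && tagMatches(tagTokens, searchTokens)) {
                    hits.add(doc.getKey());
                    break; // one matching tag is enough for this document
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, List<String>> tagsByDocId = new LinkedHashMap<>();
        tagsByDocId.put("doc1", Arrays.asList("first time"));
        tagsByDocId.put("doc2", Arrays.asList("time"));

        System.out.println(search(tagsByDocId, "first"));      // prints []
        System.out.println(search(tagsByDocId, "first time")); // prints [doc1, doc2]
    }
}
```

In this sketch a query for "first" no longer returns the document tagged "first time", while a query for "first time" still returns documents tagged with the single word "time", because a one-word tag behaves like a one-word query against the search text.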
> The above procedure works and is quite fast (depending on how many
> "potential results" you get from your first search).
>
> You can also read this blog post, which has an example:
>
> http://www.sajalkayan.com/prospective-search-using-python.html
>
> If someone has a better approach to this issue, I would love to hear it
> as well.
>
> Alon
>
>
>
> On Wed, Feb 6, 2013 at 11:13 AM, Nicolas Roduit <nicolas.roduit@gmail.com> wrote:
>
> > I'm starting with Lucene 4 and have built my own analyzer with stemming
> > and synonyms. This works perfectly.
> >
> > I built a Lucene index with several documents (with an ID) containing a
> > text (with TextField)  and a list of words or expressions related to the
> > text (a kind of tag). Everything is OK when I make a query containing one
> > of these words (tags): I find the related text. How can I proceed if I want
> > to have a tag that contains several words (e.g. "first time")? This
> > expression must not be split into two words. The problem is that when I make a
> > query with the word "first", I get the document in the hits; I would
> > like to get the hit only when searching for "first time".
> >
> > Can someone give me a clue?
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
>
>

