lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Claudia Grieco" <gri...@crmpa.unisa.it>
Subject R: Retrieve found keywords from document
Date Thu, 25 Nov 2010 16:48:03 GMT
Thanks a lot.
I used the lucene analyzer to parse the profile and everything works :)

-----Messaggio originale-----
Da: Ian Lea [mailto:ian.lea@gmail.com] 
Inviato: giovedì 25 novembre 2010 14.52
A: java-user@lucene.apache.org
Oggetto: Re: Retrieve found keywords from document

You could parse the output from the lucene analyzer that you are using
to get hold of a list of terms and pick the ones that are hobbies.  Or
do it outside lucene using whatever string parsing technique you like.

Or take a look at the recent thread on this list on a similar topic:
"High frequency term for the searched query".


--
Ian.


On Thu, Nov 25, 2010 at 1:27 PM, Claudia Grieco <grieco@crmpa.unisa.it>
wrote:
> What I call "profile" is free text (extracted from a pdf) and not the
result
> of the user listing hobbies in a form
> So to store hobbies in a field called "hobbies" I have to extract hobbies
> from text first...is it possible to do it using Lucene?
>
> -----Messaggio originale-----
> Da: Ian Lea [mailto:ian.lea@gmail.com]
> Inviato: giovedì 25 novembre 2010 13.01
> A: java-user@lucene.apache.org
> Oggetto: Re: Retrieve found keywords from document
>
> Can't you just store the hobbies as standard stored fields
> (Field.Store.YES), or as a single field, call doc.get("hobbies") and
> do what you want with them?
>
> This sounds rather like faceting - if so you might want to consider
> using Solr.  http://wiki.apache.org/solr/SolrFacetingOverview
>
>
> --
> Ian.
>
> On Thu, Nov 25, 2010 at 11:48 AM, Claudia Grieco <grieco@crmpa.unisa.it>
> wrote:
>> Hi guys,
>>
>> I have this problem:
>>
>> I'm using Lucene to create a search engine on people profiles.
>>
>> I have a set of hobbies (let's say {"reading" , "singing"} for example)
>  and
>> I want to find people who have at least one of these hobbies AND which of
>> these hobbies they have.
>>
>> Currently I search for each one of these hobbies (ex, one search for
>> reading, one search for singing) but since the list of hobbies is very
> long
>> (200+) I'd like to do the following:
>>
>>
>>
>> 1)Do ONE search that finds all the documents who have at least an hobby
in
>> the text ( this is easily accomplished using BooleanQuery)
>>
>> 2)For each document, retrieve the keywords found.
>>
>>
>>
>> Do you have any ideas on how to do n# 2?
>>
>> Thank you
>>
>> Claudia
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message