lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Multiple Keywords/Keyphrases fields
Date Sat, 12 Feb 2005 22:09:15 GMT
The real question to answer is what types of queries you're planning on 
making.  Rather than look at it from indexing forward, consider it from 
searching backwards.

How will users query using those keyword phrases?

	Erik
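
A rough sketch of the "one entry per comma-separated phrase" idea from the quoted message below, in case it helps frame the question. It is plain Java with no Lucene dependency; the class name KeywordSplitter and the method splitKeywords are made up for illustration and are not part of any Lucene API. It only shows how the raw Keywords string could be broken into phrases. Whether each phrase should then be analyzed or indexed as a single term still depends on how users will query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: turns the raw comma-separated
// Keywords string into one cleaned-up phrase per entry.
class KeywordSplitter {

    static List<String> splitKeywords(String raw) {
        List<String> phrases = new ArrayList<>();
        for (String part : raw.split(",")) {
            // Trim the ends and collapse internal whitespace (tabs, newlines).
            String phrase = part.trim().replaceAll("\\s+", " ");
            if (!phrase.isEmpty()) {
                phrases.add(phrase);
            }
        }
        return phrases;
    }

    public static void main(String[] args) {
        String raw = "temporal infomax, finite state automata, Markov chains,\n"
                   + "\tconditional entropy, neural information processing";
        for (String phrase : splitKeywords(raw)) {
            // With the API used in this thread, this is where a call such as
            // doc.add(Field.Text("Keywords", phrase)) would go (assumption:
            // one field instance per phrase, as the quoted message proposes).
            System.out.println(phrase);
        }
    }
}
```

Each phrase that comes out of the loop would become its own "Keywords" field instance on the document.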

On Feb 12, 2005, at 3:08 PM, Owen Densmore wrote:

> I'm getting a bit more serious about the final form of our lucene 
> index.  Each document has DocNumber, Authors, Title, Abstract, and 
> Keywords.  By Keywords, I mean a comma separated list, each entry 
> having possibly many terms in a phrase like:
> 	temporal infomax, finite state automata, Markov chains,
> 	conditional entropy, neural information processing
>
> I presume I should be using a field "Keywords" which has many 
> "entries" or "instances" per document (one per comma separated 
> phrase).  But I'm not sure the right way to handle all this.  My 
> assumption is that I should analyze them individually, just as we do 
> for free text (the Abstract, for example), thus in the example above 
> having 5 entries of the nature
> 	doc.add(Field.Text("Keywords", "finite state automata"));
> etc, analyzing them because these are author-supplied strings with no 
> canonical form.
>
> For guidance, I looked in the archive and found the attached email, 
> but I didn't see the answer.  (I'm not concerned about the dups, I 
> presume that is equivalent to a boost of some sort) Does this seem 
> right?
>
> Thanks once again.
>
> Owen
>
>> From: lucene@nitwit.de <lucene@nitwit.de>
>> Subject: Multiple equal Fields?
>> Date: Tue, 17 Feb 2004 12:47:58 +0100
>>
>> Hi!
>> What happens if I do this:
>>
>> doc.add(Field.Text("foo", "bar"));
>> doc.add(Field.Text("foo", "blah"));
>>
>> Is there a field "foo" with value "blah" or are there two "foo"s 
>> (actually not possible) or is there one "foo" with the values "bar" 
>> and "blah"?
>>
>> And what happens in this case:
>>
>> doc.add(Field.Text("foo", "bar"));
>> doc.add(Field.Text("foo", "bar"));
>> doc.add(Field.Text("foo", "bar"));
>>
>> Does lucene store this only once?
>>
>> Timo
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org

