lucene-java-user mailing list archives

From KEGan <khoon.ee....@gmail.com>
Subject Re: How to combine multiple fields to a single field for indexing
Date Sat, 26 Aug 2006 09:11:43 GMT
Erik,

"Given the position increment gap between instances of same-named
fields that is now part of Lucene, I recommend using multiple field
instances instead."

Did you mean you recommend "NOT" using multiple field instances?

If we want to run queries like "name:John" or boost individual fields, then we
have to use multiple field instances, right?


On 8/24/06, Erik Hatcher <erik@ehatchersolutions.com> wrote:
>
> Yeah, I used a cruder form by appending all the text together into a
> single string with a space separator in that LIA example.
>
> Given the position increment gap between instances of same-named
> fields that is now part of Lucene, I recommend using multiple field
> instances instead.
>
>        Erik
>
>
>
> On Aug 24, 2006, at 3:05 AM, Gopikrishnan Subramani wrote:
> > Erik has used a space as the field separator. Maybe you can use a
> > different field separator that your analyzer won't eat up, so that it
> > will change the token position by 1.
> >
> > Gopi
> >
> > On 8/24/06, KEGan <khoon.ee.gan@gmail.com> wrote:
> >>
> >> Erik,
> >>
> >> What is generally the reason for indexing both the individual fields
> >> and the general-purpose "content" field?
> >>
> >> Also, if we search in the general-purpose "content" field, wouldn't
> >> this problem occur? Let's say we have 2 fields with the following values:
> >>
> >> name : John Smith
> >> food  : subway sandwich
> >>
> >> So the general-purpose "content" field would have the following value:
> >>
> >> John Smith subway sandwich
> >>
> >> Hence, if the user searches for "smith subway" (with quotation marks),
> >> the said document will be returned. On the other hand, if both fields
> >> were indexed separately, this document would not be returned, since
> >> there is no field that contains the value "smith subway".
> >>
> >> How do we go about this problem?
> >>
> >>
> >> On 8/24/06, Erik Hatcher <erik@ehatchersolutions.com> wrote:
> >> >
> >> >
> >> > On Aug 23, 2006, at 11:36 AM, Suba Suresh wrote:
> >> > > In "Lucene In Action" book it says it is better practice to combine
> >> > > two fields into one field and index it than use the
> >> > > MultiFieldQueryParser. Do I initially index both the fields and
> >> > > then index them again together? When I index them together do I
> >> > > index the fieldnames or values? Can someone give me an example of
> >> > > how to do it?
> >> >
> >> > What I do is simply index all the fields individually that need to be
> >> > searchable or just stored, but also index a general-purpose
> >> > "contents" field with all of that same text.
> >> >
> >> > You can add multiple fields of the same name to a document, making it
> >> > easy to just keep appending to a "contents" field for a document.
> >> > You can see how this is done in the Lucene in Action code in the
> >> > TestDataDocumentHandler.java - however I took a cruder approach and
> >> > appended the fields together with a space in between them rather than
> >> > using the multiple valued field approach.  Either technique will work
> >> > just fine.
> >> >
> >> >        Erik
> >> >
> >> >
> >> >
> >> ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >
> >> >
> >>
> >>
>
>
>
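
To make the "smith subway" concern in this thread concrete, here is a minimal sketch in plain Java. It is a toy model of token positions, not Lucene's implementation: the class PositionGapSketch and its methods indexField and matchesPhrase are hypothetical names, whitespace splitting stands in for a real analyzer, and the gap of 100 is an arbitrary choice. As far as I recall, in Lucene itself the gap comes from Analyzer.getPositionIncrementGap(String), which defaults to 0, so an analyzer would have to override it before same-named field instances stop matching across their boundary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical, not Lucene code): one document, one field name,
// several values added under that name, as with a multi-valued "contents" field.
class PositionGapSketch {

    // Assigns a position to every token. Between consecutive values the
    // position counter jumps ahead by `gap`; gap = 0 behaves like joining
    // the values with a space.
    static Map<String, List<Integer>> indexField(List<String> values, int gap) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int pos = 0;
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                pos += gap; // extra distance between field instances
            }
            for (String token : values.get(i).toLowerCase().split("\\s+")) {
                positions.computeIfAbsent(token, k -> new ArrayList<>()).add(pos);
                pos++;
            }
        }
        return positions;
    }

    // Exact phrase match (no slop): every term must sit at consecutive positions.
    static boolean matchesPhrase(Map<String, List<Integer>> positions, String phrase) {
        String[] terms = phrase.toLowerCase().split("\\s+");
        List<Integer> starts =
                positions.getOrDefault(terms[0], Collections.<Integer>emptyList());
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                ok = positions
                        .getOrDefault(terms[i], Collections.<Integer>emptyList())
                        .contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("John Smith", "subway sandwich");

        // No gap: "smith" and "subway" are adjacent, so the phrase matches.
        System.out.println(matchesPhrase(indexField(values, 0), "smith subway"));   // true

        // Gap of 100: the two values are far apart, so the phrase does not match.
        System.out.println(matchesPhrase(indexField(values, 100), "smith subway")); // false

        // Phrases inside a single value still match.
        System.out.println(matchesPhrase(indexField(values, 100), "john smith"));   // true
    }
}
```

With a gap of 0 the two values behave like the space-joined string, which is why the phrase matches across the boundary. Any gap larger than the phrase slop you allow keeps the values apart while the text stays searchable in one general-purpose field.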
