lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Multi-value (complex) field indexing
Date Wed, 30 Dec 2009 16:33:51 GMT
You'll have one problem if you can't return a different increment gap:
you'll match across rows.

Say you index row 1 with "aaa", "bbb", "ccc", then row 2 with
"ddd", "eee", "fff". If you just add multiple rows to a single document,
that document would match the phrase "ccc ddd".
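
To make the cross-row match concrete, here is a minimal plain-Java sketch that only simulates how token positions are laid out with and without a gap between rows. It is not Lucene code: the class `IncrementGapSketch` and the helpers `index` and `matchesPhrase` are hypothetical names made up for illustration, and the gap value of 100 is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simulation only (hypothetical names, no Lucene dependency).
class IncrementGapSketch {

    // Assigns a position to every token; `gap` extra positions are
    // skipped between consecutive rows (values of the same field).
    static Map<String, List<Integer>> index(List<String[]> rows, int gap) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int next = 0;
        for (int r = 0; r < rows.size(); r++) {
            if (r > 0) {
                next += gap; // the simulated position increment gap
            }
            for (String token : rows.get(r)) {
                positions.computeIfAbsent(token, k -> new ArrayList<>()).add(next++);
            }
        }
        return positions;
    }

    // Exact phrase match: the terms must occupy consecutive positions.
    static boolean matchesPhrase(Map<String, List<Integer>> positions, String... terms) {
        List<Integer> none = Collections.emptyList();
        for (int start : positions.getOrDefault(terms[0], none)) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                ok = positions.getOrDefault(terms[i], none).contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"aaa", "bbb", "ccc"},
                new String[] {"ddd", "eee", "fff"});

        // Gap 0: "ccc" is at position 2 and "ddd" at 3, so the phrase matches.
        System.out.println(matchesPhrase(index(rows, 0), "ccc", "ddd"));   // true
        // Gap 100: "ddd" moves to position 103, so the phrase no longer matches.
        System.out.println(matchesPhrase(index(rows, 100), "ccc", "ddd")); // false
        // Phrases inside a single row are unaffected by the gap.
        System.out.println(matchesPhrase(index(rows, 100), "bbb", "ccc")); // true
    }
}
```

In Lucene itself the gap comes from the analyzer's getPositionIncrementGap, discussed below; the sketch only mirrors the position arithmetic.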

I don't understand why you are able to index things but "have no
access to analyzer configuration", unless you're talking about an
index that's already been built. If you are indexing documents, it
seems to me that you *have* to be able to derive from one of the
standard analyzers and override getPositionIncrementGap.
That's all there is to providing a different value here.

FWIW
Erick

On Tue, Dec 29, 2009 at 3:55 AM, Leonid M. <leonidms@gmail.com> wrote:

> You got me, thanks a lot.
> This is exactly what I was trying to ask (meaning: find these values within
> row number 2).
>
> I'm afraid I won't be able to proceed because I have no access to the
> analyzer configuration (the system, JIRA 4.0, uses a hardcoded,
> pre-configured set of default analyzers).
>
> Thanks a lot, you provided a clear and brief answer to my question.
> --
> Best regards,
> Leonids Maslovs
>
>
>
> On Mon, Dec 28, 2009 at 10:51 PM, Erick Erickson <erickerickson@gmail.com> wrote:
>
> > I'm not following entirely here, but multi-valued fields are supported.
> > Something like (bad pseudo-code here)
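> > Document doc = new Document();
> > // one Field instance per row, all under the same field name
> > doc.add(new Field("rows", row1Text, Field.Store.YES, Field.Index.ANALYZED));
> > doc.add(new Field("rows", row2Text, Field.Store.YES, Field.Index.ANALYZED));
> > indexWriter.addDocument(doc);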
> >
> > If your analyzer implements getPositionIncrementGap, you can keep your
> > rows separate (search the mail archives for getPositionIncrementGap for
> > more explanation).
> >
> > Then, if you're searching with a proximity (or phrase slop) less than your
> > increment gap, you'll only get matches within a row. You wouldn't get a
> > search "against a particular row" if, by that phrase, you meant "only look
> > in row 2". If you mean "only match the document if the phrase is in *some*
> > row in the doc", it should work.
> >
> > The SpanNear query should work, as should regular phrase queries.
> >
> > If this is off base, could you provide some more examples?
> >
> > HTH
> > Erick
> >
> >
> >
> > On Mon, Dec 28, 2009 at 1:08 PM, Leonid M. <leonidms@gmail.com> wrote:
> >
> > > *Problem description*
> > >
> > >   - I have a complex multi-value field: each value consists of several
> > >   rows.
> > >   - Each row consists of several cells/items.
> > >
> > >
> > > I want to be able to match those issues which have a *row* with
> > > cellA="AAA" and cellB="BBB". Searching across the whole table (meaning
> > > any row with cellA="AAA" and any row with cellB="BBB") is something I
> > > understand and hopefully could easily implement by having different
> > > FieldIndexers for each column.
> > >
> > > *Example:*
> > >
> > > *Field value:*
> > > *Row 1: [AAA] [BBB] [XXX]*
> > > *Row 2: [CCC] [BBB] [XXX]*
> > >
> > > I would like to run a [CCC][BBB] query to look for rows containing these
> > > values, and in this case I will get an empty result set, since no such
> > > field is present.
> > >
> > > *Question*
> > > Is there any option to index / look for particular row only?
> > >
> > > I understand that I could make each row a separate document, but that
> > > doesn't suit me (it's a 3rd-party system and I have no access to new
> > > document creation; all I can do is extend the system's indexing mechanism).
> > >
> > > So, could I somehow index multiple rows within one document and
> > > construct a Lucene search against some particular *row*?
> > >
> > > --
> > > Best regards,
> > > Leonids Malovs
> > >
> >
>
