lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Indexing and storing Long fields
Date Thu, 28 Jul 2016 15:59:30 GMT
OK, sorry, you cannot change how a field is indexed for the same field
name across different documents in the same index.

Lucene will "downgrade" that field to the lowest settings, e.g. "docs, no
positions" in your case.
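
A toy sketch of that downgrade rule in plain Java may make the effect easier to see. It is not Lucene code: the class, the enum and the method names are hypothetical stand-ins, and the only assumption taken from the thread is that the least detailed setting seen for a field name wins for the whole index.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldOptionsDowngrade {
    /** Hypothetical stand-in for index options, ordered from least to most detailed. */
    public enum Options { DOCS_ONLY, DOCS_AND_FREQS, DOCS_AND_FREQS_AND_POSITIONS }

    /** One entry per field name, shared by every document in the (toy) index. */
    private final Map<String, Options> fieldOptions = new LinkedHashMap<>();

    /** Registers a field occurrence; the less detailed of old and new settings is kept. */
    public void addField(String name, Options requested) {
        fieldOptions.merge(name, requested,
                (current, incoming) -> current.compareTo(incoming) <= 0 ? current : incoming);
    }

    /** Returns the effective settings for a field name, or null if it was never added. */
    public Options optionsFor(String name) {
        return fieldOptions.get(name);
    }

    /** Phrase queries need position data, so only the most detailed setting qualifies here. */
    public boolean supportsPhraseQuery(String name) {
        return fieldOptions.get(name) == Options.DOCS_AND_FREQS_AND_POSITIONS;
    }

    public static void main(String[] args) {
        FieldOptionsDowngrade index = new FieldOptionsDowngrade();
        // Document 1: "comp" analyzed, with positions.
        index.addField("comp", Options.DOCS_AND_FREQS_AND_POSITIONS);
        // Document 2: "comp" as a single un-analyzed token, docs only.
        index.addField("comp", Options.DOCS_ONLY);
        System.out.println("comp -> " + index.optionsFor("comp")
                + ", phrase queries possible: " + index.supportsPhraseQuery("comp"));
    }
}
```

In this model, once one document adds "comp" with docs-only settings, no later document can bring positions back for that field name, which matches the "indexed without position data" error quoted below.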

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jul 28, 2016 at 9:31 AM, Kumaran Ramasubramanian <kums.134@gmail.com> wrote:

> Hi Mike,
>
> For your information, I am using Lucene 4.10.4. Am I missing anything?
>
>
>
> --
> Kumaran R
>
>
>
>
> On Wed, Jul 27, 2016 at 1:52 AM, Kumaran Ramasubramanian <kums.134@gmail.com> wrote:
>
> >
> > Hi Mike,
> >
> > 1. If we index one field as both analyzed and not analyzed under the same
> > name, phrase queries stop working for the analyzed terms as well (field
> > "comp" was indexed without position data, cannot run phrasequery). The term
> > properties in the indexed documents are not what we set: even though the
> > field is tokenized, we are not able to search for "bank", "swiss" or
> > "world". The documents look like this:
> >
> > *while we index*
> >
> > Document<stored,indexed,tokenized<comp:world bank>
> > stored,indexed,tokenized<name:kumaran >
> > stored,indexed,tokenized<city:chennai> stored,indexed,tokenized<module:1>
> > stored,indexed,tokenized<docid:1>>
> > Document<stored,indexed<comp:swiss bank>
> > stored,indexed,tokenized<name:kumaran >
> > stored,indexed,tokenized<city:chennai> stored,indexed,tokenized<module:1>
> > stored,indexed,tokenized<docid:2>>
> >
> >
> > *in index*
> >
> > Document<stored,indexed,tokenized<comp:world bank>
> > stored,indexed,tokenized<name:kumaran >
> > stored,indexed,tokenized<city:chennai> stored,indexed,tokenized<module:1>
> > stored,indexed,tokenized<docid:1>>
> > Document<stored,indexed,tokenized<comp:swiss bank>
> > stored,indexed,tokenized<name:kumaran >
> > stored,indexed,tokenized<city:chennai> stored,indexed,tokenized<module:1>
> > stored,indexed,tokenized<docid:2>>
> >
> > *impact:*
> >
> > For the second document, "stored,indexed" has changed to "stored,indexed,tokenized".
> >
> > *Related links:*
> >
> > https://github.com/elastic/elasticsearch/issues/12079
> >
> > https://github.com/elastic/elasticsearch/issues/4475
> >
> > http://stackoverflow.com/questions/19302887/elasticsearch-field-title-was-indexed-without-position-data-cannot-run-phras
> >
> >
> >
> > *2. Similarly, for a numeric field and a string field using the same field name*
> >
> > Also, if we index a numeric field and a string field under the same field
> > name in a single index, we lose the position data of the indexed string
> > terms, so phrase queries do not work (field "fieldname" was indexed without
> > position data, cannot run phrasequery).
> >
> >
> >
> >
> > https://mail-archives.apache.org/mod_mbox/lucene-java-user/201510.mbox/%3CCAHTScUgTYgSLP9OmoMe2ebVBHw8=Trih5B++u7V050VNRQZU8A@mail.gmail.com%3E
> >
> >
> >
> > > I would be pretty skeptical of this approach. You're
> > > mixing numeric data with textual data and I expect
> > > the results to be unpredictable. You already said
> > > "it is working for most of the
> > > documents except one or two documents." I predict
> > > you'll find more and more of these as time passes.
> > >
> > > Expect many more anomalies. At best you need to
> > > index both forms as text rather than mixing numeric
> > > and text data.
> >
> >
> >
> > Thanks in advance...
> >
> >
> >
> > --
> > Kumaran R
> >
> >
> >
> >
> >
> > On Sun, Jul 24, 2016 at 1:54 AM, Michael McCandless <
> > lucene@mikemccandless.com> wrote:
> >
> >> On Sat, Jul 23, 2016 at 4:48 AM, Kumaran Ramasubramanian <kums.134@gmail.com> wrote:
> >>
> >> > Hi Mike,
> >> >
> >> > *Two different fields can have the same name*
> >> >
> >> > Is that so? You mean we can index one field as a doc values field and
> >> > also as a stored field, using the same name?
> >> >
> >>
> >> This should be fine, yes.
> >>
> >>
> >> > And AFAIK, we cannot index one field as analyzed and not analyzed using
> >> > the same name. Am I right?
> >> >
> >>
> >> Hmm, I think you can do this?  The first one will be tokenized, and the
> >> second indexed as a single token.
> >>
> >> Or do you see otherwise?
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >
> >
>
