lucene-java-user mailing list archives

From Kumaran Ramasubramanian <kums....@gmail.com>
Subject Re: Indexing and storing Long fields
Date Thu, 28 Jul 2016 16:12:09 GMT
OK Mike, thanks for the explanation. I have another question.

I read in an article that we can have a stored field and a doc values field
with the same field name. Is that so?


--
Kumaran R






On Thu, Jul 28, 2016 at 9:29 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> OK, sorry, you cannot change how the field is indexed for the same field
> name across different field indices.
>
> Lucene will "downgrade" that field to the lowest settings, e.g. "docs, no
> positions" in your case.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Thu, Jul 28, 2016 at 9:31 AM, Kumaran Ramasubramanian <
> kums.134@gmail.com> wrote:
>
> > Hi Mike,
> >
> > For your information, I am using Lucene 4.10.4. Am I missing anything?
> >
> >
> >
> > --
> > Kumaran R
> >
> >
> >
> >
> > On Wed, Jul 27, 2016 at 1:52 AM, Kumaran Ramasubramanian <
> > kums.134@gmail.com> wrote:
> >
> > >
> > > Hi Mike,
> > >
> > > 1. If we index one field as both analyzed and not analyzed using the
> > > same name, phrase queries stop working for the analyzed terms as well
> > > (field "comp" was indexed without position data, cannot run
> > > phrasequery). The term properties in the indexed document are not
> > > correct: even though the field is tokenized, I am not able to search
> > > for "bank" or "swiss" or "world". The documents look like this:
> > >
> > > *while we index*
> > >
> > > Document<stored,indexed,tokenized<comp:world bank>
> > > stored,indexed,tokenized<name:kumaran >
> > > stored,indexed,tokenized<city:chennai>
> > > stored,indexed,tokenized<module:1>
> > > stored,indexed,tokenized<docid:1>>
> > > Document<stored,indexed<comp:swiss bank>
> > > stored,indexed,tokenized<name:kumaran >
> > > stored,indexed,tokenized<city:chennai>
> > > stored,indexed,tokenized<module:1>
> > > stored,indexed,tokenized<docid:2>>
> > >
> > >
> > > *in index*
> > >
> > > Document<stored,indexed,tokenized<comp:world bank>
> > > stored,indexed,tokenized<name:kumaran >
> > > stored,indexed,tokenized<city:chennai>
> > > stored,indexed,tokenized<module:1>
> > > stored,indexed,tokenized<docid:1>>
> > > Document<stored,indexed,tokenized<comp:swiss bank>
> > > stored,indexed,tokenized<name:kumaran >
> > > stored,indexed,tokenized<city:chennai>
> > > stored,indexed,tokenized<module:1>
> > > stored,indexed,tokenized<docid:2>>
> > >
> > > *impact:*
> > >
> > > "stored,indexed" is changed to "stored,indexed,tokenized"
> > >
> > > *Related links:*
> > >
> > > https://github.com/elastic/elasticsearch/issues/12079
> > >
> > > https://github.com/elastic/elasticsearch/issues/4475
> > >
> > > http://stackoverflow.com/questions/19302887/elasticsearch-field-title-was-indexed-without-position-data-cannot-run-phras
> > >
> > >
> > >
> > > *2. Similarly, for a numeric field and a string field using the same name*
> > >
> > > Also, if we index a numeric field and a string field using the same field
> > > name in a single index, we lose the position data of the indexed string
> > > terms, so phrase queries do not work (field "fieldname" was indexed
> > > without position data, cannot run phrasequery).
> > >
> > >
> > >
> > >
> > > https://mail-archives.apache.org/mod_mbox/lucene-java-user/201510.mbox/%3CCAHTScUgTYgSLP9OmoMe2ebVBHw8=Trih5B++u7V050VNRQZU8A@mail.gmail.com%3E
> > >
> > >
> > >
> > > > I would be pretty skeptical of this approach. You're
> > > > mixing numeric data with textual data and I expect
> > > > the results to be unpredictable. You already said
> > > > "it is working for most of the
> > > > documents except one or two documents." I predict
> > > > you'll find more and more of these as time passes.
> > > >
> > > > Expect many more anomalies. At best you need to
> > > > index both forms as text rather than mixing numeric
> > > > and text data.
> > >
> > >
> > >
> > > Thanks in advance...
> > >
> > >
> > >
> > > --
> > > Kumaran R
> > >
> > >
> > >
> > >
> > >
> > > On Sun, Jul 24, 2016 at 1:54 AM, Michael McCandless <
> > > lucene@mikemccandless.com> wrote:
> > >
> > >> On Sat, Jul 23, 2016 at 4:48 AM, Kumaran Ramasubramanian <
> > >> kums.134@gmail.com> wrote:
> > >>
> > >> > Hi Mike,
> > >> >
> > >> > *Two different fields can be the same name*
> > >> >
> > >> > Is that so? You mean we can index one field as a doc values field and
> > >> > also as a stored field, using the same name?
> > >> >
> > >>
> > >> This should be fine, yes.
> > >>
> > >>
> > >> > And AFAIK, we cannot index one field as both analyzed and not analyzed
> > >> > using the same name. Am I right?
> > >> >
> > >>
> > >> Hmm, I think you can do this?  The first one will be tokenized, and the
> > >> second indexed as a single token.
> > >>
> > >> Or do you see otherwise?
> > >>
> > >> Mike McCandless
> > >>
> > >> http://blog.mikemccandless.com
> > >>
> > >
> > >
> >
>
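
Mike's point in the thread, that Lucene "downgrades" a field to the lowest settings when the same field name is indexed in different ways, can be pictured with a small toy model. This is only a sketch in plain Java and is not Lucene code: the class `FieldSettingsModel`, the enum `Postings`, and both methods are hypothetical names invented for illustration. It also assumes that the non-analyzed instance of "comp" asks for docs only, as in the "docs, no positions" case mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: not Lucene code. All names here are hypothetical. */
class FieldSettingsModel {

    /** Postings detail levels, ordered from least to most detailed. */
    enum Postings { DOCS, DOCS_AND_FREQS, DOCS_AND_FREQS_AND_POSITIONS }

    // One entry per field name: settings are tracked per name, not per field instance.
    private final Map<String, Postings> byName = new LinkedHashMap<>();

    /** Records a field instance; an existing name keeps the lower of the two levels. */
    Postings add(String name, Postings requested) {
        return byName.merge(name, requested,
                (old, now) -> old.compareTo(now) <= 0 ? old : now);
    }

    /** A phrase query needs positions, so it can only run at the highest level. */
    boolean canRunPhraseQuery(String name) {
        return byName.get(name) == Postings.DOCS_AND_FREQS_AND_POSITIONS;
    }

    public static void main(String[] args) {
        FieldSettingsModel index = new FieldSettingsModel();
        index.add("comp", Postings.DOCS_AND_FREQS_AND_POSITIONS); // analyzed: "world bank"
        index.add("comp", Postings.DOCS);                          // assumed docs-only: "swiss bank"
        index.add("name", Postings.DOCS_AND_FREQS_AND_POSITIONS);
        System.out.println("comp: " + index.canRunPhraseQuery("comp")); // comp: false
        System.out.println("name: " + index.canRunPhraseQuery("name")); // name: true
    }
}
```

In this model a single docs-only instance of "comp" is enough to lose positions for the whole field name, which matches the "indexed without position data" error reported in the thread. Using two distinct field names for the two forms avoids the clash in the model.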
