lucene-java-user mailing list archives

From Yonghui Zhao <zhaoyong...@gmail.com>
Subject Re: what's replacement of FieldCache in Lucene 7
Date Fri, 13 Apr 2018 11:24:17 GMT
Got it, makes sense. Thanks, Adrien.

2018-04-13 19:16 GMT+08:00 Adrien Grand <jpountz@gmail.com>:

> Queries should be fine: they have been required to produce sorted iterators
> since 5.0, when we removed the acceptDocsOutOfOrder option on collectors.
>
> On Fri, 13 Apr 2018 at 13:10, Yonghui Zhao <zhaoyonghui@gmail.com> wrote:
>
> > I can sort the doc IDs and then fetch the fields via doc values.
> >
> > But another big use of the field cache is in custom score queries: we use
> > the field cache to compute scores, and stored fields can't work there for
> > performance reasons.
> >
> > If I still use doc values, I must make sure all queries are scored in
> > doc ID order. I think this will introduce some performance drop?
> >
> > 2018-04-13 17:15 GMT+08:00 Adrien Grand <jpountz@gmail.com>:
> >
> > > Performance may indeed be worse with stored fields. In general Lucene
> > > makes the assumption that millions of documents are queried but only ~100
> > > documents are retrieved in the end, so the bottleneck should be query
> > > processing, not retrieving stored fields.
> > >
> > > On Fri, 13 Apr 2018 at 05:27, Yonghui Zhao <zhaoyonghui@gmail.com> wrote:
> > >
> > > > My case is that when I get some docs from Lucene, I also need to get
> > > > some field values of the retrieved docs.
> > > >
> > > > For example, in Lucene 4 I use FieldCache like this:
> > > >
> > > > FieldCache.DEFAULT.getTerms(reader, name, false).get(locDocId).utf8ToString();
> > > >
> > > > FieldCache.DEFAULT.getInts(reader, name, false).get(locDocId);
> > > >
> > > > FieldCache.DEFAULT.getDoubles(reader, name, false).get(locDocId);
> > > >
> > > >
> > > > where the doc IDs may not be in ascending order.
> > > >
> > > > Of course I can use stored fields like this:
> > > >
> > > > Document doc = indexSearcher.doc(docId, storedFields.keySet());
> > > >
> > > >
> > > > But the performance would likely be worse than with FieldCache.
> > > >
> > > >
> > > > 2018-04-12 19:57 GMT+08:00 Adrien Grand <jpountz@gmail.com>:
> > > >
> > > > > Hello,
> > > > >
> > > > > Doc values should indeed be used instead of the field cache. Note that
> > > > > this requires adding them to your documents at index time, e.g. with a
> > > > > NumericDocValuesField.
> > > > >
> > > > > Regarding random access, maybe you can use the advanceExact API, which
> > > > > exists on all doc-value iterators. Just make sure to never call it on
> > > > > decreasing doc IDs. If that doesn't work for you, can you describe your
> > > > > use-case? Maybe there are better ways to implement what you need.
> > > > >
> > > > > On Thu, 12 Apr 2018 at 13:54, Yonghui Zhao <zhaoyonghui@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I am upgrading my project from Lucene 4 to 7.
> > > > > >
> > > > > > FieldCache was removed in Lucene 7. Are doc values the replacement?
> > > > > >
> > > > > > But it seems doc values don't support random access.
> > > > > >
> > > > > > I need random access to get specific field values quickly.
> > > > > >
> > > > > > How can I solve this?
> > > > > >
> > > > >
> > > >
> > >
> >
>
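
The approach discussed in the thread (sort the doc IDs, read the values
with a forward-only iterator, then hand the results back in the caller's
original order) can be sketched roughly as below. This is only an
illustration in plain Java: ForwardOnlyValues is a hypothetical stand-in
for a Lucene doc-values iterator, not a Lucene class, and
SortedDocValueLookup and its fetch method are likewise made-up names. The
only behaviour assumed from Lucene is the rule stated above, that
advanceExact must never be called on decreasing doc IDs.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical stand-in for a forward-only doc-values iterator (not a Lucene class). */
final class ForwardOnlyValues {
    private final long[] values;       // one value per doc ID, for illustration only
    private final boolean[] hasValue;  // whether a doc has a value for the field
    private int current = -1;

    ForwardOnlyValues(long[] values, boolean[] hasValue) {
        this.values = values;
        this.hasValue = hasValue;
    }

    /** Moves to target and reports whether it has a value; targets must not decrease. */
    boolean advanceExact(int target) {
        if (target < current) {
            throw new IllegalStateException(
                "doc IDs must not decrease: " + target + " < " + current);
        }
        current = target;
        return hasValue[target];
    }

    long longValue() {
        return values[current];
    }
}

final class SortedDocValueLookup {
    /**
     * Returns one value per entry of docIds, in the caller's order, while
     * visiting the iterator in ascending doc ID order.
     */
    static long[] fetch(ForwardOnlyValues dv, int[] docIds, long missing) {
        // Sort positions rather than the IDs themselves, so that results can
        // be written back to the slot each doc ID originally occupied.
        Integer[] order = new Integer[docIds.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingInt(i -> docIds[i]));

        long[] out = new long[docIds.length];
        for (int pos : order) {
            out[pos] = dv.advanceExact(docIds[pos]) ? dv.longValue() : missing;
        }
        return out;
    }
}
```

With only ~100 hits to retrieve, the extra sort is cheap compared to query
processing. In Lucene itself doc-values iterators are obtained per segment,
so a real implementation would also need to map each top-level doc ID to
its segment before advancing.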
