lucene-java-user mailing list archives

From alex stark <alex.st...@zoho.com>
Subject Re: Any way to improve document fetching performance?
Date Tue, 28 Aug 2018 09:11:19 GMT
I simply tried MultiDocValues.getBinaryValues to fetch results from doc values, and it improves
things a lot: 2000 results take only 5 ms. I even encoded all the returnable fields into binary
doc values and then decoded them, and the results are also good enough. It seems stored fields
do not perform well.

In our scenario (I think it is more common nowadays), the search phase should return as many
results as possible so that the ranking phase can re-sort the results with a machine learning
algorithm (on other clusters). Fetching performance is also important.

---- On Tue, 28 Aug 2018 00:11:40 +0800 Erick Erickson <erickerickson@gmail.com> wrote ----

> Don't use that call. You're exactly right, it goes out to disk, reads the doc, decompresses
> it (16K blocks minimum per doc IIUC) all just to get the field. 2,000 in 50ms actually isn't
> bad for all that work ;).
>
> This sounds like an XY problem. You're asking how to speed up fetching docs, but not telling
> us anything about _why_ you want to do this. Fetching 2,000 docs is not generally what Solr
> was built for, it's built for returning the top N where N is usually < 100, most frequently
> < 20.
>
> If you want to return lots of documents' data you should seriously look at putting the
> fields you want in docValues=true fields and pulling from there. The entire Streaming
> functionality is built on this and is quite fast.
>
> Best,
> Erick

On Mon, Aug 27, 2018 at 7:35 AM <baris.kazar@oracle.com> wrote:

> Can you post your query string?
>
> Best

On 8/27/18 10:33 AM, alex stark wrote:

> On the same machine, no net latency. When I reduce the limit to 500, it takes 20 ms, which
> is also slower than I expected. Btw, indexing is stopped.

---- On Mon, 27 Aug 2018 22:17:41 +0800 <baris.kazar@oracle.com> wrote ----

> Yes, it should be less than a ms actually for those types of files. Index and search on the
> same machine? No net latency in between?
>
> Best

On 8/27/18 10:14 AM, alex stark wrote:

> Quite small, just several simple short text stored fields. The total index size is around
> 1 GB (2m docs).

---- On Mon, 27 Aug 2018 22:12:07 +0800 <baris.kazar@oracle.com> wrote ----

> Alex, how big are those docs?
>
> Best regards

On 8/27/18 10:09 AM, alex stark wrote:

> Hello experts, I am wondering whether there is any way to improve document fetching
> performance. It appears to me that reading from stored fields is quite slow. I simply tested
> using IndexSearcher.doc() to get 2000 documents, which takes 50 ms. Is there any idea to
> improve that?
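
For readers who want to try the approach from the latest reply at the top of this thread (packing all returnable fields into a single binary doc value and decoding them after the search), here is a minimal sketch of the encode/decode step only. It is not the code used in the thread: the class name FieldCodec, the length-prefixed UTF-8 layout, and the method signatures are all assumptions chosen for illustration, and it uses only the JDK so it runs without Lucene on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not from the thread): packs several short text fields
 * into one byte[] payload and unpacks them again.
 * Layout: int fieldCount, then for each field a length-prefixed UTF-8 name
 * followed by a length-prefixed UTF-8 value.
 */
final class FieldCodec {
    private FieldCodec() {}

    /** Encodes the fields in iteration order into a single byte array. */
    static byte[] encode(Map<String, String> fields) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(fields.size());
            for (Map.Entry<String, String> e : fields.entrySet()) {
                writeString(out, e.getKey());
                writeString(out, e.getValue());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /**
     * Decodes a payload stored at data[offset .. offset + length).
     * The (array, offset, length) shape is an assumption that makes it easy to
     * pass a slice of a larger buffer without copying it first.
     */
    static Map<String, String> decode(byte[] data, int offset, int length) {
        Map<String, String> fields = new LinkedHashMap<>();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data, offset, length))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = readString(in);
                String value = readString(in);
                fields.put(name, value);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return fields;
    }

    private static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(utf8.length);
        out.write(utf8);
    }

    private static String readString(DataInputStream in) throws IOException {
        byte[] utf8 = new byte[in.readInt()];
        in.readFully(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

In a Lucene setup one would presumably store the encoded payload in a binary doc values field at index time and hand the bytes returned by MultiDocValues.getBinaryValues to decode at search time. That wiring is left out here because it depends on the Lucene version in use.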
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org