lucene-java-user mailing list archives

From suriya prakash <>
Subject Indexing architecture
Date Wed, 28 Dec 2016 17:57:49 GMT

I have 100 thousand indexes in a Hadoop grid, because 90% of my indexes will
be inactive and I can distribute the active ones based on load. Scoring also
works better per index, but I'm not worried about that now.

What optimisations do I need in order to scale better?

I commit on every write now. Should I instead keep the active IndexWriter
open and commit periodically, with a write-ahead log (WAL) to recover from
failures?
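The "keep the writer open, commit on a timer, WAL for crash recovery" pattern could look roughly like the sketch below. This is a minimal illustration, not Lucene code: DurableLog and DocWriter are hypothetical stand-ins for a real fsync'd log and an IndexWriter, and all names here are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical write-ahead log; a real one would append to a file and fsync.
class DurableLog {
    private final List<String> entries = new ArrayList<>();
    void append(String op) { entries.add(op); }      // durable before ack
    void truncate() { entries.clear(); }             // safe once index committed
    List<String> replay() { return new ArrayList<>(entries); }
}

// Stand-in for a long-lived IndexWriter that is NOT committed per document.
class DocWriter {
    private final DurableLog wal = new DurableLog();
    private final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    // Each update goes to the WAL first, then to the in-memory buffer.
    void update(String doc) {
        wal.append(doc);
        pending.add(doc);
    }

    // Called from a timer (e.g. every few seconds), not per document;
    // this would be IndexWriter.commit() in real code.
    void commit() {
        committed.addAll(pending);
        pending.clear();
        wal.truncate();   // committed docs no longer need the WAL
    }

    // After a crash, re-apply whatever the WAL still holds.
    void recover() {
        for (String doc : wal.replay()) pending.add(doc);
    }

    int committedCount() { return committed.size(); }
    int pendingCount() { return pending.size(); }
}
```

The point of the pattern is that durability comes from the WAL, so the expensive commit can be amortised over many updates.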

Update calls will happen frequently (80% of the load). I read the stored
fields of the existing document and re-index it with the new value. I don't
compress stored fields now, because each read would have to uncompress a
whole block of data. Should I reconsider compression?

Scale: hundreds of indexes will be active at a time on a single machine (16gb

Should I change to a shard-based architecture?
I see some benefits there: more batching will happen, and multiple threads
will not overload the system. What other benefits can we get?

Please share your ideas, or any links about multi-user (multi-tenant) setups.

