lucene-java-user mailing list archives

From Samarendra Pratap <>
Subject Sharding Techniques
Date Mon, 09 May 2011 11:56:32 GMT
Hi list,
We have a 30 GB index directory which is divided into 3 subdirectories
(idx1, idx2, idx3), each of which is further divided into 21 sub-subdirectories
(idx1-1, idx1-2, ..., idx2-1, ..., idx3-1, ..., idx3-21).

We are running Java 1.6, Lucene 2.9 (going to upgrade to 3.1 very
soon), Linux (Fedora Core, kernel 2.6.17-13.1), and ReiserFS.

We have almost 40 fields in each index (is it bad to have so many
fields?). Most of them are ID-based fields.
We are using 8 servers for search, each of which receives approximately
3000 queries per hour at peak, and as per the business requirement a search
time of more than 1 second is considered bad (is it really bad?).

For the past few months we have been experiencing issues (load and search
time) on our search servers, which is why I am looking into sharding
techniques. Can someone guide me or give me pointers to where I can read
more and test?

Keeping parts of the index on different servers, searching all of them, and
then merging the results: what would be the best approach?
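
To make the question concrete, here is a minimal sketch of the "search every
part, then merge" idea in plain Java, written so that it also compiles on
Java 1.6. The names ScatterGather, Shard and Hit are hypothetical and not
part of Lucene; a real Shard would wrap a remote searcher. The sketch also
assumes that scores from different shards are directly comparable, which
only holds if the shards share or approximate the same term statistics.
Lucene 2.9 and 3.x ship MultiSearcher and ParallelMultiSearcher for the
single-JVM case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScatterGather {

    /** One scored hit, tagged with the shard it came from (hypothetical type). */
    public static final class Hit {
        public final String shard;
        public final int docId;
        public final float score;

        public Hit(String shard, int docId, float score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for a searcher running on one server. */
    public interface Shard {
        List<Hit> search(String query, int topN) throws Exception;
    }

    /** Queries all shards in parallel and returns the global top N by score. */
    public static List<Hit> search(List<Shard> shards, final String query,
                                   final int topN) throws Exception {
        if (shards.isEmpty()) {
            return new ArrayList<Hit>();
        }
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            // Scatter: each shard only needs to return its own top N.
            List<Future<List<Hit>>> pending = new ArrayList<Future<List<Hit>>>();
            for (final Shard shard : shards) {
                pending.add(pool.submit(new Callable<List<Hit>>() {
                    public List<Hit> call() throws Exception {
                        return shard.search(query, topN);
                    }
                }));
            }
            // Gather: wait for every shard, then sort by descending score.
            List<Hit> merged = new ArrayList<Hit>();
            for (Future<List<Hit>> result : pending) {
                merged.addAll(result.get());
            }
            Collections.sort(merged, new Comparator<Hit>() {
                public int compare(Hit a, Hit b) {
                    return Float.compare(b.score, a.score);
                }
            });
            return new ArrayList<Hit>(
                    merged.subList(0, Math.min(topN, merged.size())));
        } finally {
            pool.shutdown();
        }
    }
}
```

With this layout the response time of a query is bounded by the slowest
shard plus the merge, so it helps most when the per-shard indexes are
roughly equal in size.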

Note that most queries use only 6-7 indexes and 4-5 fields (to
search on), but some queries (searching all the data) require all the
indexes and are the primary cause of the performance degradation.

Any suggestions/ideas are greatly appreciated. Furthermore, will
sharding (or something similar) really reduce search time? (Load is a less
severe issue than search time.)

