lucene-java-user mailing list archives

From Gili Nachum <gilinac...@gmail.com>
Subject MMapDirectory performance - Are searchable field values contiguously stored in FS block?
Date Sat, 26 Jan 2013 12:45:56 GMT
Hi,

I have a search workload that focuses on two fields in my 1GB index. I get
very good performance when loading the index via MMapDirectory. I attribute
this performance to the operating system's file system (OS FS) cache, which
keeps the most recently used FS blocks resident in RAM.

*I would like to add 50 more fields to the index, increasing its size to
~50GB. A key factor is that these additional fields will be queried very
rarely.
Given this increase in index size, should I expect a lower queries/sec rate
for the original search workload (which doesn't use the new fields)?*

I would assume that if the values of each searchable field are stored in a
different set of FS blocks, then the 50 additional fields would make no
difference to the OS FS cache, as it would continue to behave as before,
keeping the most-used FS blocks in RAM.
On the other hand, if values from different fields share the same FS
blocks, then the values of the two hot fields will be too scattered across
the FS, rendering the OS cache useless and degrading performance back to
being I/O bound.
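
To make the two scenarios concrete, here is a rough back-of-the-envelope
sketch in Java. It is a toy model only and says nothing about how Lucene 3.6
actually lays out its files: the class name, the fixed 8-byte value size, the
4096-byte block size, and the 100,000 values per field are all assumptions
chosen for illustration. It counts how many FS blocks hold at least one value
of the hot fields when each field is stored in its own contiguous run, versus
when values of all fields are interleaved record by record.

```java
public class BlockLayoutSketch {

    /**
     * Blocks holding hot-field values when each field is stored as one
     * contiguous run. Assumes the hot fields are adjacent and start on a
     * block boundary.
     */
    public static long contiguousBlocks(int hotFields, long valuesPerField,
                                        int valueBytes, int blockBytes) {
        long hotBytes = (long) hotFields * valuesPerField * valueBytes;
        return (hotBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    /**
     * Blocks holding hot-field values when values are interleaved record by
     * record (all fields of record 0, then all fields of record 1, ...).
     * Assumes the hot fields come first within each record.
     */
    public static long interleavedBlocks(int totalFields, int hotFields,
                                         long valuesPerField, int valueBytes,
                                         int blockBytes) {
        long recordBytes = (long) totalFields * valueBytes;
        long hotRunBytes = (long) hotFields * valueBytes;
        long touched = 0;
        long lastCounted = -1; // highest block index already counted
        for (long r = 0; r < valuesPerField; r++) {
            long start = r * recordBytes;
            long firstBlock = start / blockBytes;
            long lastBlock = (start + hotRunBytes - 1) / blockBytes;
            // Count only blocks not already counted for an earlier record.
            for (long b = Math.max(firstBlock, lastCounted + 1); b <= lastBlock; b++) {
                touched++;
            }
            lastCounted = Math.max(lastCounted, lastBlock);
        }
        return touched;
    }

    public static void main(String[] args) {
        int totalFields = 52;       // 2 hot fields + 50 rarely queried ones
        int hotFields = 2;
        long valuesPerField = 100_000L;
        int valueBytes = 8;
        int blockBytes = 4096;

        System.out.println("contiguous:  "
            + contiguousBlocks(hotFields, valuesPerField, valueBytes, blockBytes));
        System.out.println("interleaved: "
            + interleavedBlocks(totalFields, hotFields, valuesPerField, valueBytes, blockBytes));
    }
}
```

In this model the contiguous layout keeps the hot working set the same size
no matter how many cold fields are added, while the interleaved layout makes
nearly every block of the file "hot", so the working set grows with the total
index size.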

Which is the case with Lucene 3.6?

Thanks.
Gili Nachum.
