lucene-dev mailing list archives

From "robert engels (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-893) Increase buffer sizes used during searching
Date Sat, 26 May 2007 16:50:16 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499332 ]

robert engels commented on LUCENE-893:
--------------------------------------

Some food for thought:

A couple of runs of XBench on hardware that is radically different in terms of raw performance
show roughly a 3-4x improvement in sequential read throughput when using 256K blocks instead
of 4K blocks. For random access the gain is far larger: about 20x for writes and closer to
30x for reads.

The trick is determining how much sequential data should be read. That depends on the locality
of the data for the current query and for future queries: even if Lucene reads extra data that
this query does not need, there is some chance that later queries will need it, in which case
it will already be in the cache.

These numbers suggest that the ideal solution would vary the buffer size: use a larger buffer
when the engine determines that it is going to read a lot of sequential data (e.g. a wide-open
range query), and a smaller buffer when it expects only a few results.
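
A minimal sketch of what such a policy could look like, in Java. Everything here is
hypothetical: the class name, the method, and the 4K/256K bounds (taken from the XBench block
sizes in this message) are illustrative assumptions, not part of Lucene's API. It assumes the
caller can estimate how many bytes it expects to read sequentially.

```java
final class ReadBufferSizing {
    // Bounds are assumptions borrowed from the two XBench block sizes in this message.
    static final int SMALL_BUFFER = 4 * 1024;
    static final int LARGE_BUFFER = 256 * 1024;

    /**
     * Picks a read buffer size from an estimate of how many bytes will be
     * read sequentially: small reads get the small buffer, large scans get
     * the large buffer, and anything in between is rounded up to the next
     * power of two.
     */
    static int chooseBufferSize(long expectedSequentialBytes) {
        if (expectedSequentialBytes < 0) {
            throw new IllegalArgumentException("expectedSequentialBytes must be >= 0");
        }
        if (expectedSequentialBytes <= SMALL_BUFFER) {
            return SMALL_BUFFER;
        }
        if (expectedSequentialBytes >= LARGE_BUFFER) {
            return LARGE_BUFFER;
        }
        // Here the value is strictly between 4K and 256K, so the int cast is safe.
        return Integer.highestOneBit((int) expectedSequentialBytes - 1) << 1;
    }

    public static void main(String[] args) {
        // A lookup expected to touch a few hundred bytes vs. a wide-open range query.
        System.out.println(chooseBufferSize(300));
        System.out.println(chooseBufferSize(10L * 1024 * 1024));
    }
}
```

The hard part remains the estimate itself; the sketch only shows how an estimate could be
mapped to a buffer size once the engine has one.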

Maybe this might shove Lucene down the path where the index is optimized so that common query
terms are always put in a separate segment/index, providing a high degree of locality to
optimize the reading. Maybe there is some academic research in this area?

Disk Test                  81.23
    Sequential             81.55
        Uncached Write     80.69    33.63 MB/sec [4K blocks]
        Uncached Write     80.94    33.15 MB/sec [256K blocks]
        Uncached Read      77.68    12.30 MB/sec [4K blocks]
        Uncached Read      87.48    35.35 MB/sec [256K blocks]
    Random                 80.92
        Uncached Write     62.67     0.94 MB/sec [4K blocks]
        Uncached Write     89.93    20.28 MB/sec [256K blocks]
        Uncached Read      89.01     0.59 MB/sec [4K blocks]
        Uncached Read      89.93    18.51 MB/sec [256K blocks]

Disk Test                  48.34
    Sequential             47.83
        Uncached Write     39.10    16.30 MB/sec [4K blocks]
        Uncached Write     59.73    24.46 MB/sec [256K blocks]
        Uncached Read      38.72     6.13 MB/sec [4K blocks]
        Uncached Read      64.56    26.08 MB/sec [256K blocks]
    Random                 48.87
        Uncached Write     35.51     0.53 MB/sec [4K blocks]
        Uncached Write     46.00    10.37 MB/sec [256K blocks]
        Uncached Read      66.61     0.44 MB/sec [4K blocks]
        Uncached Read      59.06    12.15 MB/sec [256K blocks]
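
For reference, the read speedups implied by the two runs above can be computed directly from
the MB/sec figures. The small Java sketch below does just that; the class and method names
are made up for illustration, and the only inputs are the uncached read numbers copied from
the XBench output.

```java
final class XBenchSpeedup {
    /** Ratio of 256K-block throughput to 4K-block throughput (both in MB/sec). */
    static double speedup(double mbPerSec4k, double mbPerSec256k) {
        return mbPerSec256k / mbPerSec4k;
    }

    public static void main(String[] args) {
        // Uncached read figures from the two runs above, as {4K blocks, 256K blocks}.
        double[][] sequentialReads = {{12.30, 35.35}, {6.13, 26.08}};
        double[][] randomReads = {{0.59, 18.51}, {0.44, 12.15}};

        for (int run = 0; run < sequentialReads.length; run++) {
            System.out.printf("run %d: sequential read %.1fx, random read %.1fx%n",
                    run + 1,
                    speedup(sequentialReads[run][0], sequentialReads[run][1]),
                    speedup(randomReads[run][0], randomReads[run][1]));
        }
    }
}
```

By this calculation the sequential read gain is roughly 2.9x and 4.3x across the two runs,
and the random read gain is roughly 31x and 28x.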


> Increase buffer sizes used during searching
> -------------------------------------------
>
>                 Key: LUCENE-893
>                 URL: https://issues.apache.org/jira/browse/LUCENE-893
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 2.1
>            Reporter: Michael McCandless
>
> Spinoff of LUCENE-888.
> In LUCENE-888 we increased buffer sizes that impact indexing and found
> substantial (10-18%) overall performance gains.
> It's very likely that we can also gain some performance for searching
> by increasing the read buffers in BufferedIndexInput used by
> searching.
> We need to test performance impact to verify and then pick a good
> overall default buffer size, also being careful not to add too much
> overall HEAP RAM usage because a potentially very large number of
> BufferedIndexInput instances are created during searching
> (# segments X # index files per segment).
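
The heap concern in the issue description above can be made concrete with a back-of-the-envelope
estimate. The Java sketch below assumes each BufferedIndexInput holds exactly one buffer of
the configured size; the segment and file counts are invented for illustration and are not
measurements from any real index.

```java
final class BufferHeapEstimate {
    /**
     * Estimated heap held by read buffers, assuming one buffer per open input:
     * (# segments) x (# index files per segment) x (buffer size in bytes).
     */
    static long bufferHeapBytes(int segments, int filesPerSegment, int bufferSizeBytes) {
        return (long) segments * filesPerSegment * bufferSizeBytes;
    }

    public static void main(String[] args) {
        int segments = 20;        // hypothetical
        int filesPerSegment = 8;  // hypothetical

        System.out.println("4K buffers:   " + bufferHeapBytes(segments, filesPerSegment, 4 * 1024) + " bytes");
        System.out.println("256K buffers: " + bufferHeapBytes(segments, filesPerSegment, 256 * 1024) + " bytes");
    }
}
```

With these made-up counts, 160 inputs cost about 640 KB of heap at 4K per buffer and 40 MB at
256K per buffer, which is why a single large default is not free.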

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

