hadoop-mapreduce-user mailing list archives

From Adam Silberstein <a...@trifacta.com>
Subject HDFS seek perf question
Date Thu, 29 Jan 2015 01:33:20 GMT
Hi,
I have a question about HDFS seek performance. I see some information on
this periodically, but nothing very recent.

How do these costs compare?
A) seeking to the start of an HDFS block and reading about 10MB of data
B) reading the entire HDFS block

Assuming A is faster, how many random seeks can you do against an HDFS
block before that becomes slower than reading the whole thing? On paper
this can be computed from the disk's speed numbers, but I would like to
know how well HDFS matches that behavior in practice.
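
For reference, the on-paper calculation can be sketched as below. This is only a rough model: the 10 ms seek time, 100 MB/s sequential throughput, and 128 MB block size are illustrative assumptions rather than measurements, and the function names are made up for this example. It also ignores any HDFS-specific overhead (DataNode connection setup, checksum verification, network transfer, caching), which is exactly the part the question is about.

```python
MB = 1024 * 1024


def read_cost_seconds(num_reads, bytes_per_read, seek_s, throughput_bps):
    """Cost of num_reads random reads: each pays one seek plus its transfer time."""
    return num_reads * (seek_s + bytes_per_read / throughput_bps)


def break_even_reads(block_bytes, bytes_per_read, seek_s, throughput_bps):
    """Largest number of random reads whose total cost does not exceed
    one seek followed by a sequential read of the whole block."""
    full_block_cost = read_cost_seconds(1, block_bytes, seek_s, throughput_bps)
    one_read_cost = read_cost_seconds(1, bytes_per_read, seek_s, throughput_bps)
    return int(full_block_cost // one_read_cost)


if __name__ == "__main__":
    # All four figures below are assumed for illustration only.
    block = 128 * MB          # assumed HDFS block size
    chunk = 10 * MB           # read size from case A
    seek = 0.010              # assumed average disk seek time, seconds
    throughput = 100 * MB     # assumed sequential throughput, bytes/second

    print("case A cost (s):", read_cost_seconds(1, chunk, seek, throughput))
    print("case B cost (s):", read_cost_seconds(1, block, seek, throughput))
    print("break-even reads:", break_even_reads(block, chunk, seek, throughput))
```

Under this model the transfer time dominates for 10 MB reads, so the break-even point is close to the block size divided by the read size. Seek time only starts to matter once the individual reads are much smaller.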

Thanks,
Adam
