hbase-dev mailing list archives

From Jagane Sundar <jag...@sundar.org>
Subject RE: Writing the WAL to a different filesystem from the HFiles
Date Wed, 28 Dec 2011 20:02:08 GMT
Hello Andy,

>> No, definitely not full object reads, we use HDFS positioned reads, which allow us
>> to request, within a gigabyte plus store file, much smaller byte ranges (e.g. 64
>> KB), and receive back only the requested data. We can "seek" around the file.

Ahh. This is good to know. HTTP range requests should work for this mode of operation. I will
take a look at Hadoop's S3 FileSystemStore implementation and see if it uses HTTP range requests.
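
As a rough illustration of how a positioned read (offset plus length) could map onto an HTTP range request, here is a minimal sketch. The function name `range_header` is hypothetical, and this is not taken from Hadoop's S3 client; it only shows the byte-range arithmetic, assuming the standard inclusive `bytes=first-last` form.

```python
def range_header(offset: int, length: int) -> dict:
    """Build an HTTP Range header for reading `length` bytes starting at `offset`.

    Hypothetical helper for illustration; not part of Hadoop or HBase.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends, so the last byte
    # requested is offset + length - 1.
    last = offset + length - 1
    return {"Range": f"bytes={offset}-{last}"}


# A 64 KB read located 1 GB into a store file.
headers = range_header(1 << 30, 64 * 1024)
```

A server that honors the header would answer with 206 Partial Content and only the requested bytes, which is the behavior a positioned read needs.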

>> Aside from several IMHO showstopper performance problems, the shortest answer is
>> HBase often wants to promptly read back store files it has
>> written, and S3 is too eventual often enough (transient 404s or 500s) to preclude
>> reliable operation.

Hmm. OK. The potential performance problems are worrisome.

Improvements in Hadoop's S3 client, and in the implementation of S3 itself, could help to fix
throughput problems and mask transient errors. There are rumors of a version of the
Hadoop S3 client implementation that uses parallel reads to greatly improve throughput.
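
I have not seen that client, so the following is only a guess at the general idea: split one large byte range into parts, fetch the parts concurrently, and join them in order. The names `split_range` and `parallel_read` are hypothetical, and `fetch` stands in for whatever call actually retrieves a byte range (for example, an HTTP range request).

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(offset, length, part_size):
    """Split [offset, offset + length) into consecutive (offset, length) parts."""
    if part_size <= 0:
        raise ValueError("part_size must be > 0")
    parts = []
    end = offset + length
    while offset < end:
        n = min(part_size, end - offset)
        parts.append((offset, n))
        offset += n
    return parts


def parallel_read(fetch, offset, length, part_size=16 * 1024, workers=4):
    """Read a byte range by fetching its parts concurrently.

    `fetch(offset, length)` is an assumed callable that returns bytes.
    """
    parts = split_range(offset, length, part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the chunks join correctly
        # even if the fetches complete out of order.
        chunks = pool.map(lambda p: fetch(*p), parts)
        return b"".join(chunks)


# Stand-in for a remote object: slice an in-memory buffer.
data = bytes(range(256)) * 1024
result = parallel_read(lambda off, n: data[off:off + n], 1000, 70000)
```

Whether this helps in practice would depend on per-request overhead; for small reads such as a single 64 KB block, one request is probably still the better choice.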

Andy - are you (or other HBase experts) aware of whether HBase would have problems with an HFile
store that exhibits variable latency? Specifically, what about scenarios where most HFile reads
come back in milliseconds, but occasionally one takes a few hundred milliseconds
(or more)?

