hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: HDFS-1599 status? (HDFS tickets to improve HBase)
Date Fri, 03 Jun 2011 23:57:42 GMT
An hdfs-347 that checksums is over in the hadoop branch that fb
published over on github (Dhruba and Jon pointed me at it); i've been
meaning to put the patch up in the hdfs-347 issue.


On Fri, Jun 3, 2011 at 4:42 PM, Jason Rutherglen
<jason.rutherglen@gmail.com> wrote:
> I think one'd need to checksum only once on the first file system
> instantiation, or first access of the file?  As mentioned in
> HDFS-2004, HBase's usage of HDFS is outside of the initial design
> motivation.  Eg, the rules may need to be bent in order to enable
> performant use of HBase with HDFS.  The idea of working with HDFS at
> the block level becomes [likely] more important.
> On Fri, Jun 3, 2011 at 3:57 PM, Kihwal Lee <kihwal@yahoo-inc.com> wrote:
>> When I tried HDFS-941, the new bottleneck was checksum. So the performance may drop
>> significantly if checksum is added and enabled in HDFS-347.
>> Kihwal
>> On 6/3/11 5:46 PM, "Andrew Purtell" <apurtell@apache.org> wrote:
>> Yes, and though I have patches, and I'm happy to provide them if you want...
>> Indeed, 347 doesn't do security or checksums so needs work to say the least. We use
>> it with HBase given a privileged role such that it shares group-readable DFS data directories
>> with the DataNodes. It works for us, though checksumming is on the to do list.
>> And I agree 947 is scary. However I did pull the last incarnation of 947 attached
>> to the jira into CDH3U0 for some ongoing testing with real load, combined with 918, which
>> we did put into production.
>>   - Andy
>> --- On Fri, 6/3/11, Todd Lipcon <todd@cloudera.com> wrote:
>>> From: Todd Lipcon <todd@cloudera.com>
>>> Subject: Re: HDFS-1599 status? (HDFS tickets to improve HBase)
>>> To: dev@hbase.apache.org
>>> Date: Friday, June 3, 2011, 1:09 PM
>>> On Fri, Jun 3, 2011 at 12:50 PM, Doug Meil
>>> <doug.meil@explorysmedical.com> wrote:
>>> > Thanks everybody for commenting on this thread.
>>> >
>>> > We'd certainly like to lobby for movement on these two
>>> > tickets, and although we don't have anybody that is familiar
>>> > with the source code we'd be happy to perform some tests to get
>>> > some performance numbers.
>>> >
>>> > Per Kihwal's comments, it sounds like HDFS-941 needs
>>> > to get re-worked because the patch is stale.
>>> >
>>> Yes - bc Wong, the original contributor, works with me but on
>>> unrelated projects. HDFS-941 was something he did as part of a
>>> "hackathon" but he only gets occasional time to circle back on it. As we
>>> last left it, there were just a few things that had to be addressed.
>>> If someone wants to finish it up, and volunteer to test it under some
>>> real load, I'd be happy to review and commit.
>>> > The patch for HDFS-347 sounds like it's still usable.
>>> The current patch for 347 is unworkable since it doesn't do
>>> checksums or security. The FD-passing approach was working at some
>>> point but basically needs to be re-done on trunk.
>>> I think doing HDFS-941 and HDFS-918 first is best, then more drastic
>>> things like 347 can be considered.
