hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Ramifications of minimizing use of .tmp directories / renames in HBase when using S3 as backing store
Date Thu, 10 Sep 2015 22:22:33 GMT
> Consistency issues: Since S3 has read after write consistency for new
> objects

Eventually. The problem is reads for the new objects may fail for some
arbitrary time first, with 404 or 500 responses as I mentioned before.
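
To make the failure mode concrete, here is a minimal sketch in Python. It assumes a toy in-memory store, not the real S3 API: the class `EventuallyVisibleStore`, its `failed_reads_before_visible` parameter, and the helper functions are all hypothetical names invented for illustration. The store answers the first few reads of a new object with a 404-style error, and the caller behaves like a writer that opens its own file immediately after writing it.

```python
class EventuallyVisibleStore:
    """Toy object store: a new object fails its first few reads.

    Hypothetical model for illustration only; this is not the S3 API.
    """

    def __init__(self, failed_reads_before_visible):
        self._objects = {}
        self._pending = {}
        self._lag = failed_reads_before_visible

    def put(self, key, data):
        self._objects[key] = data
        # Number of reads that will still fail before the object is visible.
        self._pending[key] = self._lag

    def get(self, key):
        if key not in self._objects:
            raise KeyError("404: " + key)
        if self._pending[key] > 0:
            self._pending[key] -= 1
            raise KeyError("404: " + key)
        return self._objects[key]


def open_after_flush(store, key):
    """Write an object, then read it back at once with no retries."""
    store.put(key, b"store file contents")
    return store.get(key)


def flush_succeeds(store, key):
    """Return True if the immediate read-back worked, False on a 404."""
    try:
        open_after_flush(store, key)
        return True
    except KeyError:
        return False


consistent = EventuallyVisibleStore(failed_reads_before_visible=0)
lagging = EventuallyVisibleStore(failed_reads_before_visible=3)
print(flush_succeeds(consistent, "region/cf/hfile-1"))  # True
print(flush_succeeds(lagging, "region/cf/hfile-1"))     # False
```

A retry loop around `get` would hide the error in this toy model, but only because the model makes the lag a fixed number of reads. With an arbitrary delay the caller has no bound on how long to keep retrying.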

​> ​
> Appends: An append in S3 can be modeled as a read / copy / delete
> operation. Performance will definitely suffer, but we can configure WAL
> settings to perhaps roll the WAL very often to minimize appends, right?

​No, every single mutation sent to HBase is committed to the WAL first
(with an append then a flush) and only then do we return to the client. If
you batch edits before writing, you abandon HBase's durability guarantees.
If you write exactly one edit per WAL, performance will be terrible: a
client may be able to write only a few values to HBase per second.
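
A rough sketch of the two options, in Python, may help show why neither is attractive. It assumes a toy store of immutable objects that only counts upload traffic. The names `CountingObjectStore`, `append_by_copy`, and `one_object_per_edit` are hypothetical, and the edit count and size are arbitrary illustrative values, not measurements of S3 or HBase.

```python
class CountingObjectStore:
    """Toy store of immutable objects that counts upload traffic (hypothetical)."""

    def __init__(self):
        self.objects = {}
        self.put_requests = 0
        self.bytes_uploaded = 0

    def put(self, key, data):
        self.objects[key] = data
        self.put_requests += 1
        self.bytes_uploaded += len(data)

    def get(self, key):
        return self.objects[key]

    def delete(self, key):
        del self.objects[key]


def append_by_copy(store, log_name, version, edit):
    """Emulate an append: read the old object, write old + edit under a
    new key, then delete the old object. Returns the new version number."""
    old = store.get(f"{log_name}.{version}") if version > 0 else b""
    store.put(f"{log_name}.{version + 1}", old + edit)
    if version > 0:
        store.delete(f"{log_name}.{version}")
    return version + 1


def one_object_per_edit(store, log_name, sequence, edit):
    """Roll the log on every edit: each edit becomes its own object."""
    store.put(f"{log_name}.{sequence}", edit)


EDITS = 100        # illustrative number of mutations
EDIT_SIZE = 100    # illustrative size of each mutation in bytes
edit = b"x" * EDIT_SIZE

copy_store = CountingObjectStore()
version = 0
for _ in range(EDITS):
    version = append_by_copy(copy_store, "wal", version, edit)

per_edit_store = CountingObjectStore()
for sequence in range(EDITS):
    one_object_per_edit(per_edit_store, "wal", sequence, edit)

print(copy_store.put_requests, copy_store.bytes_uploaded)          # 100 505000
print(per_edit_store.put_requests, per_edit_store.bytes_uploaded)  # 100 10000
```

In this model both schemes need one synchronous upload per acknowledged mutation, so client write latency is bounded by the store's request latency either way. The copy scheme additionally re-uploads the whole log on every edit, so bytes uploaded grow quadratically with the number of edits.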

S3 storage isn't workable.



On Wed, Sep 9, 2015 at 5:55 PM, Anthony Nguyen <anthony.an.nguyen@gmail.com>
wrote:

> Hi Andrew,
>
> Thanks for your help! I'm attempting to give HBase on S3 another try :).
> Let me see if I can address the two things you pointed out and you can call
> me crazy for even thinking that they'd be okay:
>
> First, some background - my initial use case will be using HBase in a
> mostly read-only fashion (bulk loads are the primary method of data load).
>
> Consistency issues: Since S3 has read after write consistency for new
> objects, we could potentially structure the naming scheme within HBase file
> writes such that the same location is never overwritten. In this way, any
> new files (which would include updates / appends to an ultimately new file)
> in S3 will be immediately consistent. Files can be cleaned up after a set
> period of time.
>
> Appends: An append in S3 can be modeled as a read / copy / delete
> operation. Performance will definitely suffer, but we can configure WAL
> settings to perhaps roll the WAL very often to minimize appends, right?
>
> Atomic folder renames can also probably be worked around too, similar to
> how hadoop-azure does it.
>
> Thanks again. Looking forward to your insight.
>
> On Wed, Sep 9, 2015 at 7:32 PM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> > It cannot work to use S3 as a backing store for HBase. This has been
> > attempted in the past (although not by me, so this isn't firsthand
> > knowledge). One basic problem is HBase expects to be able to read what it
> > has written immediately after the write completes. For example, opening a
> > store file after a flush to service reads, or reopening store files after
> > a compaction. S3's notion of availability is too eventual. You can get 404
> > or 500 responses for some time after completing a file write. HBase will
> > panic. I'm not sure we'd accept kludges around this problem. Another
> > fundamental problem is S3 doesn't provide any means for appending to
> > files, so the HBase write ahead log cannot work. I suppose this could be
> > worked around with changes that allow specification of one type of
> > filesystem for HFiles and another for write ahead logs. There are numerous
> > finer points about HDFS behavior used as synchronization primitive, such
> > as atomic renames.
> >
> >
> > On Wed, Sep 9, 2015 at 4:23 PM, Anthony Nguyen <anthony.an.nguyen@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I'm investigating the use of S3 as a backing store for HBase. Would there
> > > be any major issues with modifying HBase in such a way where when an S3
> > > location is set for the rootdir, writes to .tmp are removed and minimized,
> > > instead writing directly to the final destination? The reason I'd like to
> > > do this is because renames in S3 are expensive and performance for
> > > operations such as compactions and snapshot restores that have many
> > > renames suffer.
> > >
> > > Thanks!
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
