hbase-user mailing list archives

From Anthony Nguyen <anthony.an.ngu...@gmail.com>
Subject Re: Ramifications of minimizing use of .tmp directories / renames in HBase when using S3 as backing store
Date Thu, 10 Sep 2015 00:31:55 GMT
Thanks Matteo. If I understand correctly, one example of how the .tmp
directories help prevent issues is as follows: If HBase were to crash
during a compaction, since these .tmp directories are cleared out at start,
cleanup is much easier, right?
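
The pattern discussed above (write under .tmp, rename into place, clear .tmp at start) can be sketched as follows. This is a minimal illustration on a local POSIX filesystem, not HBase's actual code: the function names `commit_via_tmp` and `clear_tmp` and the file names are hypothetical. On a local filesystem the rename is a single atomic step; on S3 a rename is typically carried out as a copy followed by a delete, which is why it is expensive there.

```python
import os
import shutil
import tempfile


def commit_via_tmp(region_dir, name, data):
    """Write data under region_dir/.tmp, then rename it into its final place.

    Hypothetical helper: readers of region_dir never see a half-written file,
    because the file only appears there once the rename has happened.
    """
    tmp_dir = os.path.join(region_dir, ".tmp")
    os.makedirs(tmp_dir, exist_ok=True)
    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before publishing the file
    final_path = os.path.join(region_dir, name)
    os.replace(tmp_path, final_path)  # atomic on a POSIX local filesystem
    return final_path


def clear_tmp(region_dir):
    """Startup cleanup: discard whatever a crash left behind under .tmp."""
    shutil.rmtree(os.path.join(region_dir, ".tmp"), ignore_errors=True)


# Demo: one committed file, then a simulated crash that leaves a partial file.
root = tempfile.mkdtemp()
committed = commit_via_tmp(root, "store_file", b"compacted data")

os.makedirs(os.path.join(root, ".tmp"), exist_ok=True)
with open(os.path.join(root, ".tmp", "partial_file"), "wb") as f:
    f.write(b"half wri")  # the "crash" happens before the rename

clear_tmp(root)
remaining = sorted(os.listdir(root))
```

After `clear_tmp`, only the committed file remains in the directory. Writing directly to the final destination would remove the rename, but a crash could then leave a partial file that looks like a finished one.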

On Wed, Sep 9, 2015 at 7:31 PM, Matteo Bertozzi <theo.bertozzi@gmail.com>
wrote:

> HBase relies on .tmp directories to get a form of "atomic" file creation
> and to avoid problems such as half-written data when it crashes.
>
> There is an open JIRA to solve that problem in one of the next major
> releases:
> https://issues.apache.org/jira/browse/HBASE-14090
> There is a document in it, if you are interested in reading about the
> internals.
>
> Matteo
>
>
> On Wed, Sep 9, 2015 at 4:23 PM, Anthony Nguyen <
> anthony.an.nguyen@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I'm investigating the use of S3 as a backing store for HBase. Would there
> > be any major issues with modifying HBase so that, when an S3 location is
> > set for the rootdir, writes to .tmp are removed or minimized, writing
> > directly to the final destination instead? The reason I'd like to do this
> > is that renames in S3 are expensive, and performance suffers for
> > operations such as compactions and snapshot restores that involve many
> > renames.
> >
> > Thanks!
> >
>
