hbase-user mailing list archives

From Andreas Neumann <neun...@gmail.com>
Subject hadoop without append in the absence of puts
Date Wed, 22 Jun 2011 21:58:45 GMT
I am changing the subject to reflect the discussion...

If we only load data in bulk (that is, via doBulkLoad(), not using
TableOutputFormat), do we still risk data loss? My understanding is that
append is needed for the WAL, and the WAL is needed only for puts. But bulk
loads bypass the WAL.

For instance, when a region is split, the master must write the new
metadata to the meta regions. Would that require a WAL or rely on append in
some other way?

Are there other situations where the WAL is needed (or append is needed) to
avoid data loss?

Thanks -Andreas.


On Tue, Jun 21, 2011 at 2:29 PM, Andrew Purtell <apurtell@apache.org> wrote:

> > From: Francis Christopher Liu <fcliu@yahoo-inc.com>
> > Thanks for the warning, we'd like to stick with the ASF
> > releases of hadoop.
>
> That's not really advisable with HBase. It's a touchy subject: the 0.20-ish
> support for append in HDFS exists in production at some large places but
> isn't in any ASF 0.20 release.
>
> We have replayed the append branch on top of branch-0.20-security:
>  https://github.com/trendmicro/hadoop-common/tree/0.20-security-append
>
> This is a companion ASF-ish branch to our in-development branch of HBase
> for HBASE-3025 (security and access control):
>  http://github.com/trendmicro/hbase/tree/security
>
> If you're willing to risk data loss with HBase anyway you should give this
> a shot. We really need a branch-0.20-security-append.
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
>
