hbase-user mailing list archives

From Jonathan Gray <jg...@facebook.com>
Subject RE: Availability Transaction and data integrity
Date Mon, 17 May 2010 15:50:44 GMT
Answers inline.

> -----Original Message-----
> From: Imran M Yousuf [mailto:imyousuf@gmail.com]
> Sent: Monday, May 17, 2010 8:14 AM
> To: hbase-user@hadoop.apache.org
> Subject: Availability Transaction and data integrity
> 
> Hi,
> 
> Currently we are designing an architecture for an Accounting SaaS and
> e-commerce website. As both of them will store financial data,
> transactions, redundancy, HA and data integrity are very important. As I
> am not a master of HBase architecture and implementation, I am eagerly
> waiting for your comments on the following:
> 
> * We will go live from January 2011, in that time frame should we
> develop using 0.21-SNAPSHOT or should we stick to 0.20.x? Ideally I
> would not want to go ahead with a snapshot in production and also
> would not want to make an upgrade within a few months (because of some
> problems noticed in the mailing list regarding upgrade and I am a bit
> skeptical about it in general).

If you need data durability (no data loss under node failure), then you have no choice but
to go with 0.21 once it is released.  Durability is not supported on the 0.20 line.

There are a number of organizations that will be going live in production on 0.21 in Q3 2010.
You can be sure that there will be a very well tested and stable 0.21 release by January 2011.

> * Transaction was a contrib module of HBase but it seems recently
> removed from the 0.21-SNAPSHOT. In light of it what would be the way
> to achieve transaction?

It is still available but is being moved to GitHub.  You can still use it; it has just been
moved out of the core code.

> * NN was (if I am not mistaken) a SPoF, I also learnt that it's
> supposed to be fixed in 0.21, is that in trunk already?

I'm not sure where you heard that this was fixed in 0.21; my understanding is that it is
not.

There is work being done at Facebook (and I believe parallel work being done elsewhere) to
add a true backup NameNode.  Once stabilized, this will be released and made available to the
public, though it may not be put into an official Hadoop release in the 0.21 timeframe.

> * What kind of data loss should we design to?

On 0.21, you should not have data loss.  We are doing a lot of testing on this to ensure
stability and durability.

> * Is there any professional service provider who could help us train
> for deployment, help optimize and in case we need emergency provide
> service? (P.S. I contacted Cloudera via email 11 days back and am still
> waiting for a reply; maybe they are not interested. Any alternative
> would do great!)

Cloudera should be able to help.  They're active on this list but perhaps don't want to use
this forum to sell services.  I would ping them again off the list.

> 
> I eagerly hope for some help and guideline on these queries.
> 
> --
> Imran M Yousuf
> Entrepreneur & Software Engineer
> Smart IT Engineering
> Dhaka, Bangladesh
> Email: imran@smartitengineering.com
> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
> Mobile: +880-1711402557
