hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Occasional regionserver crashes following socket errors writing to HDFS
Date Thu, 10 May 2012 21:57:20 GMT
On Thu, May 10, 2012 at 11:59 AM, Michael Segel
<michael_segel@hotmail.com> wrote:
> Sigh.
> Dave,
> I really think you need to think more about the problem.
> Think about what a reduce does and then think about what happens inside of HBase.
> Then think about which runs faster... a job with two mappers writing the intermediate
> and final results in HBase, or a M/R job that writes its output to HBase.
> If you really truly think about the problem, you will start to understand why I say you
> really don't want to use a reducer when you're working w HBase.

We have a bit of doc saying that you might usually want to forgo the
reduce phase: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#sink
Do we need to add to it?  That said, you can't make a hard-and-fast
rule that the reduce is to be avoided absolutely.  There will be cases
where it makes sense (an MR sort orthogonal to HBase's, a fat
aggregating reduce, etc.).
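
To make the "fat aggregating reduce" case concrete, here is a toy model in
plain Java. It is not HBase or Hadoop code: the class and method names are
hypothetical, and it assumes (purely for illustration) that every emitted
record costs one write to the table. Under that assumption, a map-only job
writes once per input record, while aggregating per row key first writes
once per distinct key.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: counts hypothetical table writes, touches no real cluster.
public class ReduceTradeoffSketch {

    /** Map-only style: every input record becomes its own write. */
    public static int mapOnlyWriteCount(List<String> rowKeys) {
        return rowKeys.size();
    }

    /** Aggregating style: sum the occurrences per row key, then write each key once. */
    public static Map<String, Long> aggregate(List<String> rowKeys) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (String key : rowKeys) {
            totals.merge(key, 1L, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> rowKeys = Arrays.asList("row-a", "row-b", "row-a", "row-a", "row-c");
        System.out.println("map-only writes:   " + mapOnlyWriteCount(rowKeys));
        System.out.println("aggregated writes: " + aggregate(rowKeys).size());
    }
}
```

When most row keys are unique, the two counts converge and the reduce buys
little, which is the usual argument for skipping it. When many records
collapse onto few keys, the aggregation step can pay for itself.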

P.S. Hey Michael.  Go easy on the 'sighs'.  The participants in this
thread have a clue.  I can testify to that.  Also, I know you don't
mean it, but on occasion, both in this thread and in others I've seen
you on, your tone can come across as condescending (and there is
nothing like condescension for raising hackles).  We all have our
styles, but you might want to review with this in mind before you hit
send the next time.  Just a suggestion.
