hbase-user mailing list archives

From Raghava Mutharaju <m.vijayaragh...@gmail.com>
Subject performance consideration when writing to HBase from MR job
Date Sat, 05 Jun 2010 22:44:45 GMT
Hi all,

    If HBase is used as the data sink in an MR job, would there be a
performance improvement if a) is done instead of b)?

a) all the Puts are collected in the Reduce (or in the Map, if there is no
reduce) and written in a single batch
b) each <K,V> pair is written out using context.write(k, v)

If a) is chosen instead of b), wouldn't that violate the semantics of
KEYOUT, VALUEOUT (because no <K, V> pair is being output)? Is this OK?
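
To make the two options concrete, here is a minimal sketch of the difference
being asked about. It does not use the HBase or Hadoop APIs: CountingSink,
writePerRecord and writeBatched are hypothetical stand-ins, and the sketch
assumes that every unbuffered write costs one round trip. Whether option b)
really behaves that way depends on whether the output format's client buffers
writes internally, which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BatchWriteSketch {

    /** Hypothetical stand-in for a remote table; counts simulated round trips. */
    static class CountingSink {
        int roundTrips = 0;
        int rowsWritten = 0;

        /** One call models one round trip, regardless of how many rows it carries. */
        void write(List<String> rows) {
            if (rows.isEmpty()) {
                return; // nothing to send, so no round trip
            }
            roundTrips++;
            rowsWritten += rows.size();
        }
    }

    /** Option b): hand each record to the sink as soon as it is produced. */
    static CountingSink writePerRecord(List<String> rows) {
        CountingSink sink = new CountingSink();
        for (String row : rows) {
            sink.write(Collections.singletonList(row));
        }
        return sink;
    }

    /** Option a): collect records in a buffer and send them in batches. */
    static CountingSink writeBatched(List<String> rows, int batchSize) {
        CountingSink sink = new CountingSink();
        List<String> buffer = new ArrayList<>();
        for (String row : rows) {
            buffer.add(row);
            if (buffer.size() >= batchSize) {
                sink.write(buffer);
                buffer.clear();
            }
        }
        sink.write(buffer); // final flush of any partial batch
        return sink;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add("row-" + i);
        }
        // prints "per record: 1000 round trips"
        System.out.println("per record: " + writePerRecord(rows).roundTrips + " round trips");
        // prints "batched:    10 round trips"
        System.out.println("batched:    " + writeBatched(rows, 100).roundTrips + " round trips");
    }
}
```

In a real job the final flush would have to happen when the task finishes (for
example in the task's cleanup step), otherwise a partial batch would be lost.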

Thank you.

