hbase-user mailing list archives

From Alex Baranau <alex.barano...@gmail.com>
Subject Re: [ANN]: HBaseWD: Distribute Sequential Writes in HBase
Date Thu, 21 Apr 2011 12:45:42 GMT
> Basically bucketsCount may not equal the number of regions for the underlying
> table.

True: e.g. when there's only one region holding the data for the whole table
(not many records in the table yet), a distributed scan will fire N scans
against that same region. On the other hand, when there is a huge number of
regions for a single table, each scan can span multiple regions.

> I need to deal with normal scan and "distributed scan" on the server side.

With the current implementation a "distributed" scan isn't recognized as
anything special on the server side: each of its scans is an ordinary scan.
Though the number of scans increases, given that the typical situation is
"many regions for a single table", the scans belonging to the same
"distributed scan" are unlikely to hit the same region.
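
For illustration, here's a rough sketch of what a "distributed scan" does
(simplified, not the library's exact internals): it builds one ordinary Scan
per bucket using the key distributor, so from the region servers' point of
view there is nothing special about any of them. hTable, keyDistributor and
the original start/stop keys below are assumed to be defined as in the usage
examples further down.

    // sketch only: the real logic lives in DistributedScanner.create(...)
    byte[][] startKeys = keyDistributor.getAllDistributedKeys(origStartKey);
    byte[][] stopKeys = keyDistributor.getAllDistributedKeys(origStopKey);
    ResultScanner[] scanners = new ResultScanner[startKeys.length];
    for (int i = 0; i < startKeys.length; i++) {
      // each of these is a plain scan over one bucket's key range
      scanners[i] = hTable.getScanner(new Scan(startKeys[i], stopKeys[i]));
    }
    // the results of the N scanners are then merged back together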

Not sure if I answered your questions here. Feel free to ask more ;)

Alex Baranau
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

On Wed, Apr 20, 2011 at 2:10 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Alex:
> If you read this, you would know why I asked:
> https://issues.apache.org/jira/browse/HBASE-3679
>
> I need to deal with normal scan and "distributed scan" on the server side.
> Basically bucketsCount may not equal the number of regions for the underlying
> table.
>
> Cheers
>
> On Tue, Apr 19, 2011 at 11:11 PM, Alex Baranau <alex.baranov.v@gmail.com>
> wrote:
>
> > Hi Ted,
> >
> > We currently use this tool in a scenario where the data is consumed by
> > MapReduce jobs, so we haven't tested the performance of a pure "distributed
> > scan" (i.e. N scans instead of 1) a lot. I expect it to be close to simple
> > scan performance, or maybe sometimes even faster, depending on your data
> > access patterns. E.g. if you write time-series data (sequential), which
> > goes into a single region at a time, and then access the delta for further
> > processing/analysis (esp. from more than one client), these scans are
> > likely to hit the same region or a couple of regions at a time, which may
> > perform worse compared to many scans hitting data that is much better
> > spread over the region servers.
> >
> > As for a MapReduce job, the approach should not affect reading performance
> > at all: there are just bucketsCount times more splits and hence
> > bucketsCount times more Map tasks. In many cases this even improves the
> > overall performance of the MR job since the work is better distributed
> > over the cluster (esp. when the aim is to constantly process the incoming
> > delta, which usually resides in one or just a couple of regions depending
> > on the processing frequency).
> >
> > If you can share the details of your case, that will help to understand
> > what effect(s) to expect from using this approach.
> >
> > Alex Baranau
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
> >
> > On Wed, Apr 20, 2011 at 8:17 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Interesting project, Alex.
> > > Since there are bucketsCount scanners compared to one scanner originally,
> > > have you performed load testing to see the impact?
> > >
> > > Thanks
> > >
> > > On Tue, Apr 19, 2011 at 10:25 AM, Alex Baranau <alex.baranov.v@gmail.com>
> > > wrote:
> > >
> > > > Hello guys,
> > > >
> > > > I'd like to introduce a new small Java project/lib around HBase:
> > > > HBaseWD. It is aimed at helping to distribute the load (across region
> > > > servers) when writing records that are sequential (because of the row
> > > > key nature). It implements the solution which was discussed several
> > > > times on this mailing list (e.g. here:
> > > > http://search-hadoop.com/m/gNRA82No5Wk).
> > > >
> > > > Please find the sources at https://github.com/sematext/HBaseWD (there's
> > > > also a jar of the current version for convenience). It is very easy to
> > > > make use of it: e.g. I added it to one existing project with 1+2 lines
> > > > of code (one where I write to HBase and 2 for configuring the MapReduce
> > > > job).
> > > >
> > > > Any feedback is highly appreciated!
> > > >
> > > > Please find below the short intro to the lib [1].
> > > >
> > > > Alex Baranau
> > > > ----
> > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
> > > >
> > > > [1]
> > > >
> > > > Description:
> > > > ------------
> > > > HBaseWD stands for Distributing (sequential) Writes. It was inspired by
> > > > discussions on the HBase mailing lists around the problem of choosing
> > > > between:
> > > > * writing records with sequential row keys (e.g. time-series data with a
> > > >   row key built based on a ts)
> > > > * using random unique IDs for records
> > > >
> > > > The first approach makes it possible to perform fast range scans by
> > > > setting start/stop keys on a Scanner, but it creates a single region
> > > > server hot-spotting problem when writing data (as the row keys go in
> > > > sequence, all records end up being written into a single region at a
> > > > time).
> > > >
> > > > The second approach aims for the fastest writing performance by
> > > > distributing new records over random regions, but it makes fast range
> > > > scans against the written data impossible.
> > > >
> > > > The suggested approach sits in the middle of the two above and has
> > > > proved to perform well by distributing records over the cluster during
> > > > writes while still allowing range scans over the data. HBaseWD provides
> > > > a very simple API to work with, which makes it easy to use with
> > > > existing code.
> > > >
> > > > Please refer to the unit tests for lib usage info, as they are intended
> > > > to act as examples.
> > > >
> > > > Brief Usage Info (Examples):
> > > > ----------------------------
> > > >
> > > > Distributing records with sequential keys which are written into up to
> > > > Byte.MAX_VALUE buckets:
> > > >
> > > >    byte bucketsCount = (byte) 32; // distributing into 32 buckets
> > > >    RowKeyDistributor keyDistributor =
> > > >        new RowKeyDistributorByOneBytePrefix(bucketsCount);
> > > >    for (int i = 0; i < 100; i++) {
> > > >      Put put = new Put(keyDistributor.getDistributedKey(originalKey));
> > > >      ... // add values
> > > >      hTable.put(put);
> > > >    }
> > > >
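> > > > As a side note (just a sketch, not necessarily an API the lib provides
> > > > out of the box): since a single original key may live under any of the
> > > > bucket prefixes, a point lookup can be done by probing all candidate
> > > > distributed keys obtained via getAllDistributedKeys (hTable,
> > > > keyDistributor and originalKey as in the example above):
> > > >
> > > >    byte[][] candidates = keyDistributor.getAllDistributedKeys(originalKey);
> > > >    for (byte[] candidate : candidates) {
> > > >      Result result = hTable.get(new Get(candidate));
> > > >      if (!result.isEmpty()) {
> > > >        ... // found the row written under originalKey
> > > >      }
> > > >    }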
> > > >
> > > > Performing a range scan over written data (internally <bucketsCount>
> > > > scanners executed):
> > > >
> > > >    Scan scan = new Scan(startKey, stopKey);
> > > >    ResultScanner rs =
> > > >        DistributedScanner.create(hTable, scan, keyDistributor);
> > > >    for (Result current : rs) {
> > > >      ...
> > > >    }
> > > >
> > > > Performing a MapReduce job over a chunk of written data specified by a
> > > > Scan:
> > > >
> > > >    Configuration conf = HBaseConfiguration.create();
> > > >    Job job = new Job(conf, "testMapreduceJob");
> > > >
> > > >    Scan scan = new Scan(startKey, stopKey);
> > > >
> > > >    TableMapReduceUtil.initTableMapperJob("table", scan,
> > > >      RowCounterMapper.class, ImmutableBytesWritable.class,
> > > >      Result.class, job);
> > > >
> > > >    // Substituting standard TableInputFormat which was set in
> > > >    // TableMapReduceUtil.initTableMapperJob(...)
> > > >    job.setInputFormatClass(WdTableInputFormat.class);
> > > >    keyDistributor.addInfo(job.getConfiguration());
> > > >
> > > >
> > > > Extending Row Key Distribution Patterns:
> > > > ----------------------------------------
> > > >
> > > > HBaseWD is designed to be flexible and to support custom row key
> > > > distribution approaches. To define custom row key distribution logic,
> > > > just extend the AbstractRowKeyDistributor abstract class, which is
> > > > really very simple:
> > > >
> > > >    public abstract class AbstractRowKeyDistributor implements Parametrizable {
> > > >      public abstract byte[] getDistributedKey(byte[] originalKey);
> > > >      public abstract byte[] getOriginalKey(byte[] adjustedKey);
> > > >      public abstract byte[][] getAllDistributedKeys(byte[] originalKey);
> > > >      ... // some utility methods
> > > >    }
> > > >
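> > > > For illustration only, here is a rough sketch (not part of the lib) of
> > > > the kind of logic a custom distributor could contain, prefixing each key
> > > > with a one-byte hash of the original key so that the same original key
> > > > always maps to the same bucket. A real implementation would extend
> > > > AbstractRowKeyDistributor and also provide its Parametrizable plumbing,
> > > > which is omitted here:
> > > >
> > > >    public class OneByteHashPrefixDistributor {
> > > >      private final int bucketsCount;
> > > >
> > > >      public OneByteHashPrefixDistributor(int bucketsCount) {
> > > >        this.bucketsCount = bucketsCount;
> > > >      }
> > > >
> > > >      public byte[] getDistributedKey(byte[] originalKey) {
> > > >        // deterministic bucket: hash of the key, mapped to 0..bucketsCount-1
> > > >        int hash = java.util.Arrays.hashCode(originalKey) & 0x7fffffff;
> > > >        return prepend((byte) (hash % bucketsCount), originalKey);
> > > >      }
> > > >
> > > >      public byte[] getOriginalKey(byte[] adjustedKey) {
> > > >        // strip the one-byte prefix
> > > >        return java.util.Arrays.copyOfRange(adjustedKey, 1, adjustedKey.length);
> > > >      }
> > > >
> > > >      public byte[][] getAllDistributedKeys(byte[] originalKey) {
> > > >        // one candidate key per bucket: used to fan out scans/splits
> > > >        byte[][] keys = new byte[bucketsCount][];
> > > >        for (int b = 0; b < bucketsCount; b++) {
> > > >          keys[b] = prepend((byte) b, originalKey);
> > > >        }
> > > >        return keys;
> > > >      }
> > > >
> > > >      private static byte[] prepend(byte prefix, byte[] key) {
> > > >        byte[] result = new byte[key.length + 1];
> > > >        result[0] = prefix;
> > > >        System.arraycopy(key, 0, result, 1, key.length);
> > > >        return result;
> > > >      }
> > > >    }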
> > >
> >
>
