hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Splitting an existing table with new keys.
Date Tue, 19 Aug 2014 20:49:54 GMT
Shahab:
How does your application deal with a KeyValue whose value is empty?

Can you insert rows with empty values whose keys correspond to the splits?

Cheers
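
For reference, the keys that "correspond to the splits" can be built without any HBase classes. This is only a sketch under assumptions: the data source ID is taken to be a String, the load time a long, and each split differs only in the first hex character of the GUID. The class and method names (SplitKeys, build) are hypothetical. The layout mirrors what Bytes.toBytes produces for a String (UTF-8) and a long (8 bytes, big-endian).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: builds one row-key prefix per GUID prefix character.
final class SplitKeys {

    // Returns dataSourceId + loadTime + prefix for every character in 'prefixes'.
    // Assumes dataSourceId is a String and each prefix is a single ASCII character.
    static byte[][] build(String dataSourceId, long loadTime, String prefixes) {
        byte[] source = dataSourceId.getBytes(StandardCharsets.UTF_8);
        byte[][] keys = new byte[prefixes.length()][];
        for (int i = 0; i < keys.length; i++) {
            ByteBuffer buf = ByteBuffer.allocate(source.length + Long.BYTES + 1);
            buf.put(source);
            buf.putLong(loadTime);               // 8 bytes, big-endian
            buf.put((byte) prefixes.charAt(i));  // first character of the GUID
            keys[i] = buf.array();
        }
        return keys;
    }

    public static void main(String[] args) {
        byte[][] keys = build("ds", 1408481394000L, "0123456789abcdef");
        System.out.println(keys.length + " keys of " + keys[0].length + " bytes each");
    }
}
```

The same byte arrays could serve either as split points or as the row keys of the empty placeholder rows suggested above.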


On Tue, Aug 19, 2014 at 1:29 PM, Shahab Yunus <shahab.yunus@gmail.com>
wrote:

> So the situation here is that we are trying to bulk load data in to a
> table. But each load of data has such range of keys that it will go to a
> specific continuous chunk of the region servers.
>
> In other words, each bulk load causes hot-spotting, not at the end of the
> key range as in the conventional case, but anywhere within the row-key
> range of our table.
>
> Please note that the split point that I am trying to split on does not
> exist in the table yet. I am trying to prepare the existing table with
> data, by splitting into regions into which I will then bulk import my new
> data, to avoid hotspotting on one region server.
>
> The proof-of-concept code is below. Trying to split data into 16 regions
> ('0' to 'f' of the guid since each row in this current load shares the same
> value for the first 2 fields of the row key).
>
> Key is:
> data_source + time-in-long + 32-bytes-random-guid
>
> /*****/
>
> byte[][] splits = new byte[16][];
> byte[] dataSourceId = Bytes.toBytes(dataSource.getDataSourceID());
> byte[] loadTime = Bytes.toBytes(batchLoadTime);
> byte[] guidPrefix = null;
>
>   for(int i=0; i<splitPointsPrefixes.length; i++)  {
>
>    guidPrefix = Bytes.toBytes(splitPointsPrefixes[i]);
>    splits[i] = new byte[dataSourceId.length + loadTime.length + guidPrefix.length];
>    ByteBuffer splitBuffer = ByteBuffer.wrap(splits[i]);
>    splitBuffer.put(dataSourceId);
>    splitBuffer.put(loadTime);
>    splitBuffer.put(guidPrefix);
> }
>
> byte[] tableNameInBytes = Bytes.toBytes(tableName);
> HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create(getConf()));
>
> for(byte[] split : splits)  {
>    // This is asynchronous. Should I wait here after each split before
>    // moving on to the next one?
>    admin.split(tableNameInBytes, split);
> }
> /*****/
>
> Regards,
> Shahab
>
>
> On Tue, Aug 19, 2014 at 4:13 PM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
> > Hi Shahab,
> >
> > Can you share your code? It seems that the RS you reached did not have
> > the expected region. What is your table's status in the web interface?
> >
> > JM
> >
> >
> > 2014-08-19 16:11 GMT-04:00 Shahab Yunus <shahab.yunus@gmail.com>:
> >
> > > I have a table already created and with some data. I want to split it
> > > through code using the HBaseAdmin API into multiple regions, while
> > > specifying keys that do not exist in the table.
> > >
> > > I am getting the exception below which makes sense because the key
> > > doesn't exist yet. But at the time of creation of the table we can
> > > indeed pre-split it using keys that don't exist.
> > >
> > > Is it possible to do it for table that already exists and has data?
> > >
> > > Caused by:
> > > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException):
> > > org.apache.hadoop.hbase.NotServingRegionException:
> > >
> > >
> > > Using Hbase: 0.98.1-cdh5.1.0
> > >
> > > Thanks a lot.
> > >
> > > Regards,
> > > Shahab
> > >
> >
>
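
On the question of split keys that are not present in the table: as far as I understand, a split point only has to fall inside the key range of the region being split, so it does not need to match an existing row. The check can be sketched in plain Java as below. The class and method names (RegionBounds, compare, canSplitAt) are hypothetical and this is not HBase's own code. It assumes row keys are ordered as unsigned bytes, lexicographically, and that an empty end key means "end of table".

```java
// Hypothetical helper, not part of HBase: tests whether a key is a usable split point
// for a region covering [startKey, endKey).
final class RegionBounds {

    // Unsigned lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True when splitPoint is strictly after startKey and strictly before endKey.
    // An empty endKey is treated as "no upper bound".
    static boolean canSplitAt(byte[] startKey, byte[] endKey, byte[] splitPoint) {
        boolean afterStart = compare(splitPoint, startKey) > 0;
        boolean beforeEnd = endKey.length == 0 || compare(splitPoint, endKey) < 0;
        return afterStart && beforeEnd;
    }
}
```

If the splits are issued back to back, each later split point has to be matched against whichever region holds it at that moment, which is one reason waiting between asynchronous splits may matter.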
