hbase-user mailing list archives

From Mohammad Tariq <donta...@gmail.com>
Subject Re: Rowkey design
Date Mon, 30 Nov 2015 23:21:35 GMT
Hi Marko,

Scan expects complete start and end row keys, IIRC. The order would get
disturbed anyway, as you are salting your keys.
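
To make the point about ordering concrete, here is a minimal sketch in plain Python (no HBase client involved). It assumes the salt-productId-timestamp layout from the quoted message, a salt derived from a hash of the product id, and a zero-padded millisecond timestamp. The bucket count and the names salt_for, row_key, scan_range and rows_in_range are hypothetical, chosen only for illustration.

```python
import hashlib

NUM_BUCKETS = 4  # hypothetical bucket count, not taken from the thread


def salt_for(product_id: str, buckets: int = NUM_BUCKETS) -> int:
    """Derive a stable salt from the product id so it can be recomputed at read time."""
    digest = hashlib.md5(product_id.encode("utf-8")).digest()
    return digest[0] % buckets


def row_key(product_id: str, timestamp_ms: int, buckets: int = NUM_BUCKETS) -> bytes:
    """Build salt-productId-timestamp; zero-padding keeps byte order equal to numeric order."""
    salt = salt_for(product_id, buckets)
    return f"{salt:02d}-{product_id}-{timestamp_ms:013d}".encode("utf-8")


def scan_range(product_id: str, start_ms: int, end_ms: int, buckets: int = NUM_BUCKETS):
    """Complete start (inclusive) and stop (exclusive) keys for one product's time window."""
    return row_key(product_id, start_ms, buckets), row_key(product_id, end_ms, buckets)


def rows_in_range(sorted_keys, start_row: bytes, stop_row: bytes):
    """Mimic a range scan over lexicographically sorted row keys."""
    return [k for k in sorted_keys if start_row <= k < stop_row]


if __name__ == "__main__":
    # Rows are stored sorted by key, so the salt prefix decides where each product lands.
    keys = sorted(row_key(p, t) for p in ("p1", "p2", "p3") for t in (1000, 2000, 3000))
    start, stop = scan_range("p2", 1000, 3000)
    for key in rows_in_range(keys, start, stop):
        print(key.decode())
```

With this layout a time window for one known product is a single range, because the salt can be recomputed from the product id. A time window across all products is not one contiguous range, since the salt and the product id both precede the timestamp in the key.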


Tariq, Mohammad
about.me/mti


On Mon, Nov 30, 2015 at 1:19 PM, Marko Dinic <hacker.marko@gmail.com> wrote:

> Hi Ted,
>
> Thank you for that information. Do you have some other suggestion, perhaps?
>
> Best regards,
> Marko
>
> On Monday, November 30, 2015, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > bq. duplicate data to two different tables, one with
> > (salt-productId-timestamp)
> > and other with (salt-productId-place) keys
> >
> > I suggest thinking twice about the above schema. It may become tricky
> > to keep the data in the two tables in sync.
> > Meaning, when an update to table1 succeeds but the update to table2
> > fails, you need to take additional action: either retry the write to
> > table2 or roll back the update to table1.
> >
> > Cheers
> >
> > On Sun, Nov 29, 2015 at 2:19 PM, Marko Dinic <hacker.marko@gmail.com> wrote:
> >
> > > Hello, everyone!
> > >
> > > I'm new to HBase and I need help designing rowkeys for a use case that
> > > looks like this:
> > >
> > > - Products are listed, where each product has a product id.
> > > - Each product has a timestamp.
> > > - Each product is created in a certain place (e.g. city).
> > > - Each product is created by some unit (e.g. factory).
> > >
> > > I would like to be able to scan products from a certain time period,
> > > from a certain place, or from a certain unit.
> > >
> > > I read about salting to avoid hot-spotting and I understand that rows
> > > are sequential by rowkey. This will allow me to scan for a certain time
> > > period using the following rowkey:
> > >
> > > salt-productId-timestamp
> > >
> > > And I can specify the period using STARTROW, ENDROW.
> > >
> > > What confuses me is how to include place (and maybe unit) in the key
> > > and still be able to select products from a certain place during a
> > > certain time period.
> > >
> > > If I limit myself to scanning by one of the above (time range OR
> > > place), I have an idea to duplicate the data into two different tables,
> > > one with (salt-productId-timestamp) keys and the other with
> > > (salt-productId-place) keys. Is that recommended or not?
> > >
> > > So, how to construct my keys?
> > >
> > > I should emphasize that I need this data to be the input to a MapReduce job.
> > >
> > > Any help is greatly appreciated.
> > >
> > > --
> > > Best regards,
> > > Marko
> > >
> >
>
>
> --
> Marko Dinic
>
