hbase-user mailing list archives

From Greg Pelly <gfpe...@gmail.com>
Subject Re: Combining Multiple queries in Hbase
Date Tue, 06 Dec 2011 09:51:32 GMT

The book describes this well: http://ofps.oreilly.com/titles/9781449396107/

I'm just new to HBase so I don't want to say too much, though I probably
will. My answer to your main question ("I want the output from all these
queries in descending order as stored in Hbase") would be that you just
need to have a part of the row key with a descending value. I have the
following key structure.


where negtimestamp = Long.MAX_VALUE - the current time in milliseconds.

My design (or more accurately the book's key design) allows me to get all
messages for a user, with the negative timestamp allowing them to be listed
in reverse date order and the timestamp allowing responses to the original
message to be listed in the same space.
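
To make the ordering concrete, here is a minimal sketch in plain Java, with
no HBase client involved. The layout <userid>-<negtimestamp>-<messageid> is
an assumption for illustration only, not necessarily the exact key above,
and the class and method names are made up. A TreeSet stands in for HBase's
sorted row keys. The reversed timestamp is zero-padded so that string order
matches numeric order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

final class ReverseTimeKeys {
    // Hypothetical layout for illustration: <userid>-<negtimestamp>-<messageid>
    static String rowKey(String userId, long millis, String messageId) {
        long negTimestamp = Long.MAX_VALUE - millis;
        // Zero-pad to 19 digits (the width of Long.MAX_VALUE) so that
        // lexicographic order agrees with numeric order.
        return String.format(Locale.ROOT, "%s-%019d-%s", userId, negTimestamp, messageId);
    }

    // Emulates a prefix scan over a sorted table: every key starting with "<userid>-".
    static List<String> scanUser(TreeSet<String> table, String userId) {
        String start = userId + "-";
        String stop = userId + "."; // '.' is the character right after '-' in ASCII
        return new ArrayList<>(table.subSet(start, true, stop, false));
    }

    public static void main(String[] args) {
        TreeSet<String> table = new TreeSet<>();
        table.add(rowKey("alice", 1000L, "m1"));
        table.add(rowKey("alice", 3000L, "m3"));
        table.add(rowKey("alice", 2000L, "m2"));
        table.add(rowKey("bob", 2500L, "m9"));
        // Keys come back in sorted order, which is newest message first.
        for (String key : scanUser(table, "alice")) {
            System.out.println(key);
        }
    }
}
```

Because a later time gives a smaller negtimestamp, the newest row sorts first
within each user's key range, so a plain forward scan returns messages in
reverse date order.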

Note that every message is duplicated for every user that has access to the
message. That may sound inefficient, but we're not dealing with a
relational database here, so it doesn't matter how big the table gets. If
we're successful enough that we need to buy more hardware, then we won't be
worried. When I was doing schema design for my app, I read on the web that
the design's main driver should be how the data needs to be retrieved. That
made my decision for me: if I have all the messages listed for a user in
the one spot, then the system just grabs them. If it needed to scan through
for all the messages that I have access to based on my account as well as
the groups I'm in, that would mean messy code and many more scans.
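
The write-time duplication can be sketched like this, again in plain Java
with a TreeMap standing in for the table. The class name, method names and
the <userid>-<negtimestamp>-<messageid> layout are assumptions made for the
example, not HBase API calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

final class FanOutInbox {
    // Stand-in for a sorted HBase table: row key -> message body.
    private final TreeMap<String, String> table = new TreeMap<>();

    // Hypothetical layout: <userid>-<negtimestamp>-<messageid>
    private static String rowKey(String userId, long millis, String messageId) {
        return String.format(Locale.ROOT, "%s-%019d-%s",
                userId, Long.MAX_VALUE - millis, messageId);
    }

    // Write one copy of the message per recipient (the duplication described above).
    void deliver(List<String> recipients, long millis, String messageId, String body) {
        for (String userId : recipients) {
            table.put(rowKey(userId, millis, messageId), body);
        }
    }

    // A single contiguous range read returns the user's messages, newest first.
    List<String> inbox(String userId) {
        // '.' is the character right after '-' in ASCII, so this bounds the prefix.
        return new ArrayList<>(
                table.subMap(userId + "-", true, userId + ".", false).values());
    }

    // Total number of stored rows, including the per-recipient copies.
    int rowCount() {
        return table.size();
    }
}
```

The cost is one row per recipient on every write. In exchange, reading an
inbox is one range read with no filtering and no merging across groups.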

You seem to want to be able to get a specific message. I don't need to do
that, but I can think of a number of ways of doing it. I haven't looked
much into secondary indexes, which are also covered in the book, but they
may be your answer. You could also have a separate table holding just the
messages themselves, with only the <messageid> part of the row key as the
key.
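
As a rough sketch of that separate-table idea, a map keyed by message id
can stand in for the second table. The class and method names here are
invented for the example and this is not HBase client code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class MessageLookup {
    // Stand-in for a second table whose row key is just the message id.
    private final Map<String, String> messagesById = new HashMap<>();

    // Store the message body under its id.
    void put(String messageId, String body) {
        messagesById.put(messageId, body);
    }

    // Point lookup by id; empty if no such message was stored.
    Optional<String> get(String messageId) {
        return Optional.ofNullable(messagesById.get(messageId));
    }
}
```

A message would then be written twice, once per user in the inbox table and
once here, so that a known id can be fetched directly without a scan.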

I also don't need to scan for messages for multiple userids. I don't see
where that would be necessary, but I don't know your app. If you wanted all
messages, I'd just have an admin user with membership of all groups, so
that account gets sent all messages.

Sorry if I'm providing more questions than answers.


On Tue, Dec 6, 2011 at 3:59 PM, Stuti Awasthi <stutiawasthi@hcl.com> wrote:

> Hi All,
> Any suggestions on this?
> -----Original Message-----
> From: Stuti Awasthi
> Sent: Monday, December 05, 2011 4:55 PM
> To: hbase-user@hadoop.apache.org
> Subject: Combining Multiple queries in Hbase
> Hi all,
> I have some query related to combining multiple queries in Hbase.
> I want my data to be stored in decreasing order i.e latest entry should
> appear first in Hbase:
> Schema :
> Row Key                                 CF
> Timestamp-userid-<public/private>               Info
> Now I have to get data from 3 queries on this table :
> 1st Query :     *-myuserid-*
> 2nd Query:      *-X no of userids- *
> 3rd Query:      Full Row Key
> For the first Query, I can use RowFilter and get all the rows based on
> myuserid.
> I have a problem with the remaining 2 queries:
> For the second query, I will have, say, 1000 userids, and I want each
> row for these userids. Similarly, for the 3rd query, I will have, say,
> 1000 rowkey values, and I want the data where these rowkeys are present.
> Now the main part: I want the output from all these queries in descending
> order as stored in Hbase.
> I have tried with FilterList, but issue with that is for 1000 userid, I
> will have to put 1000 filter object in FilterList. I am sure there should
> be some other way to solve this issue.
> The other problem is that if I fire 3 separate queries, the output of each
> will be in its own timestamp order, and then I have to manually arrange
> the combined results in timestamp order.
> How can I deal with this problem efficiently?
> Please suggest.
> Regards,
> Stuti Awasthi
> HCL Comnet Systems and Services Ltd
> F-8/9 Basement, Sec-3,Noida.
