hbase-user mailing list archives

From Piyush Goel <piyushgoe...@gmail.com>
Subject help needed with base schema
Date Mon, 13 Jul 2009 10:31:01 GMT
Hi,

I am trying to design a high scale key value storage system. The hbase table
for the same is outlined below:

{
  "userid1" : {
    "update" : {
        t3 : "some update1",
        t2 : "some update2",
        t1 : "some update3"
    },
    "sender" : {
        t3 : "sender3",
        t2 : "sender2",
        t1 : "sender1"
    }
  },

  "userid2" : {
    "update" : {
        t9 : "some update9",
        t6 : "some update534",
        t1 : "some update343"
    },
    "sender" : {
        t9 : "sender3",
        t6 : "sender2",
        t1 : "sender1"
    }
  }
}

The system will have around 15-20M users and around 3-4M put (write)
operations per day (which rules out mysql automatically). The maximum number
of entries in each of the "update" and "sender" columns will be around 1000
(around one week's updates).
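
For a rough sense of scale, the figures above can be turned into an average
write rate and an upper bound on stored cells. This is only back-of-envelope
arithmetic: the function names are made up for illustration, the rate is a
daily average (peak load is not stated above), and the cell count assumes
every user reaches the 1000-entry limit in both columns.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_puts_per_second(puts_per_day: int) -> float:
    """Daily put volume spread evenly over one day (ignores peaks)."""
    return puts_per_day / SECONDS_PER_DAY


def max_cells(users: int, entries_per_column: int, columns: int = 2) -> int:
    """Upper bound on cells if every user fills every column to the limit."""
    return users * entries_per_column * columns


if __name__ == "__main__":
    low = average_puts_per_second(3_000_000)
    high = average_puts_per_second(4_000_000)
    print(f"average puts/sec: {low:.1f} to {high:.1f}")
    print(f"max cells: {max_cells(15_000_000, 1000):,} "
          f"to {max_cells(20_000_000, 1000):,}")
```

Under these assumptions the average comes to a few dozen puts per second,
and the stored data tops out in the tens of billions of cells.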

My queries would be like "For a given userid, return the top 20 (most recent)
updates and their senders, ordered by timestamp". Is there a way to build a
secondary index on "userid, timestamp" that can help speed up my "get" calls?
Or how can I change my schema design to minimize the response time of get
calls?
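
One pattern often described for this kind of "latest N per user" query is to
fold the timestamp into the row key in reversed form, so that a user's newest
entries sort first and a short prefix scan returns them. The sketch below is
only an in-memory model of that idea, not HBase client code: SortedTable,
row_key and latest are hypothetical names, the table is a plain sorted list
of byte keys, and it assumes user ids never contain a 0x00 byte and that
timestamps are non-negative integers.

```python
import bisect
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit value


def row_key(user_id: str, ts: int) -> bytes:
    """User id, a 0x00 separator, then (MAX_TS - ts) as 8 big-endian bytes.

    Subtracting from MAX_TS makes newer timestamps sort before older ones.
    """
    return user_id.encode("utf-8") + b"\x00" + struct.pack(">Q", MAX_TS - ts)


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by key bytes."""

    def __init__(self):
        self._keys = []   # sorted list of row keys
        self._rows = {}   # row key -> value

    def put(self, key: bytes, value) -> None:
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan_prefix(self, prefix: bytes, limit: int) -> list:
        """Return up to `limit` values whose keys start with `prefix`."""
        start = bisect.bisect_left(self._keys, prefix)
        out = []
        for key in self._keys[start:]:
            if len(out) == limit or not key.startswith(prefix):
                break
            out.append(self._rows[key])
        return out


def latest(table: SortedTable, user_id: str, n: int = 20) -> list:
    """The n most recent (update, sender) pairs for one user, newest first."""
    return table.scan_prefix(user_id.encode("utf-8") + b"\x00", n)


if __name__ == "__main__":
    table = SortedTable()
    for ts, update, sender in [(1, "some update3", "sender1"),
                               (2, "some update2", "sender2"),
                               (3, "some update1", "sender3")]:
        table.put(row_key("userid1", ts), (update, sender))
    print(latest(table, "userid1", n=2))
```

In this model each update and its sender live in the same row, so one scan
per user returns both; whether that layout suits the real table depends on
details not given above.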


Regards,

Piyush Goel
Software Engineer
Yahoo! Software Development India Pvt. Ltd.
Bangalore, India
Ph : +91 80 66949816 (O)
            9980616752  (M)

If you're not failing every now and again, it's a sign you're not doing
anything very innovative.  - Woody Allen
