hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: [Schema] Put or Increment ?
Date Tue, 25 Sep 2012 17:47:13 GMT
Increment is slightly more expensive, since the RegionServer executing the Increment needs
to retrieve the old value(s) first (while holding the row lock).

-- Lars
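
The cost difference described above can be sketched with a toy in-memory model. This is not HBase code and does not use the HBase client API. The class ToyRow, its row lock, and the reads counter are hypothetical names invented for illustration. The model assumes one row per uid with one cell per IP, as in the schema from the original question. It only shows that a blind put never looks at the old value, while an increment has to fetch it under the row lock before writing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of a single row (one uid) whose column qualifiers are IPs. Not HBase code. */
class ToyRow {
    // qualifier (IP address) -> stored value
    private final Map<String, Long> cells = new HashMap<>();
    // stands in for the per-row lock mentioned in the reply above
    private final ReentrantLock rowLock = new ReentrantLock();
    // number of times an old value had to be fetched before writing
    long reads = 0;

    /** Blind write: overwrite the cell without reading the old value. */
    void put(String ip, long value) {
        rowLock.lock();
        try {
            cells.put(ip, value);
        } finally {
            rowLock.unlock();
        }
    }

    /** Read-modify-write: fetch the old value under the row lock, then store old + delta. */
    long increment(String ip, long delta) {
        rowLock.lock();
        try {
            reads++; // the extra step a blind put does not need
            long updated = cells.getOrDefault(ip, 0L) + delta;
            cells.put(ip, updated);
            return updated;
        } finally {
            rowLock.unlock();
        }
    }

    /** Number of distinct IPs seen for this user. */
    int distinctIps() {
        return cells.size();
    }

    /** Stored value for an IP, or null if the IP was never written. */
    Long get(String ip) {
        return cells.get(ip);
    }
}
```

In this model both approaches yield the same set of distinct IPs. Only increment keeps a per-IP count, and it pays for that with one read of the old value per call.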

----- Original Message -----
From: Shrijeet Paliwal <shrijeet@rocketfuel.com>
To: user@hbase.apache.org
Sent: Tuesday, September 25, 2012 10:02 AM
Subject: Re: [Schema] Put or Increment ?

On Tue, Sep 25, 2012 at 9:56 AM, Pamecha, Abhishek <apamecha@x.com> wrote:

> Hi Shrijeet
> What's your use case? That should drive your decision. Put will overwrite
> the existing cell if the user ID and IP address are the same. Increment
> would just bump up the counter.

#1 Keep a list of distinct IPs
#2 Counts per IP (only if it comes cheap)
#3 Do blind writes (instead of read-modify-write)

Given #3, overwrite is okay. My question is about #2: if the cost is
trivial, I will use Increment.

> -abhishek
> -----Original Message-----
> From: Shrijeet Paliwal [mailto:shrijeet@rocketfuel.com]
> Sent: Tuesday, September 25, 2012 9:35 AM
> To: user@hbase.apache.org
> Subject: [Schema] Put or Increment ?
> Hi,
> Suppose I am tracking user activity by storing his IP each time he hits
> the web service. The row id will be uid of user and column qualifiers will
> be IPs themselves. I am contemplating whether to use a Put or Increment API.
> The must-have requirement is the set of distinct IPs associated with the user.
> It would be good to have a count of visits per IP, but not having the count is OK
> too (if it's expensive). Please help me compare the performance of Increment
> vs Put in this context. Will I see better throughput using one over the other?
> Better space utilization? What else?
> -Shrijeet
