hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Search query in Hbase
Date Wed, 24 Aug 2011 06:16:11 GMT
Hi Stuti,

One of the main design tasks in HBase is to structure the key space correctly.
HBase does not maintain tables in the relational sense but keeps *sorted* tuples of the form:
(row-key, column family name, column name, timestamp, value)

A table is nothing more than a key-space isolation.
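To make the sorted-tuple idea concrete, here is a minimal sketch in plain Python. It is only a toy model, not HBase code: the sample users, the "info" family, and the constant timestamp are made up for illustration, and real HBase compares keys as bytes and orders timestamps newest first, which this sketch ignores.

```python
# Toy model of an HBase table: a sorted list of cells.
# Each cell is (row_key, column_family, column_name, timestamp, value).
# All sample data below is hypothetical.
cells = [
    ("stuti", "info", "email", 1, "stuti@example.com"),
    ("lars", "info", "name", 1, "Lars"),
    ("stuti", "info", "name", 1, "Stuti"),
    ("lars", "info", "email", 1, "lars@example.com"),
]

# Sorting the tuples orders them by row key first, then by family,
# column name, and timestamp, so all cells of one row end up adjacent.
table = sorted(cells)

for cell in table:
    print(cell)
```

Because the cells of one row key sit next to each other, finding a user never requires looking at unrelated rows.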

So in order to find your information quickly, make user-name the row-key (or at least the
prefix of the row-key).
That is, store your information as (<user name>, <column family>, "email", ts, <email>).
The ts is system generated by default.

If each username can only exist exactly once (as you seem to imply) you can then simply
Get (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html) the "row" by
the row-key (i.e. the username).
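The existence check can be sketched as follows, again in plain Python rather than against the HBase client. The sorted list stands in for the table, and `user_exists` and `put_if_absent` are hypothetical helper names. Note that this sketch is not safe against concurrent writers; it only shows why a lookup by row-key needs no full scan.

```python
import bisect

# Hypothetical sorted list of row keys (user names), standing in for the table.
row_keys = sorted(["alice", "bob", "stuti"])

def user_exists(keys, user_name):
    """Binary search on the sorted keys; no full scan is needed."""
    i = bisect.bisect_left(keys, user_name)
    return i < len(keys) and keys[i] == user_name

def put_if_absent(keys, user_name):
    """Insert the user only if the name is not taken; return True if inserted."""
    if user_exists(keys, user_name):
        return False
    bisect.insort(keys, user_name)  # keeps the list sorted
    return True
```

For example, `put_if_absent(row_keys, "carol")` inserts the name on the first call and returns False on a second call with the same name.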

If not, you Scan (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html)
starting with user-name as the start-key, and continue to scan until the user name changes.
Alternatively you can set an end-key.

Both will be very efficient (because the data is sorted by what you are looking for).
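Both scan variants can be sketched on the same toy model. The composite key layout "<user name>|<suffix>" and the "|" separator are assumptions made for this example only, as are the function names. The end-key is treated as exclusive here.

```python
import bisect

# Hypothetical composite row keys "<user name>|<suffix>", kept sorted.
row_keys = sorted(["alice|1", "bob|1", "bob|2", "bobby|1", "carol|1"])

def scan_prefix(keys, prefix):
    """Start at the first key >= prefix and stop once the prefix changes."""
    result = []
    start = bisect.bisect_left(keys, prefix)
    for key in keys[start:]:
        if not key.startswith(prefix):
            break  # user name changed, nothing further can match
        result.append(key)
    return result

def scan_range(keys, start_key, end_key):
    """Alternative: return the keys in the half-open range [start_key, end_key)."""
    lo = bisect.bisect_left(keys, start_key)
    hi = bisect.bisect_left(keys, end_key)
    return keys[lo:hi]
```

Including the separator in the prefix ("bob|" rather than "bob") keeps rows of the user "bobby" out of the result. For the range variant, an end-key such as "bob}" works because "}" is the character immediately after "|".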

Does this help?

-- Lars

From: Stuti Awasthi <stutiawasthi@hcl.com>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Sent: Tuesday, August 23, 2011 10:36 PM
Subject: Search query in Hbase

Hi Friends,

I was wondering what the possible solutions are for various search options.

For example:

In the user table we store the name and email of each user. I want to check whether a user
name already exists, and if it does, I will not put it in the database.

User table: info:name
            info:email

A possible solution:

1) I scan through all the user rows to get the name and apply logic to match the new name
against the existing names.
Personally I do not like this approach, as for a huge user set it is quite inefficient.

Is there some other way to perform this in HBase?


