hbase-user mailing list archives

From "Hegner, Travis" <THeg...@trilliumit.com>
Subject RE: How to search and make indexes in ColumnFamilies with unknown columns ?
Date Thu, 24 Jun 2010 13:58:20 GMT
I'm not an expert by any means, but I wonder if you could store the course name/type as the
column name, and some arbitrary but useful value as the value. For example:

Student_Courses  // Table Name
     Student:   // Column Family
          ID => 12345678
          Name => John Smith

     Courses:   // Column Family with any number of columns:
         Maths => 2010_Fall
         Computer => 2011_Spring
         Science => 2011_Spring

The API may be better suited to handle filtering by column name, rather than value, but as
I said, I'm no expert, and I have very little experience filtering via the API.

Assuming the filter works correctly, you could simply ignore the retrieved value if it isn't
needed. Be careful about storing too large a value, though, as that could affect performance.
This is one of the beauties of a column-oriented schema: you can store useful, valuable information
as a column name.
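
To make the idea concrete, here is a rough sketch in plain Python. It is not the HBase client API, just nested dicts standing in for the table, and the row keys and the second student are made up for illustration. It shows that once the course name is the column name, finding students who take a course becomes a check on column names rather than a search through values:

```python
# Toy in-memory model of the Student_Courses table (NOT the HBase API).
# Layout: row key -> column family -> column name -> value.
# Row keys and the second student are hypothetical, added for illustration.
table = {
    "12345678": {
        "Student": {"ID": "12345678", "Name": "John Smith"},
        "Courses": {
            "Maths": "2010_Fall",
            "Computer": "2011_Spring",
            "Science": "2011_Spring",
        },
    },
    "87654321": {
        "Student": {"ID": "87654321", "Name": "Jane Doe"},
        "Courses": {"Science": "2010_Fall"},
    },
}


def rows_with_column(table, family, column):
    """Return the row keys that have `column` present in `family`."""
    return [
        row_key
        for row_key, families in table.items()
        if column in families.get(family, {})
    ]


# The stored value (the term) is never inspected; only the column name matters.
maths_rows = rows_with_column(table, "Courses", "Maths")
print(maths_rows)
```

In a real cluster the same question would be put to a scan with a filter on the column name; whether that performs well is something I have not verified.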

I do know that with this type of schema, the columns would be accessed like:

get(<row_id>, "Courses:Maths"[, <version>]);

or something to that effect anyway...
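
As a rough illustration of that access pattern, again using plain Python dicts rather than the real HBase client, a lookup by "family:column" could be modelled like this. The `get` helper, the row key, and the `default` parameter are assumptions made for the sketch, and versions are left out entirely:

```python
# Toy model only: nested dicts stand in for an HBase table (NOT the real API).
# Layout: row key -> column family -> column name -> value.
# The row key "12345678" is a hypothetical choice for illustration.
table = {
    "12345678": {
        "Student": {"ID": "12345678", "Name": "John Smith"},
        "Courses": {"Maths": "2010_Fall", "Computer": "2011_Spring"},
    },
}


def get(table, row_id, column, default=None):
    """Look up a single cell addressed as "family:column".

    Returns `default` when the row, family, or column is missing.
    Versions/timestamps are deliberately not modelled here.
    """
    family, _, qualifier = column.partition(":")
    return table.get(row_id, {}).get(family, {}).get(qualifier, default)


term = get(table, "12345678", "Courses:Maths")
print(term)
```

A missing column simply comes back as the default, which is how you would tell that a student is not enrolled in a given course.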

Hope This Helps, Good Luck!

Travis Hegner

-----Original Message-----
From: SyedShoaib [mailto:shoaib_talib@hotmail.com]
Sent: Thursday, June 24, 2010 8:26 AM
To: hbase-user@hadoop.apache.org
Subject: How to search and make indexes in ColumnFamilies with unknown columns ?


I am new to HBase and have worked with it for just a few days. I have two
questions. Any kind of help is fully appreciated and many thanks in advance.

1) Suppose I have a columnFamily with an unknown number of columns. I want to
search for a value in this columnFamily. That value can be present in any column
of this columnFamily. How will I search for a value in the whole columnFamily? For
further elaboration please consider a simple scenario:

For example: A student can have any number of courses. Schema in HBase could be:

Student_Courses  // Table Name
     Student:   // Column Family

     Courses:   // Column Family with any number of columns:
         Course_1:  Maths
         Course_2:  Computer
         Course_n:  Science

If I want to search for all rows with a value “Maths” in any of the columns
inside columnFamily “Courses:”, what will I do? I can search for any value
through SingleColumnValueFilter by mentioning ColumnFamily and Prefix, e.g.
"Student:Name". But how will I search for a value in the "Courses:" columnFamily,
keeping in mind that I don't know how many columns I have in it?

2) How will I make an index on this columnFamily (“Courses:”)? I know
indexes are made on columns but the columns are unknown in number!  I can
make an index on "Student:Name". But what to do if I want to make a single
index on complete “Courses:” ColumnFamily? Is it possible? It will help me a


View this message in context: http://old.nabble.com/How-to-search-and-make-indexes-in-ColumnFamilies-with-unknown-columns---tp28981932p28981932.html
