hadoop-common-user mailing list archives

From Em <mailformailingli...@yahoo.de>
Subject HBase (BigTable) many to many with students and courses
Date Mon, 28 May 2012 16:51:08 GMT
Hello list,

I have some time now to try out HBase and want to use it for a private
project.

Questions like "How do I transfer one-to-many or many-to-many relations
from my RDBMS schema to HBase?" seem to be common.

I hope we can collect all the best practices that are out there in this
thread.

As the wiki states, one should create two tables: one for students,
another for courses.

In the students table, one adds one column per selected course,
identified by its course_id, alongside the columns that describe the
student (name, birthday, sex etc.).

Conversely, the courses table gets one column per student_id, alongside
the columns that describe the course itself (name, teacher, begin, end,
year, location etc.).
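
A rough sketch of this two-table layout, with plain Java maps standing in
for the tables. This is not HBase client code. The class name, the
"course_" and "student_" column prefixes, and the enroll() helper are
assumptions made for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for the two-table layout; not the HBase API. */
final class SchemaSketch {
    // Each table maps row key -> (column name -> value), kept sorted by key.
    final Map<String, Map<String, String>> students = new TreeMap<>();
    final Map<String, Map<String, String>> courses = new TreeMap<>();

    /** Writes one cell, creating the row on first use. */
    static void put(Map<String, Map<String, String>> table,
                    String rowKey, String column, String value) {
        table.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    /** Records the many-to-many relation on both sides. */
    void enroll(String studentId, String courseId) {
        // Hypothetical prefixes; the column name itself carries the foreign id.
        put(students, studentId, "course_" + courseId, "");
        put(courses, courseId, "student_" + studentId, "");
    }

    public static void main(String[] args) {
        SchemaSketch schema = new SchemaSketch();
        put(schema.students, "s1", "name", "Alice");
        put(schema.courses, "c42", "name", "Databases");
        schema.enroll("s1", "c42");
        System.out.println(schema.students);
        System.out.println(schema.courses);
    }
}
```

Note that the relation is written twice, once per table, so both writes
have to succeed for the two sides to stay consistent.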

So far, so good.

How do I access these tables efficiently?

A common case would be to show all courses per student.

To do so, one has to read the student's row from the students table and
collect all of its course columns.
Let's say the column names are prefixed course ids. One has to strip the
prefix and then access the courses table to get the courses and their
metadata (name, teacher, location etc.).
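
The prefix-stripping step could look roughly like this. It is a minimal
sketch in plain Java that treats a student row as a map from column name
to value. The "course_" prefix and the class and method names are
hypothetical, not taken from the wiki.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CourseIdsSketch {
    // Hypothetical prefix marking the columns that reference a course.
    static final String PREFIX = "course_";

    /** Returns the course ids encoded in a student row's column names. */
    static List<String> courseIds(Map<String, String> studentRow) {
        List<String> ids = new ArrayList<>();
        for (String column : studentRow.keySet()) {
            if (column.startsWith(PREFIX)) {
                ids.add(column.substring(PREFIX.length()));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> row = new TreeMap<>();
        row.put("name", "Alice");
        row.put("course_c17", "");
        row.put("course_c42", "");
        System.out.println(courseIds(row)); // prints [c17, c42]
    }
}
```

The resulting ids are then the row keys to look up in the courses table.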

How do I do this kind of operation efficiently?
The naive, brute-force approach seems to be to use one Get object per
course and fetch the necessary data.
Another approach seems to be to use the HTable class and unleash the
power of "multigets" through its batch() method.
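
To make the difference between the two approaches concrete, here is a toy
model in plain Java that only counts client calls. ToyTable and the two
fetch methods are hypothetical stand-ins, not the HBase API. In the real
client, the batched form would correspond to passing a list of Get
objects to HTable.get(List<Get>) or to batch(). As far as I understand,
the client then groups the requests per region server, so the actual
number of RPCs depends on how the rows are distributed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MultiGetSketch {
    /** Toy table that counts client calls; a stand-in, not the HBase API. */
    static final class ToyTable {
        final Map<String, Map<String, String>> rows = new TreeMap<>();
        int calls = 0;

        void put(String rowKey, String column, String value) {
            rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
        }

        /** Single-row read: one call per row. */
        Map<String, String> get(String rowKey) {
            calls++;
            return rows.get(rowKey);
        }

        /** Multi-row read: one call for the whole list, results in request order. */
        List<Map<String, String>> get(List<String> rowKeys) {
            calls++;
            List<Map<String, String>> results = new ArrayList<>();
            for (String key : rowKeys) {
                results.add(rows.get(key));
            }
            return results;
        }
    }

    /** Naive variant: one get per course id. */
    static List<Map<String, String>> fetchOneByOne(ToyTable courses, List<String> ids) {
        List<Map<String, String>> results = new ArrayList<>();
        for (String id : ids) {
            results.add(courses.get(id));
        }
        return results;
    }

    /** Batched variant: all course ids in a single call. */
    static List<Map<String, String>> fetchBatched(ToyTable courses, List<String> ids) {
        return courses.get(ids);
    }

    public static void main(String[] args) {
        ToyTable courses = new ToyTable();
        courses.put("c17", "name", "Algebra");
        courses.put("c42", "name", "Databases");
        List<String> ids = Arrays.asList("c17", "c42");

        fetchOneByOne(courses, ids);
        System.out.println("calls, one by one: " + courses.calls);
        courses.calls = 0;
        fetchBatched(courses, ids);
        System.out.println("calls, batched: " + courses.calls);
    }
}
```

Both variants return the same rows. Only the number of calls differs,
and it grows with the number of courses in the naive variant.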

All of the above is theoretical, since I have not used it in code yet
(I am currently learning more about the fundamentals of HBase).

That's why I put the question to you: how do you do this kind of
operation with HBase?

Kind regards,
Em
