hbase-user mailing list archives

From Em <mailformailingli...@yahoo.de>
Subject Re: HBase (BigTable) many to many with students and courses
Date Tue, 29 May 2012 14:28:07 GMT

thanks for your help.
Yes, I know these slides.
However, I cannot find an answer to how to access such schemas efficiently.
In the schema for students and courses given in those slides,
each column holds the student's id / course's id.
However, when you build a GUI, you want to get all the courses
for a given student and display their names.
You *have* the column names, which represent the ids of the courses,
but to get the human-readable name of a course, you have to access
the course table.

I understand the schema, agree with it, but my question was how to
access this data efficiently within an application / how to implement
the needed behaviour efficiently.
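
To make the access pattern concrete, here is a minimal sketch of the two steps in plain Java: strip the prefix from the course columns of a student row, then resolve all course ids in a single batched call instead of one lookup per course. It does not use the HBase client; the `course:` prefix, the `CourseTable` interface and the sample ids and names are assumptions made up for illustration. In a real application the batched call would presumably be a multiget, i.e. a list of Get objects passed to HTable in one request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CourseLookupSketch {
    // Hypothetical prefix marking course columns in a student row.
    static final String COURSE_PREFIX = "course:";

    // Stand-in for the courses table: one call resolves many row keys at once.
    interface CourseTable {
        Map<String, String> namesFor(List<String> courseIds);
    }

    // Step 1: keep only the course columns and remove the prefix.
    static List<String> courseIds(Iterable<String> qualifiers) {
        List<String> ids = new ArrayList<>();
        for (String qualifier : qualifiers) {
            if (qualifier.startsWith(COURSE_PREFIX)) {
                ids.add(qualifier.substring(COURSE_PREFIX.length()));
            }
        }
        return ids;
    }

    // Step 2: a single batched lookup for all ids, not one lookup per course.
    static List<String> courseNames(Iterable<String> qualifiers, CourseTable courses) {
        List<String> ids = courseIds(qualifiers);
        Map<String, String> names = courses.namesFor(ids);
        List<String> result = new ArrayList<>();
        for (String id : ids) {
            result.add(names.getOrDefault(id, "(unknown course " + id + ")"));
        }
        return result;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the courses table (invented sample data).
        Map<String, String> courseRows = new LinkedHashMap<>();
        courseRows.put("c101", "Algebra");
        courseRows.put("c205", "Databases");

        int[] roundTrips = {0}; // counts calls to the simulated table
        CourseTable table = ids -> {
            roundTrips[0]++;
            Map<String, String> found = new LinkedHashMap<>();
            for (String id : ids) {
                String name = courseRows.get(id);
                if (name != null) {
                    found.put(id, name);
                }
            }
            return found;
        };

        // Column names of one student row, as the schema describes them.
        List<String> studentColumns =
                Arrays.asList("name", "birthday", "course:c101", "course:c205");
        System.out.println(courseNames(studentColumns, table));
        System.out.println("round trips: " + roundTrips[0]);
    }
}
```

The point of the sketch is only that the number of round trips to the courses table stays at one, no matter how many courses the student has.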

Thanks! :)

On 29.05.2012 12:49, shashwat shriparv wrote:
> Check out this link; maybe it will help you somewhat:
> http://www.slideshare.net/hmisty/20090713-hbase-schema-design-case-studies
> On Tue, May 29, 2012 at 4:09 PM, Michel Segel <michael_segel@hotmail.com> wrote:
>> Depends...
>> Try looking at a hierarchical model rather than a relational model...
>> One thing to remember is that joins are expensive in HBase.
>> Sent from a remote device. Please excuse any typos...
>> Mike Segel
>> On May 28, 2012, at 12:50 PM, Em <mailformailinglists@yahoo.de> wrote:
>>> Hello list,
>>> I have some time now to try out HBase and want to use it for a private
>>> project.
>>> Questions like "How do I transfer one-to-many or many-to-many relations
>>> from my RDBMS's schema to HBase?" seem to be common.
>>> I hope we can throw all the best practices that are out there in this
>>> thread.
>>> As the wiki states:
>>> One should create two tables.
>>> One for students, another for courses.
>>> Within the students' table, one should add one column per selected
>>> course with the course_id besides some columns for the student itself
>>> (name, birthday, sex etc.).
>>> On the other hand one fills the courses table with one column per
>>> student_id besides some columns which describe the course itself (name,
>>> teacher, begin, end, year, location etc.).
>>> So far, so good.
>>> How do I access these tables efficiently?
>>> A common case would be to show all courses per student.
>>> To do so, one has to access the student-table and get all the student's
>>> courses-columns.
>>> Let's say their names are prefixed ids. One has to remove the prefix and
>>> then one accesses the courses-table to get all the courses and their
>>> metadata (name, teacher, location etc.).
>>> How do I do this kind of operation efficiently?
>>> The naive, brute-force approach seems to be to use one Get object per
>>> course and fetch the necessary data.
>>> Another approach seems to be to use the HTable class and unleash the
>>> power of "multigets" via the batch() method.
>>> All of the information above is theoretical, since I have not used it
>>> in code yet (I am currently learning more about the fundamentals of HBase).
>>> That's why I am passing the question on to you: How do you do this kind of
>>> operation with HBase?
>>> Kind regards,
>>> Em
