hbase-user mailing list archives

From jatinpreet <jatinpr...@gmail.com>
Subject Re: HBase entity relationship
Date Tue, 25 Nov 2014 05:38:12 GMT
Thanks Wilm,

Let me try to explain my scenario in more detail. Let me talk about two
specific entities, Jobs and Sources.

*Source-* A URL that is the source of some data. It also contains other
meta-info like description, type etc. So, the required columns are
source_name, url, description, type.

*Job-* An independent entity created with data from the selected sources.
Apart from job information, we need to keep track of which sources were
selected for this job, and this list is editable, hence additions/removals
are possible. The columns needed in job are job_name, description,
source_{source-rowkey} and so on.

I was considering the following options:

1. Create a JSON of each source and dump it into the value field of a
source_{timestamp} column in the job row. But I need to be able to list all
of the available sources before creating a job. This would mean scanning all
jobs and finding just the unique sources from all the lists. This seems like
overkill. Another problem with this approach is that I would have to write
my own custom filters if I need to filter jobs on the basis of source.

2. Create a new table for sources and keep the rowkeys of the sources in job
rows. This turns out to be somewhat like foreign keys though, which
understandably is awkward for HBase. But now I have the option of scanning
the sources table for listing purposes.

And this is where my question originated. When I need to fetch sources for a
particular job, I could just filter them based on a job key column in the
source table. This would mean a long scan over all rows of the source table.
Another option is to fetch the list of source rowkeys from the job row and
then directly hit the source table for these specific rowkeys. If this
option holds up, which of the above methods is more prudent?

This example might not seem to be based on huge data, but I do expect
millions of jobs to be created. Also, this is a common pattern which I need
to implement in other parts of my HBase tables too.

Thanks,
Jatin
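
To make the second lookup in option 2 concrete (read the source rowkeys from
the job row, then fetch exactly those rows from the source table), here is a
rough sketch. It is a plain in-memory model using Python dicts, not the HBase
client API; the function names, the sample rowkeys (s1, j1, ...) and the
empty cell values are assumptions made up for illustration. Only the
source_{source-rowkey} column naming comes from the description above.

```python
from typing import Dict, List

# Each "table" is modeled as {rowkey: {column: value}}.
# This is a stand-in for HBase tables, not the HBase client API.
Table = Dict[str, Dict[str, str]]

SOURCE_PREFIX = "source_"  # job columns are named source_{source-rowkey}


def source_keys_for_job(jobs: Table, job_key: str) -> List[str]:
    """Read one job row and recover the source rowkeys from its column names."""
    row = jobs.get(job_key, {})
    return sorted(
        col[len(SOURCE_PREFIX):] for col in row if col.startswith(SOURCE_PREFIX)
    )


def fetch_sources(sources: Table, keys: List[str]) -> Table:
    """Point lookups by rowkey (like a batch of gets); no scan of the table."""
    return {k: sources[k] for k in keys if k in sources}


def add_source(jobs: Table, job_key: str, source_key: str) -> None:
    """Attach a source to a job by adding one column to the job row."""
    jobs.setdefault(job_key, {})[SOURCE_PREFIX + source_key] = ""


def remove_source(jobs: Table, job_key: str, source_key: str) -> None:
    """Detach a source from a job by deleting that one column."""
    jobs.get(job_key, {}).pop(SOURCE_PREFIX + source_key, None)


# Hypothetical sample data.
sources: Table = {
    "s1": {"source_name": "A", "url": "http://a.example", "type": "rss"},
    "s2": {"source_name": "B", "url": "http://b.example", "type": "html"},
    "s3": {"source_name": "C", "url": "http://c.example", "type": "rss"},
}
jobs: Table = {
    "j1": {"job_name": "demo", "description": "test", "source_s1": "", "source_s3": ""},
}

keys = source_keys_for_job(jobs, "j1")
print(keys)
print(fetch_sources(sources, keys))
```

The cost of this path grows with the number of sources attached to one job,
whereas filtering the source table on a job key column grows with the total
number of sources. Listing all available sources is still a scan of the
source table only, and editing a job's source list touches a single column
in a single job row.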

View this message in context: http://apache-hbase.679495.n3.nabble.com/HBase-entity-relationship-tp4066296p4066326.html
Sent from the HBase User mailing list archive at Nabble.com.