hbase-user mailing list archives

From Jonathan Gray <jg...@facebook.com>
Subject RE: HBase: project ideas
Date Thu, 19 Aug 2010 19:26:36 GMT
Himanshu,

Seems like you might have an interest in using Coprocessors to do things like low-latency aggregates.
This is a big area of interest for some of us, but there has not been much concerted effort in this
direction yet.  There is plenty to do here for a research project.

Check out:

https://issues.apache.org/jira/browse/HBASE-2000

And specifically:

https://issues.apache.org/jira/browse/HBASE-1512

JG

> -----Original Message-----
> From: Himanshu Vashishtha [mailto:vashishtha.h@gmail.com]
> Sent: Thursday, August 19, 2010 11:30 AM
> To: dev@hbase.apache.org
> Cc: user@hbase.apache.org
> Subject: Re: HBase: project ideas
> 
> Hello Stack,
> Thanks for the reply. Please see inline.
> 
> Cheers,
> Himanshu
> 
> On Thu, Aug 19, 2010 at 11:22 AM, Stack <stack@duboce.net> wrote:
> 
> > On Thu, Aug 19, 2010 at 2:47 AM, Himanshu Vashishtha
> > <vashishtha.h@gmail.com> wrote:
> > > Dear All:
> > > > I have been looking around HBase (running/debugging it, etc) for a
> > > > couple of weeks now, and it is fascinating. I am in search of a good
> > > > project for my grad studies, focussing around HBase, but am not able
> > > > to finalize it. I am looking for some project idea that I can use. It
> > > > can be user or a dev project, I am open to all :)
> > >
> > > > One idea (user specific) is to migrate an XQuery-like tool that uses
> > > > a relational db schema (there are a bunch of papers suggesting it) to
> > > > HBase, but I am not sure whether it is really a judicious use of
> > > > HBase. Please suggest.
> > >
> > >
> >
> > Hello Himanshu.
> >
> > It's hard to make a suggestion when I've no clue as to your interests.
> >
> Hadoop fascinates me. I wrote a tool for my lab which indexes a given
> document collection (of plain text files), and then the user can query it
> using four predefined operations... I store those indexes on HDFS using
> MapFiles (to reduce the request-response latency).
> 
> > Can you cite some of the papers you mention?
> >
> So, I want to carry it forward for XML, and I came across two approaches:
> indexing the doc, OR storing them in a rdbms style while also considering
> schema info.
>
> Paper (for index based approach): An efficient inverted index technique for
> XML documents using RDBMS, Chiyoung Seo, others..2003.
> 
> and for rdbms approach: A Comprehensive XQuery to SQL Translation using
> Dynamic Interval Encoding. David DeHaan, David Toman, Mariano P. Consens,
> others... in 2003, and its references.
> 
> I developed a prototype for the index based one in HBase, but it is limited
> in usage (due to its inherent approach of indexing, you can't fire elegant
> operations like summing, grouping etc). It's quite raw.
> 
> > + Have you looked at HIVE?  It might be more pertinent making this run
> > better atop hbase rather than making a new XQuery-like tool for hbase.
> >
> 
> Not yet. I read that it runs a MR job for every query, and it kind of slows
> its response time, so I skipped past it. But yes, it does provide a lot of
> relational schema stuff, I see.
> 
> > + Build an app that allows various kind of location queries using
> > geohashing+hbase combo.  There's a few fellas floating on the list who
> > might be able to help you out on this project.
> >
> > For extra points, whatever you do, build it using hbase-2000 coprocessors.
> >
> I am sorry I couldn't get this.
> 
> 
> > Thanks for writing the list Himanshu.
> > St.Ack
> >
