hbase-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Using SPARQL against HBase
Date Mon, 05 Apr 2010 05:50:27 GMT
Hi, I'm a proposer/sponsor of the Heart project.

I have no doubt that RDF can be stored in HBase, because Google also
stores linked data in Bigtable.

However, if you want to focus on large-scale (distributed) processing,
I would recommend reading about Google's Pregel project (Google's graph
computing framework), because SPARQL is basically a graph query
language for RDF graph data.

On Fri, Apr 2, 2010 at 7:09 AM, Jürgen Jakobitsch <jakobitschj@punkt.at> wrote:
> hi again,
>
> i'm definitely interested.
>
> you've probably heard of the heart project, but there's hardly anything going on,
> so i think it's well worth the effort.
>
> for your discussion days i'd recommend taking a look at openrdf sail api
>
> @http://www.openrdf.org/doc/sesame2/system/
>
> the point is that everything you need is already there, like a query engine and the
> like.
> to make it clear: for starting a quad store it's close to perfect, because it
> actually comes down to implementing the getStatements method as accurately as possible.
>
> the query engine does the same: it parses the sparql query and calls the getStatements
> method.
>
> now this method simply has five arguments:
>
> subject, predicate, object, includeinferred and contexts, where subject, predicate and
> object can be null, includeinferred can be ignored for a start, and contexts can be
> either null (for a start) or an array of uris.
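
To make the wildcard semantics of such a lookup concrete, here is a minimal in-memory sketch in Python. It is not the OpenRDF Sail API itself: the `Quad` type, the `get_statements` function and the sample data are hypothetical, and treating `contexts=None` as "all named graphs" is an assumption made for this sketch.

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Sequence


class Quad(NamedTuple):
    """A triple plus its context (named graph). Hypothetical helper type."""
    subject: str
    predicate: str
    obj: str
    context: Optional[str]


def get_statements(
    quads: Iterable[Quad],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
    include_inferred: bool = False,  # ignored here, as suggested for a first version
    contexts: Optional[Sequence[str]] = None,
) -> Iterator[Quad]:
    """Yield every quad that matches the pattern; None acts as a wildcard."""
    for quad in quads:
        if subject is not None and quad.subject != subject:
            continue
        if predicate is not None and quad.predicate != predicate:
            continue
        if obj is not None and quad.obj != obj:
            continue
        # Assumption of this sketch: contexts=None means "search all graphs".
        if contexts is not None and quad.context not in contexts:
            continue
        yield quad


# Illustrative sample data only.
STORE = [
    Quad("ex:alice", "foaf:knows", "ex:bob", "ex:g1"),
    Quad("ex:alice", "foaf:name", '"Alice"', "ex:g1"),
    Quad("ex:bob", "foaf:knows", "ex:carol", "ex:g2"),
]

KNOWS = list(get_statements(STORE, predicate="foaf:knows"))
ALICE_IN_G2 = list(get_statements(STORE, subject="ex:alice", contexts=["ex:g2"]))
```

A store backed by HBase would replace the linear scan with lookups on whatever row keys the table design provides; the matching rules would stay the same.
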
>
> also note that the sail api is quite commonly used (virtuoso, openrdfsesame, neo4j, bigdata,
> even oracle has an old version;
> we'll be having one implementation for talis and 4store in the coming weeks and of course
> my quadstore "tuqs")
>
> if you find the way to retrieve the triples (quads) from hbase i could implement a sail
> store in a day - et voila ...
>
> anyways it would be nice if you keep me informed .. i'd really like to contribute...
>
> wkr www.turnguard.com
>
>
> ----- Original Message -----
> From: "Amandeep Khurana" <amansk@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Thursday, April 1, 2010 11:45:00 PM
> Subject: Re: Using SPARQL against HBase
>
> Andrew and I just had a chat about exploring how we can leverage HBase for a
> scalable RDF store and we'll be looking at it in more detail over the next
> few days. Is anyone of you interested in helping out? We are going to be
> looking at what all is required to build a triple store + query engine on
> HBase and how HBase can be used as is or remodeled to fit the problem.
> Depending on what we find out, we'll decide on taking the project further
> and committing efforts towards it.
>
> -Amandeep
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Thu, Apr 1, 2010 at 1:12 PM, Jürgen Jakobitsch <jakobitschj@punkt.at> wrote:
>
>> hi,
>>
>> this sounds very interesting to me, i'm currently fiddling
>> around with a suitable row and column setup for triples.
>>
>> i'm about to implement openrdf's sail api for hbase (i just did
>> a lucene quad store implementation which is superfast and scales
>> to a couple of hundred million triples: http://turnguard.com/tuqs)
>> but i'm in my first days of hbase encounters, so my experience
>> in row and column design is limited.
>>
>> from my point of view the problem is to really efficiently store,
>> besides the triples themselves, the contexts (named graphs) and
>> languages of literals.
>>
>> by the way : i just did a small tablemanager (in beta) that lets
>> you create htables -> from <- rdf (see
>> http://sourceforge.net/projects/hbasetablemgr/)
>>
>> i'd be really happy to contribute on the rdf and sparql side,
>> but certainly could need some help on the hbase table design side.
>>
>> wkr www.turnguard.com/turnguard
>>
>>
>>
>> ----- Original Message -----
>> From: "Raffi Basmajian" <rbasmajian@oppenheimerfunds.com>
>> To: hbase-user@hadoop.apache.org, apurtell@apache.org
>> Sent: Thursday, April 1, 2010 9:45:59 PM
>> Subject: RE: Using SPARQL against HBase
>>
>>
>> This is an interesting article from a few guys over at BBN/Raytheon. With
>> triples stored in flat files, they used a custom algorithm, detailed in
>> the article, to iterate over the WHERE clause of a SPARQL query and reduce
>> the map into the desired result.
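
For readers unfamiliar with how a WHERE clause is evaluated, the following Python sketch walks the triple patterns one at a time and joins the variable bindings. It is a generic nested-loop illustration, not the BBN algorithm from the article and not a MapReduce job; all names (`match_pattern`, `evaluate`, the sample triples) are made up for this example.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    """Terms starting with '?' are treated as SPARQL-style variables."""
    return term.startswith("?")


def match_pattern(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against its existing value.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return extended


def evaluate(patterns: List[Triple], triples: List[Triple]) -> List[Binding]:
    """Evaluate the patterns in order, keeping only bindings that survive each one."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        next_bindings: List[Binding] = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Illustrative data: "whom does alice know, and what is that person's name?"
TRIPLES = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:name", '"Bob"'),
]
WHERE = [("ex:alice", "foaf:knows", "?x"), ("?x", "foaf:name", "?n")]
RESULT = evaluate(WHERE, TRIPLES)
```

Each pattern narrows the set of candidate bindings, which is the step a distributed implementation would have to spread across the cluster.
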
>>
>> This is very similar to what I need to do; the only difference being
>> that our data is stored in Hbase tables, not as triples in flat files.
>>
>>
>> -----Original Message-----
>> From: Amandeep Khurana [mailto:amansk@gmail.com]
>> Sent: Wednesday, March 31, 2010 3:30 PM
>> To: hbase-user@hadoop.apache.org; apurtell@apache.org
>> Subject: Re: Using SPARQL against HBase
>>
>> Why do you need to build an in-memory graph which you would want to
>> read/write to? You could store the graph in HBase directly. As pointed
>> out, HBase might not be best suited for SPARQL queries, but it's not
>> impossible to do. Using the triples, you can form a graph that can be
>> represented in HBase as an adjacency list. I've stored graphs with
>> 16-17M nodes which was data equivalent to about 600M triples. And this
>> was on a small cluster and could certainly scale way more than 16M graph
>> nodes.
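
One way to picture the adjacency-list idea is one row per node, with one column per outgoing edge. The Python sketch below models rows as plain dictionaries and does not use any HBase client. The column layout (`out:<predicate>|<object>` with an empty value) is an assumption chosen for illustration, not the schema used for the graphs mentioned above, and it presumes that `|` does not occur in predicates.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
Rows = Dict[str, Dict[str, str]]


def to_adjacency_rows(triples: Iterable[Triple]) -> Rows:
    """Group triples by subject: row key = node, one column per outgoing edge."""
    rows: Dict[str, Dict[str, str]] = defaultdict(dict)
    for subject, predicate, obj in triples:
        # Hypothetical layout: family "out", qualifier "<predicate>|<object>".
        rows[subject][f"out:{predicate}|{obj}"] = ""
    return dict(rows)


def neighbours(rows: Rows, node: str, predicate: Optional[str] = None) -> List[str]:
    """Return the targets of the node's outgoing edges, optionally for one predicate."""
    result: List[str] = []
    for column in rows.get(node, {}):
        _family, _, qualifier = column.partition(":")
        edge_predicate, _, target = qualifier.partition("|")
        if predicate is None or edge_predicate == predicate:
            result.append(target)
    return result


# Illustrative data only.
TRIPLES = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:name", '"Bob"'),
]
ROWS = to_adjacency_rows(TRIPLES)
ALICE_KNOWS = sorted(neighbours(ROWS, "ex:alice", "foaf:knows"))
```

With this layout, fetching all edges of a node is a single row read, while lookups by predicate or object alone would need additional index tables.
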
>>
>> In case you are interested in working on SPARQL over HBase, we could
>> collaborate on it...
>>
>> -ak
>>
>>
>> Amandeep Khurana
>> Computer Science Graduate Student
>> University of California, Santa Cruz
>>
>>
>> On Wed, Mar 31, 2010 at 11:56 AM, Andrew Purtell <apurtell@apache.org> wrote:
>>
>> > Hi Raffi,
>> >
>> > To read up on fundamentals I suggest Google's BigTable paper:
>> > http://labs.google.com/papers/bigtable.html
>> >
>> > Detail on how HBase implements the BigTable architecture within the
>> > Hadoop ecosystem can be found here:
>> >
>> >  http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture
>> >  http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
>> >
>> >  http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
>> >
>> > Hope that helps,
>> >
>> >   - Andy
>> >
>> > > From: Basmajian, Raffi <rbasmajian@oppenheimerfunds.com>
>> > > Subject: RE: Using SPARQL against HBase
>> > > To: hbase-user@hadoop.apache.org, apurtell@apache.org
>> > > Date: Wednesday, March 31, 2010, 11:42 AM
>> > >
>> > > If Hbase can't respond to SPARQL-like queries, then what type of
>> > > query language can it respond to? In a traditional RDBMS database
>> > > one would use SQL; so what is the counterpart query language with
>> > > Hbase?
>> >
>> >
>> >
>> >
>> >
>>
>>
>> ------------------------------------------------------------------------------
>> This e-mail transmission may contain information that is proprietary,
>> privileged and/or confidential and is intended exclusively for the person(s)
>> to whom it is addressed. Any use, copying, retention or disclosure by any
>> person other than the intended recipient or the intended recipient's
>> designees is strictly prohibited. If you are not the intended recipient or
>> their designee, please notify the sender immediately by return e-mail and
>> delete all copies. OppenheimerFunds may, at its sole discretion, monitor,
>> review, retain and/or disclose the content of all email communications.
>>
>> ==============================================================================
>>
>>
>> --
>> punkt. netServices
>> ______________________________
>> Jürgen Jakobitsch
>> Codeography
>>
>> Lerchenfelder Gürtel 43 Top 5/2
>> A - 1160 Wien
>> Tel.: 01 / 897 41 22 - 29
>> Fax: 01 / 897 41 22 - 22
>>
>> netServices http://www.punkt.at
>>
>>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org
