Subject: RE: Using SPARQL against HBase
Date: Thu, 1 Apr 2010 15:45:59 -0400
From: "Basmajian, Raffi"
To: hbase-user@hadoop.apache.org, apurtell@apache.org

This is an interesting article from a few guys over at BBN/Raytheon. Storing triples in flat files, they used a custom algorithm, detailed in the article, to iterate over the WHERE clause of a SPARQL query and reduce the map into the desired result.

This is very similar to what I need to do; the only difference is that our data is stored in HBase tables, not as triples in flat files.

-----Original Message-----
From: Amandeep Khurana [mailto:amansk@gmail.com]
Sent: Wednesday, March 31, 2010 3:30 PM
To: hbase-user@hadoop.apache.org; apurtell@apache.org
Subject: Re: Using SPARQL against HBase

Why do you need to build an in-memory graph that you would want to read/write to? You could store the graph in HBase directly. As pointed out, HBase might not be the best suited for SPARQL queries, but it's not impossible to do. Using the triples, you can form a graph that can be represented in HBase as an adjacency list. I've stored graphs with 16-17M nodes, which was data equivalent to about 600M triples. This was on a small cluster, and it could certainly scale well beyond 16M graph nodes.

In case you are interested in working on SPARQL over HBase, we could collaborate on it...

-ak

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Wed, Mar 31, 2010 at 11:56 AM, Andrew Purtell wrote:

> Hi Raffi,
>
> To read up on fundamentals I suggest Google's BigTable paper:
> http://labs.google.com/papers/bigtable.html
>
> Detail on how HBase implements the BigTable architecture within the
> Hadoop ecosystem can be found here:
>
> http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture
> http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
> http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
>
> Hope that helps,
>
> - Andy
>
> > From: Basmajian, Raffi
> > Subject: RE: Using SPARQL against HBase
> > To: hbase-user@hadoop.apache.org, apurtell@apache.org
> > Date: Wednesday, March 31, 2010, 11:42 AM
> >
> > If HBase can't respond to SPARQL-like queries, then what type of
> > query language can it respond to? In a traditional RDBMS database
> > one would use SQL; so what is the counterpart query language with
> > HBase?
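
To make the adjacency-list suggestion from the thread concrete, here is a minimal sketch in Python. It is not Amandeep's schema and it does not use any HBase client API: plain dictionaries stand in for a table, with one row per subject and one column per (predicate, object) edge. The function names build_adjacency_rows and match_pattern, the sample triples, and the column layout are all assumptions made for illustration.

```python
from collections import defaultdict


def build_adjacency_rows(triples):
    """Group (subject, predicate, object) triples into one row per subject.

    Assumption: each row is a dict whose keys are (predicate, object) pairs,
    standing in for column qualifiers in a wide row. Values are left empty
    because the edge itself is the data.
    """
    rows = defaultdict(dict)
    for subject, predicate, obj in triples:
        rows[subject][(predicate, obj)] = b""
    return dict(rows)


def match_pattern(rows, subject=None, predicate=None, obj=None):
    """Match a single SPARQL-style triple pattern; None acts as a variable."""
    # A bound subject is a single-row lookup; otherwise every row is scanned.
    candidates = [subject] if subject is not None else sorted(rows)
    results = []
    for s in candidates:
        for p, o in sorted(rows.get(s, {})):
            if predicate in (None, p) and obj in (None, o):
                results.append((s, p, o))
    return results


# Hypothetical sample data, not taken from the thread.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "knows", "carol"),
]
rows = build_adjacency_rows(triples)

# Roughly: SELECT ?s ?o WHERE { ?s knows ?o }
knows_edges = match_pattern(rows, predicate="knows")
```

In this model a pattern with a bound subject reads one row, while a pattern with an unbound subject scans every row. That asymmetry is one plausible reason the thread describes SPARQL over HBase as possible but not an ideal fit.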