hadoop-common-user mailing list archives

From Yuan-Fang Li <liyuanf...@gmail.com>
Subject Re: RDF Data Store on Hadoop
Date Fri, 03 Jul 2009 01:20:25 GMT
I recently came across this OSS project called BigData [1], with the
following description from their web site:
Bigdata(R) comes packaged with a very high-performance RDF store supporting
RDF(S) and OWL Lite inference. The Bigdata RDF Store is currently the only
RDF database capable of operating distributed on a cluster. The Bigdata RDF
Store was designed specifically to meet requirements for very large scale
semantic alignment and federation. RDF is a Semantic Web technology
particularly well-suited to modeling graph-shaped data and metadata, such as
an associative entity-link model, whereby actors are linked to one another
in an ad-hoc fashion within the context of an evolving ontology of concepts
for entity types and link types related to a particular problem domain. The
Bigdata RDF Store is used operationally in data harvesting systems to create
mash-ups of structured, semi-structured, and unstructured data from myriad
sources in a schema-flexible manner.

Having had a very brief look at the API, it seems that BigData is
interfacing with Sesame for RDF-related processing. It also seems that this
project has demonstrated some great scalability.

Does anybody have any experience with BigData?

Best wishes
Yuan-Fang

1. http://www.systap.com/bigdata.htm

On Fri, Jul 3, 2009 at 4:29 AM, Amandeep Khurana <amansk@gmail.com> wrote:

> I can share the data model right here with you. Beyond the data model, the
> MR jobs etc. are specific to the data sources I am pulling into HBase to
> connect with each other.
>
> I've attached an image that represents the table structure basics.
>
> Essentially, the column family and column identifier used in combination
> represent the predicate. The row id is the subject and the cell value is the
> object.
>
>
>
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Thu, Jul 2, 2009 at 11:20 AM, Brian MacKay <Brian.MacKay@medecision.com> wrote:
>
>> I understand. If you would consider open sourcing it as a rough
>> prototype, let us know.
>>
>> -----Original Message-----
>> From: Amandeep Khurana [mailto:amansk@gmail.com]
>> Sent: Thursday, July 02, 2009 2:15 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: RDF Data Store on Hadoop
>>
>> It's a prototype right now and in a nascent stage. I haven't made it open.
>> Essentially, it's storing triples in HBase, so it's just the data model
>> that's solving some problems that I was working on. I don't have a SPARQL
>> engine built over it yet.
>>
>>
>>
>>
>> Amandeep Khurana
>> Computer Science Graduate Student
>> University of California, Santa Cruz
>>
>>
>> On Thu, Jul 2, 2009 at 11:13 AM, Brian MacKay <Brian.MacKay@medecision.com> wrote:
>>
>> > Hi Amandeep,
>> >
>> > Is your custom RDF store over HBase open source? If so, where is it
>> > hosted?
>> >
>> > Thanks,
>> > Brian
>> >
>> > -----Original Message-----
>> > From: Amandeep Khurana [mailto:amansk@gmail.com]
>> > Sent: Thursday, July 02, 2009 2:08 PM
>> > To: common-user@hadoop.apache.org
>> > Subject: Re: RDF Data Store on Hadoop
>> >
>> > Hi
>> >
>> > I have been working on this as well. There are a couple more threads on
>> > this list and the HBase mailing list about this. One is pretty recent;
>> > it's about graph algorithms using MapReduce.
>> >
>> > I built a custom RDF store over HBase. It's not a full RDF store by any
>> > means, but the data model can be extended to support all of the RDF
>> > specifications.
>> >
>> > The Heart project hasn't been active for some time now. If people are
>> > interested, we can take it on and work on creating an RDF store over
>> > HBase along with graph algorithms using MR.
>> >
>> > Amandeep
>> >
>> >
>> > Amandeep Khurana
>> > Computer Science Graduate Student
>> > University of California, Santa Cruz
>> >
>> >
>> > On Thu, Jul 2, 2009 at 5:32 AM, Alex McLintock <alex.mclintock@gmail.com> wrote:
>> >
>> > > I'm looking to build up data for an RDF data store using Hadoop.
>> > > I could just generate lots of RDF/XML files, or one big one, and feed
>> > > them into Apache Jena.
>> > > However, it seems to me that it would be best if I used a Hadoop-aware
>> > > distributed triplestore so that my data stayed on the data nodes.
>> > >
>> > > I see that there is the Heart project (http://rdf-proj.blogspot.com/)
>> > > but it doesn't seem very active.
>> > >
>> > > Does anyone have any recommendations for a usable RDF data store which
>> > > I can use with Hadoop? Or should I consider this outside of the Hadoop
>> > > world and just put it on one machine? Is Heart "nearly there" and just
>> > > needs a helping hand?
>> > >
>> > > I am tempted to bypass the RDF triplestore and roll my own using HBase
>> > > but I don't want to reinvent the wheel.
>> > >
>> > > Cheers
>> > >
>> > > Alex
>> > >
>> > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>> >
>> > The information transmitted is intended only for the person or entity to
>> > which it is addressed and may contain confidential and/or privileged
>> > material. Any review, retransmission, dissemination or other use of, or
>> > taking of any action in reliance upon, this information by persons or
>> > entities other than the intended recipient is prohibited. If you received
>> > this message in error, please contact the sender and delete the material
>> > from any computer.
>> >
>> >
>> >
>>
>>
>>
>
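
The triple layout Amandeep describes in the quoted message above (row id as
subject, column family plus column identifier as predicate, cell value as
object) can be sketched roughly as follows. This is only an in-memory
illustration, not his implementation and not the HBase client API: the
TripleTable class, its method names, and the example "ex:"/"foaf" identifiers
are all hypothetical, and a plain dict stands in for an HBase table.

```python
from collections import defaultdict


class TripleTable:
    """Hypothetical in-memory stand-in for an HBase table of RDF triples.

    Layout assumed from the thread:
      row id                  -> subject
      "family:qualifier" name -> predicate
      cell value              -> object
    """

    def __init__(self):
        # row id -> {"family:qualifier" -> cell value}
        self.rows = defaultdict(dict)

    def put(self, subject, family, qualifier, obj):
        # One cell per (subject, predicate); a second put with the same
        # column overwrites the earlier object in this simplified sketch.
        self.rows[subject][f"{family}:{qualifier}"] = obj

    def get(self, subject):
        # Return a copy of all predicate -> object cells for one subject.
        return dict(self.rows.get(subject, {}))

    def triples(self):
        # Flatten the table back into (subject, predicate, object) tuples.
        for subject, cells in self.rows.items():
            for predicate, obj in cells.items():
                yield (subject, predicate, obj)


table = TripleTable()
table.put("ex:alice", "foaf", "knows", "ex:bob")
table.put("ex:alice", "foaf", "name", "Alice")
table.put("ex:bob", "foaf", "name", "Bob")

for triple in sorted(table.triples()):
    print(triple)
```

With this layout, fetching everything known about one subject is a single row
lookup. Because each (subject, predicate) pair maps to one cell here, storing
several objects for the same predicate would need some further convention that
the thread does not specify.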
