giraph-user mailing list archives

From Maja Kabiljo <majakabi...@fb.com>
Subject Re: Multiple Data Sources
Date Tue, 16 Jul 2013 06:02:01 GMT
Hi Tom,

We recently added something like this; please take a look at MultiVertexInputFormat. It
can wrap any number of vertex input formats, coming from any sources. You can also take a
look at HiveGiraphRunner to see how it's used there. As for multiple vertex types, we don't
support that directly, but you can keep a field describing the vertex type inside of your
vertex value.
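
A minimal sketch of the vertex-type idea, in case it helps. The class name, the enum and its two tags are hypothetical and chosen only to match the HDFS/MySQL example in the question. To keep the sketch runnable without Hadoop on the classpath, it does not implement org.apache.hadoop.io.Writable; it only mirrors that interface's write/readFields method shapes using the standard java.io types. In a real Giraph job the vertex value class would implement Writable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical vertex value that carries a tag saying which kind of vertex it is. */
class TypedVertexValue {
  /** Hypothetical tags, one per data source mentioned in the question. */
  enum SourceType { HDFS, MYSQL }

  private SourceType type;
  private double value;

  /** No-arg constructor, needed so an instance can be created before readFields. */
  TypedVertexValue() {
    this(SourceType.HDFS, 0.0);
  }

  TypedVertexValue(SourceType type, double value) {
    this.type = type;
    this.value = value;
  }

  SourceType getType() {
    return type;
  }

  double getValue() {
    return value;
  }

  /** Same shape as Writable.write: the type tag first, then the payload. */
  void write(DataOutput out) throws IOException {
    out.writeByte(type.ordinal());
    out.writeDouble(value);
  }

  /** Same shape as Writable.readFields: reads fields in the order they were written. */
  void readFields(DataInput in) throws IOException {
    type = SourceType.values()[in.readByte()];
    value = in.readDouble();
  }

  /** Serializes and deserializes a value in memory, to check both methods agree. */
  static TypedVertexValue roundTrip(TypedVertexValue original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      original.write(new DataOutputStream(bytes));
      TypedVertexValue copy = new TypedVertexValue();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    TypedVertexValue copy = roundTrip(new TypedVertexValue(SourceType.MYSQL, 3.5));
    System.out.println(copy.getType() + " " + copy.getValue());
  }
}
```

With a value like this, the compute code can branch on getType() to treat the two kinds of vertices differently, and each input format sets the tag for the vertices it creates.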

Hope this helps, please let us know if you have any questions!

Maja

From: Tom M <thnyanmthnyan@gmail.com>
Reply-To: user@giraph.apache.org
Date: Monday, July 15, 2013 9:54 AM
To: user@giraph.apache.org
Subject: Multiple Data Sources

Hi,

    I am new to Giraph. I am working on implementing a graph algorithm that first reads
vertex values from multiple sources (HDFS, MySQL). So basically, I would have two types of
vertices, and the values of each vertex type would be read from a different data source. I
know that, in MR, we can use DBInputFormat to retrieve tuples from an RDBMS, for example, and
then join them with data read from HDFS. My question: can we do that in Giraph? That is, can
the graph be constructed from different data sources? Thanks a lot in advance.

Best,
Tom
