giraph-user mailing list archives

From Yasser Altowim <yasser.alto...@ericsson.com>
Subject RE: MultiVertexInputFormat
Date Mon, 19 Aug 2013 16:16:41 GMT
Hi Guys,

Any help on this will be appreciated. I am repeating my question and my code below:


I am implementing an algorithm in Giraph that reads the vertex values from two input files,
each with its own format. I am not using any EdgeInputFormatClass. I am now using VertexInputFormatDescription
along with MultiVertexInputFormat, but I still could not figure out how to set the vertex input
path for each input format class. Can you please take a look at my code below and show me
how to set the vertex input path? I have taken a look at HiveGiraphRunner, but still no luck.
Thanks

    if (null == getConf()) {
        conf = new Configuration();
    }

    GiraphConfiguration gconf = new GiraphConfiguration(getConf());
    int workers = Integer.parseInt(arg0[2]);
    gconf.setWorkerConfiguration(workers, workers, 100.0f);

    List<VertexInputFormatDescription> vertexInputDescriptions = Lists.newArrayList();

    // Input one
    VertexInputFormatDescription description1 = new VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
    // how to set the vertex input path? i.e. how to say that I want to read file1.txt
    // using this input format class
    vertexInputDescriptions.add(description1);

    // Input two
    VertexInputFormatDescription description2 = new VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
    // how to set the vertex input path?
    vertexInputDescriptions.add(description2);


    GiraphConstants.VERTEX_INPUT_FORMAT_CLASS.set(gconf, MultiVertexInputFormat.class);
    VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
        InputFormatDescription.toJsonString(vertexInputDescriptions));

    gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
    gconf.setComputationClass(UseCase1Vertex.class);
    GiraphJob job = new GiraphJob(gconf, "Use Case 1");
    FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));
    return job.run(true) ? 0 : -1;


Thanks in advance.

Best,
Yasser

From: Yasser Altowim [mailto:yasser.altowim@ericsson.com]
Sent: Friday, August 16, 2013 11:36 AM
To: user@giraph.apache.org
Subject: RE: MultiVertexInputFormat

Thanks a lot, Avery, for your response. I am now using VertexInputFormatDescription, but I still
could not figure out how to set the vertex input path. I just need to read the vertex values
from two different files, each with its own format. I am not using any EdgeInputFormatClass.

Can you please take a look at my code below and show me how to set the vertex input
path? Thanks


           if (null == getConf()) {
               conf = new Configuration();
           }

           GiraphConfiguration gconf = new GiraphConfiguration(getConf());
           int workers = Integer.parseInt(arg0[2]);
           gconf.setWorkerConfiguration(workers, workers, 100.0f);



           List<VertexInputFormatDescription> vertexInputDescriptions = Lists.newArrayList();

           // Input one
           VertexInputFormatDescription description1 = new VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
           // how to set the vertex input path?
           vertexInputDescriptions.add(description1);

           // Input two
           VertexInputFormatDescription description2 = new VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
           // how to set the vertex input path?
           vertexInputDescriptions.add(description2);


           VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
               InputFormatDescription.toJsonString(vertexInputDescriptions));


           gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
           gconf.setComputationClass(UseCase1Vertex.class);
           GiraphJob job = new GiraphJob(gconf, "Use Case 1");
           FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));
           return job.run(true) ? 0 : -1;



Best,
Yasser

From: Avery Ching [mailto:aching@apache.org]
Sent: Friday, August 16, 2013 9:50 AM
To: user@giraph.apache.org
Subject: Re: MultiVertexInputFormat

This is doable in Giraph: you can use as many vertex or edge input formats as you like (via
GIRAPH-639). You just need to choose MultiVertexInputFormat and/or MultiEdgeInputFormat.

See VertexInputFormatDescription for vertex input formats

  /**
   * VertexInputFormats description - JSON array containing a JSON array for
   * each vertex input. Vertex input JSON arrays contain one or two elements -
   * first one is the name of vertex input class, and second one is JSON object
   * with all specific parameters for this vertex input. For example:
   * [["VIF1",{"p":"v1"}],["VIF2",{"p":"v2","q":"v"}]]
   */
  public static final StrConfOption VERTEX_INPUT_FORMAT_DESCRIPTIONS =
      new StrConfOption("giraph.multiVertexInput.descriptions", null,
          "VertexInputFormats description - JSON array containing a JSON " +
          "array for each vertex input. Vertex input JSON arrays contain " +
          "one or two elements - first one is the name of vertex input " +
          "class, and second one is JSON object with all specific parameters " +
          "for this vertex input. For example: [[\"VIF1\",{\"p\":\"v1\"}]," +
          "[\"VIF2\",{\"p\":\"v2\",\"q\":\"v\"}]]\"");
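
As a rough illustration of the layout that comment describes, here is a minimal sketch that builds the same JSON text using only the JDK. The class and method names are hypothetical and not part of Giraph. The parameter key "giraph.vertex.input.dir" is an assumption about which option a file-based vertex input format reads for its path, so please check it against your Giraph version; whether a given input format honors a per-input path also depends on that format. In a real driver, the description class itself may offer a way to attach such parameters (an addParameter-style method, as far as I recall), which would be preferable to building the string by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Sketch only: builds the JSON layout described above without any Giraph classes. */
public class MultiVertexInputDescriptionSketch {

  /** Wraps a value in double quotes, escaping only backslashes and embedded quotes. */
  public static String quote(String value) {
    return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  /**
   * Renders one vertex input as a JSON array: the class name alone when there
   * are no parameters, otherwise the class name followed by a parameter object.
   */
  public static String describe(String className, Map<String, String> params) {
    if (params.isEmpty()) {
      return "[" + quote(className) + "]";
    }
    StringJoiner object = new StringJoiner(",", "{", "}");
    for (Map.Entry<String, String> entry : params.entrySet()) {
      object.add(quote(entry.getKey()) + ":" + quote(entry.getValue()));
    }
    return "[" + quote(className) + "," + object + "]";
  }

  /** Joins the per-input arrays into the outer JSON array. */
  public static String describeAll(String... inputs) {
    return "[" + String.join(",", inputs) + "]";
  }

  public static void main(String[] args) {
    // ASSUMPTION: "giraph.vertex.input.dir" is the option a file-based vertex
    // input format reads for its path; verify the key for your Giraph version.
    // LinkedHashMap keeps the parameters in insertion order.
    Map<String, String> first = new LinkedHashMap<>();
    first.put("giraph.vertex.input.dir", "file1.txt");
    Map<String, String> second = new LinkedHashMap<>();
    second.put("giraph.vertex.input.dir", "file2.txt");

    // The class names are shortened here; a real description would most
    // likely need the fully qualified class names.
    System.out.println(describeAll(
        describe("UseCase1FirstVertexInputFormat", first),
        describe("UseCase1SecondVertexInputFormat", second)));
  }
}
```

The resulting string has the shape expected as the value of giraph.multiVertexInput.descriptions, with one inner array per input file and that file's own parameters in the second element.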

See EdgeInputFormatDescription for edge input formats

  /**
   * EdgeInputFormats description - JSON array containing a JSON array for
   * each edge input. Edge input JSON arrays contain one or two elements -
   * first one is the name of edge input class, and second one is JSON object
   * with all specific parameters for this edge input. For example:
   * [["EIF1",{"p":"v1"}],["EIF2",{"p":"v2","q":"v"}]]
   */
  public static final StrConfOption EDGE_INPUT_FORMAT_DESCRIPTIONS =
      new StrConfOption("giraph.multiEdgeInput.descriptions", null,
          "EdgeInputFormats description - JSON array containing a JSON array " +
          "for each edge input. Edge input JSON arrays contain one or two " +
          "elements - first one is the name of edge input class, and second " +
          "one is JSON object with all specific parameters for this edge " +
          "input. For example: [[\"EIF1\",{\"p\":\"v1\"}]," +
          "[\"EIF2\",{\"p\":\"v2\",\"q\":\"v\"}]]");

Hope that helps,

Avery

On 8/16/13 8:45 AM, Yasser Altowim wrote:
Guys, any help with this will be appreciated. Thanks.

From: Yasser Altowim [mailto:yasser.altowim@ericsson.com]
Sent: Thursday, August 15, 2013 2:07 PM
To: user@giraph.apache.org
Subject: MultiVertexInputFormat

Hi,

I am implementing an algorithm using Giraph. My algorithm needs to read input
data from two files, each with its own format. My questions are:


1. How can I use the MultiVertexInputFormat class? Is there any example that shows how
   this class can be used?

2. How can I specify this class when running my job using the Giraph Runner or using
   a driver class?

Thanks in advance.

Best,
Yasser


