giraph-user mailing list archives

From Mirko Kämpf <mirko.kae...@cloudera.com>
Subject Re: How to Write to HDFS?
Date Mon, 06 Oct 2014 21:16:10 GMT
Hi Tamer,

If you run Giraph on YARN, you can use the log aggregation feature. If you
want to write to HDFS directly, consider the HDFS API, but note that each
mapper would then have to write to its own file. Why not send all logs via
Log4j into Flume, and from there to HDFS?

There is a Log4j appender for Flume, and if you like you can index the
output in Solr on the fly, using morphlines.
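
A minimal sketch of the Log4j side of such a setup, assuming the Flume Log4j
appender jars are on the task classpath and a Flume agent with an Avro source
is listening. The host name and port are placeholders, not values from this
thread:

```properties
# log4j.properties fragment (sketch, not a tested configuration)
# Assumption: the Flume Log4j appender and its dependencies are on the classpath.
log4j.rootLogger = INFO, flume
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
# Placeholder host and port of the Flume agent's Avro source.
log4j.appender.flume.Hostname = flume-agent.example.com
log4j.appender.flume.Port = 41414
# Do not throw exceptions in the application if events cannot be sent.
log4j.appender.flume.UnsafeMode = true
```

The agent itself would still need an HDFS sink configured on its side to land
the events in HDFS.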

Best wishes,
Mirko

On Monday, October 6, 2014, Tamer Yousef <TYousef@boardreader.com> wrote:

> To follow up on my question: I have found the messages, but only in the
> task logs (viewed in the browser through the Hadoop task logs).
>
> How do you write these same messages to an HDFS output file?
>
>
>
>
>
> Thanks,
>
> -Tamer
>
>
>
> *From:* Tamer Yousef
> *Sent:* Monday, October 06, 2014 4:25 PM
> *To:* user@giraph.apache.org
> *Subject:* RE: How to Write to HDFS?
>
>
>
> Thanks Charith, but my main question still remains. Even with the examples
> that come with Giraph, such as the simple shortest paths computation example,
> neither System.out.println nor LOG.debug (I also tried LOG.info) prints
> the custom messages that I write in the compute method.
>
>
>
> *For example,* I modified the class SimpleShortestPathsComputation to use
> println instead of LOG.debug. Here is the compute method, including the
> print statements:
>
>
>
>   @Override
>   public void compute(
>       Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
>       Iterable<DoubleWritable> messages) throws IOException {
>     if (getSuperstep() == 0) {
>       vertex.setValue(new DoubleWritable(Double.MAX_VALUE));
>     }
>     double minDist = isSource(vertex) ? 0d : Double.MAX_VALUE;
>     for (DoubleWritable message : messages) {
>       minDist = Math.min(minDist, message.get());
>     }
>
>     System.out.println("Vertex " + vertex.getId() + " got minDist = " +
>         minDist + " vertex value = " + vertex.getValue());
>
>     if (minDist < vertex.getValue().get()) {
>       vertex.setValue(new DoubleWritable(minDist));
>       for (Edge<LongWritable, FloatWritable> edge : vertex.getEdges()) {
>         double distance = minDist + edge.getValue().get();
>         System.out.println("Vertex " + vertex.getId() + " sent to " +
>             edge.getTargetVertexId() + " = " + distance);
>         sendMessage(edge.getTargetVertexId(), new DoubleWritable(distance));
>       }
>     }
>     vertex.voteToHalt();
>   }
>
>
>
> Still, the println statements do not write these messages out. Are they
> supposed to appear somewhere else, or are they not written at all?
>
>
>
> I also tried with LOG.info, but again these statements were not written.
> I prefer to use println.
>
>
>
> Thanks,
>
> Tamer
>
>
>
>
>
>
>
> *From:* Charith Wickramarachchi [mailto:charith.dhanushka@gmail.com]
> *Sent:* Thursday, October 02, 2014 4:53 PM
> *To:* user
> *Subject:* Re: How to Write to HDFS?
>
>
>
> Hi Tamer,
>
>
>
> The reason you see this behavior is that IntIntNullTextInputFormat sets
> each vertex's value to its vertex id when it creates the vertex. Since
> you never change the value, the vertex id is written to the output as the
> vertex value.
>
>
>
> See the class org.apache.giraph.io.formats.IntIntNullTextInputFormat.IntIntNullVertexReader
> and you will understand.
>
>
>
> Hope this helps.
>
>
>
> Thanks,
> Charith
>
>
>
>
>
> On Thu, Oct 2, 2014 at 1:35 PM, Tamer Yousef <TYousef@boardreader.com> wrote:
>
>    Hello All!
>
> I’m learning Giraph and trying a few things, but I fail to write output
> to HDFS. I created my own .java file and placed it in the folder ${GIRAPH_HOME}/giraph-examples/src/main/java/org/apache/giraph/examples/,
> then I ran mvn compile to get a jar file that includes my class. The
> class does nothing other than try to:
>
> 1- Write using stdout
>
> 2- Write using log4j
>
>
>
> The program runs and creates the output directory in HDFS that I specify
> in the command below, but the output file does not reflect what the program
> should write out.
>
> Here is the output I get in the output file in HDFS (the vertices I have
> are very similar):
>
>
>
> 6     6
> 5     5
> 13    13
> 12    12
> 8     8
> 7     7
> 2     2
> 15    15
> 9     9
> 16    16
> 10    10
> 1     1
> 3     3
> 14    14
> 11    11
> 4     4
>
>
>
> Even if I completely comment out the code in the compute method (keeping
> only the voteToHalt call), I still get the output above.
>
> I execute the code using the command:
>
>
>
> hadoop jar
> $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar
> org.apache.giraph.GiraphRunner org.apache.giraph.examples.HelloWorld -vif
> org.apache.giraph.io.formats.IntIntNullTextInputFormat -vip /in/graph2.txt
> -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /out5 -w 1
>
>
>
> *I’m working with Hadoop 1.2.1 and the latest Giraph from the trunk.*
>
>
>
> *and here is my full class:*
>
>
>
> package org.apache.giraph.examples;
>
> import org.apache.giraph.GiraphRunner;
> import org.apache.hadoop.util.ToolRunner;
> import org.apache.giraph.graph.BasicComputation;
> import org.apache.giraph.conf.LongConfOption;
> import org.apache.giraph.edge.Edge;
> import org.apache.giraph.graph.Vertex;
> import org.apache.hadoop.io.IntWritable;
> import org.apache.hadoop.io.NullWritable;
> import org.apache.hadoop.io.DoubleWritable;
> import org.apache.hadoop.io.FloatWritable;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.log4j.Logger;
>
> import java.io.IOException;
>
> @Algorithm(
>     name = "Hellow",
>     description = "test class"
> )
> public class HelloWorld extends
>     BasicComputation<IntWritable, IntWritable, NullWritable, NullWritable> {
>   @Override
>   public void compute(Vertex<IntWritable, IntWritable, NullWritable> vertex,
>       Iterable<NullWritable> messages) {
>
>     System.out.println("Hello world from print ln");
>     LOG.info("Hello world from log info");
>
>     vertex.voteToHalt();
>   }
>
>   public static void main(String[] args) throws Exception {
>     //log4j.logger.org.apache.hadoop = DEBUG;
>     System.exit(ToolRunner.run(new GiraphRunner(), args));
>   }
>
>   /** Class logger */
>   private static final Logger LOG =
>       Logger.getLogger(SimpleShortestPathsComputation.class);
> }
>
>
>
> *Any ideas?*
>
>
>
> *Thanks!*
>
>
>
>
>
> --
>
> Charith Dhanushka Wickramaarachchi
>
>
>
> Tel  +1 213 447 4253
>
> Web  http://apache.org/~charith <http://www-scf.usc.edu/~cwickram/>
>
> Blog  http://charith.wickramaarachchi.org/
> <http://charithwiki.blogspot.com/>
>
> Twitter  @charithwiki <https://twitter.com/charithwiki>
>
>
>
>
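
To illustrate Charith's explanation in the quoted thread, here is a plain-Java
model (no Giraph classes) of a reader that uses the vertex id as the initial
vertex value, and of an output format that writes the id followed by the value.
The class and method names, the sample input line "6 5 13", and the tab
delimiter are assumptions made for illustration; this is not Giraph's code.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Plain-Java model of why the output file repeats each vertex id.
 * Assumption: the reader takes the first token of a line as the vertex id
 * and also uses it as the initial vertex value, as described in the thread.
 */
public class IdAsValueModel {

    /** Hypothetical stand-in for a vertex: id, value, and target ids. */
    static final class ModelVertex {
        final int id;
        int value;
        final List<Integer> targets = new ArrayList<>();

        ModelVertex(int id, int value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Models the reader: "id neighbour neighbour ..." with value = id. */
    static ModelVertex readLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        int id = Integer.parseInt(tokens[0]);
        ModelVertex vertex = new ModelVertex(id, id);
        for (int i = 1; i < tokens.length; i++) {
            vertex.targets.add(Integer.parseInt(tokens[i]));
        }
        return vertex;
    }

    /** Models the output format: id, a tab (assumed), then the current value. */
    static String writeLine(ModelVertex vertex) {
        return vertex.id + "\t" + vertex.value;
    }

    public static void main(String[] args) {
        // compute() never touches the value, so the id is written twice.
        ModelVertex untouched = readLine("6 5 13");
        System.out.println(writeLine(untouched));

        // Changing the value (what vertex.setValue(...) would do in compute())
        // is what makes the second column differ from the id.
        ModelVertex changed = readLine("6 5 13");
        changed.value = changed.targets.size();
        System.out.println(writeLine(changed));
    }
}
```

In this model, the output file only ever contains ids and vertex values.
Anything printed or logged inside compute goes to the task's own stdout and
log files, which matches the messages showing up only in the task logs.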


