hama-dev mailing list archives

From developer wang <developer...@gmail.com>
Subject Re: Some question about hama-0.6.3
Date Fri, 28 Feb 2014 02:55:09 GMT
Thank you very much for the previous useful reply.
But I have run into some other problems.
-----------------------------------------------------------------------------------------------------------------------------------------------
Problem 1:
I tried the trunk (at git commit
1b3f1744a33a29686c2eafe7764bb3640938fcc8), but it does not get through the
compilation phase. I used this command: *mvn install package
-Dmaven.test.skip=true* and it complained with this:

[INFO]
[INFO]
------------------------------------------------------------------------
[INFO] Building graph 0.7.0-SNAPSHOT
[INFO]
------------------------------------------------------------------------
[INFO]
------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hama parent POM ............................ SUCCESS [2.960s]
[INFO] pipes ............................................. SUCCESS [10.033s]
[INFO] commons ........................................... SUCCESS [6.664s]
[INFO] core .............................................. SUCCESS [23.909s]
[INFO] graph ............................................. FAILURE [0.048s]
[INFO] machine learning .................................. SKIPPED
[INFO] examples .......................................... SKIPPED
[INFO] hama-dist ......................................... SKIPPED
[INFO]
------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO]
------------------------------------------------------------------------
[INFO] Total time: 44.102s
[INFO] Finished at: Fri Feb 28 10:06:23 HKT 2014
[INFO] Final Memory: 50M/384M
[INFO]
------------------------------------------------------------------------
[ERROR] Failed to execute goal on project hama-graph: *Could not resolve
dependencies for project
org.apache.hama:hama-graph:jar:0.7.0-SNAPSHOT:*Failure to find
org.apache.hama:hama-core:jar:tests:0.7.0-SNAPSHOT in
https://repository.cloudera.com/artifactory/cloudera-repos was cached in
the local repository, resolution will not be reattempted until the update
interval of cloudera-repo has elapsed or updates are forced -> [Help 1]

*Does this mean you forgot to upload some jars to the remote Maven repository?*
(I can use the above command to compile 0.6.3.)

I searched for this problem on the Internet, and some people say I should run
Maven with -U.
So I tried this command:  *mvn -U compile*
*but it still fails, with almost the same error:*
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-remote-resources-plugin:1.1:process
(default) on project hama-graph: Failed to resolve dependencies for one or
more projects in the reactor. Reason: Missing:
[ERROR] ----------
[ERROR] 1) org.apache.hama:hama-core:test-jar:tests:0.7.0-SNAPSHOT
[ERROR]
[ERROR] Try downloading the file manually from the project website.
[ERROR]
[ERROR] Then, install it using the command:
[ERROR] mvn install:install-file -DgroupId=org.apache.hama
-DartifactId=hama-core -Dversion=0.7.0-SNAPSHOT -Dclassifier=tests
-Dpackaging=test-jar -Dfile=/path/to/file
[ERROR]
[ERROR] Alternatively, if you host your own repository you can deploy the
file there:
[ERROR] mvn deploy:deploy-file -DgroupId=org.apache.hama
-DartifactId=hama-core -Dversion=0.7.0-SNAPSHOT -Dclassifier=tests
-Dpackaging=test-jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
[ERROR]
[ERROR] Path to dependency:
[ERROR] 1) org.apache.hama:hama-graph:jar:0.7.0-SNAPSHOT
[ERROR] 2) org.apache.hama:hama-core:test-jar:tests:0.7.0-SNAPSHOT
[ERROR]
[ERROR] ----------
[ERROR] 1 required artifact is missing.





-----------------------------------------------------------------------------------------------------------------------------------------------

Problem 2:
Compared to 0.6.0, you now use VerticesInfo to store the vertices, and
DiskVerticesInfo serializes every vertex to the local file system and
deserializes it from there.
Is this meant to provide fault tolerance (the checkpoint part)? Or is it
designed for other purposes?

If it is designed for the checkpoint part of fault tolerance, why does it
write to local disk rather than HDFS?
In my view, if a machine crashes and the fault tolerance mechanism depends
on manually rebooting or repairing that machine, the potentially lengthy
recovery time is intolerable.
Do you agree with me?
Or did you have some other trade-off in mind?
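
As a side note on the dangling-node fix from my previous message (quoted
below), here is a minimal sketch of that preprocessing step. It assumes a
tab-separated input in which each line is a vertex id followed by its
neighbor ids. The class and method names are hypothetical and are not part
of Hama.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not part of Hama: makes dangling vertices explicit. */
public class DanglingNodeRepair {

    /**
     * Assumes each non-empty line is "vertexId<TAB>neighbor1<TAB>neighbor2...".
     * Returns the original lines followed by one id-only line for every
     * vertex that is referenced as a neighbor but has no line of its own.
     */
    public static List<String> addDanglingVertices(List<String> lines) {
        Set<String> declared = new LinkedHashSet<>();
        Set<String> referenced = new LinkedHashSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines when collecting ids
            }
            String[] fields = trimmed.split("\t");
            declared.add(fields[0]);
            for (int i = 1; i < fields.length; i++) {
                if (!fields[i].isEmpty()) {
                    referenced.add(fields[i]);
                }
            }
        }
        List<String> repaired = new ArrayList<>(lines);
        for (String id : referenced) {
            if (!declared.contains(id)) {
                repaired.add(id); // a line that contains only the vertex id
            }
        }
        return repaired;
    }

    public static void main(String[] args) {
        // Vertex 3 is only ever a neighbor, so it gets its own id-only line.
        List<String> input = Arrays.asList("1\t2\t3", "2\t3");
        addDanglingVertices(input).forEach(System.out::println);
    }
}
```

This only makes the dangling vertices explicit in the input. It does not
change how the job itself is configured.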



2014-02-28 8:07 GMT+08:00 Edward J. Yoon <edwardyoon@apache.org>:

> In 0.6.3, you can use only ListVerticesInfo. Please use the TRUNK if you
> want.
>
> And, vertices don't occupy much memory. Please use ListVerticesInfo.
>
> FT is not supported yet.
>
> On Thu, Feb 27, 2014 at 10:06 PM, developer wang <developer.pw@gmail.com>
> wrote:
> > Hi, all.
> >   Thank you for your detailed reply.
> >
> >   In the previous test, I used an incomplete graph to run PageRank, then I
> > got this error:
> >   java.lang.IllegalArgumentException: Messages must never be behind the
> > vertex in ID! Current Message ID:
> >
> >   With your detailed reply, I learned it was because some vertices tried to
> > send messages to dangling nodes (actually, 0.6.0 can handle this by adding a
> > repair phase). Then I fixed this by adding the dangling nodes explicitly,
> > each as a line which only contains a vertex id.
> >
> >   After this, I could run the PageRank (in the attachment) with the
> > ListVerticesInfo.
> >
> >   But if I use DiskVerticesInfo instead of ListVerticesInfo, set like this:
> >   pageJob.set("hama.graph.vertices.info",
> > "org.apache.hama.graph.DiskVerticesInfo");
> >
> >   I still get the error below:
> >  java.lang.IllegalArgumentException: Messages must never be behind the
> > vertex in ID! Current Message ID:
> >
> >  What's the problem?
> >  Am I using DiskVerticesInfo correctly?
> >
> >  Or, if I want to run my application with fault tolerance, what should I do?
> >
> >  Thank you very much.
> >
> >
> >
> > 2014-02-26 18:27 GMT+08:00 Edward J. Yoon <edwardyoon@apache.org>:
> >
> >> > Could you answer this question:
> >> > I found that during loading, peers do not exchange vertices with each
> >> > other as Hama 0.6.0 did.
> >> > So how does Hama 0.6.3 solve the following problem: a peer loads a vertex
> >> > which belongs to another peer? (For example, suppose there are 3 peers for
> >> > this task and the partitioner is Hash; peer #1 loads vertex 2, but in
> >> > 0.6.3 peer #1 does not send vertex 2 to peer #2.)
> >>
> >> Instead of network communication, 0.6.3 uses file communication for
> >> input data partitioning. Please see
> >>
> >> http://svn.apache.org/repos/asf/hama/trunk/core/src/main/java/org/apache/hama/bsp/PartitioningRunner.java
> >>
> >> On Wed, Feb 26, 2014 at 6:03 PM, developer wang <developer.pw@gmail.com>
> >> wrote:
> >> > Actually, I commented out the set statement:
> >> >     //pageJob.set("hama.graph.self.ref", "true");
> >> >
> >> > and in GraphJobRunner:
> >> >     final boolean selfReference = conf.getBoolean("hama.graph.self.ref", false);
> >> >
> >> > I will explicitly set hama.graph.self.ref to false and try again with a
> >> > complete graph.
> >> >
> >> >
> >> > Could you answer this question:
> >> > I found that during loading, peers do not exchange vertices with each
> >> > other as Hama 0.6.0 did.
> >> > So how does Hama 0.6.3 solve the following problem: a peer loads a vertex
> >> > which belongs to another peer? (For example, suppose there are 3 peers for
> >> > this task and the partitioner is Hash; peer #1 loads vertex 2, but in
> >> > 0.6.3 peer #1 does not send vertex 2 to peer #2.)
> >> >
> >> > Or do I have some misunderstanding about Hama 0.6.3 or above? (In recent
> >> > years, I have used 0.6.0 for my daily work.)
> >> >
> >> >
> >> >
> >> > 2014-02-26 16:48 GMT+08:00 Edward J. Yoon <edwardyoon@apache.org>:
> >> >
> >> >> The background is described here:
> >> >> https://issues.apache.org/jira/browse/HAMA-758
> >> >>
> >> >> On Wed, Feb 26, 2014 at 5:38 PM, Edward J. Yoon <edwardyoon@apache.org>
> >> >> wrote:
> >> >> > Oh, please try again after setting "hama.check.missing.vertex" to false
> >> >> > in the job configuration.
> >> >> >
> >> >> > On Wed, Feb 26, 2014 at 5:14 PM, developer wang
> >> >> > <developer.pw@gmail.com>
> >> >> > wrote:
> >> >> >> Thank you very much.
> >> >> >>
> >> >> >> Since I think the framework should not decide whether the graph
> >> >> >> should self-reference, I disabled this config. (Actually, when I
> >> >> >> used 0.6.0, I also disabled this config.)
> >> >> >>
> >> >> >> Since I use my PC to test whether my application works, I use a
> >> >> >> small graph. (It does have a lot of dangling nodes.)
> >> >> >>
> >> >> >> The dataset and the PageRank program are attached.
> >> >> >>
> >> >> >> Thank you very much.
> >> >> >>
> >> >> >>
> >> >> >> 2014-02-26 16:04 GMT+08:00 Edward J. Yoon <edwardyoon@apache.org>:
> >> >> >>
> >> >> >>> Hi Wang,
> >> >> >>>
> >> >> >>> Can you send me your input data so that I can debug?
> >> >> >>>
> >> >> >>> On Wed, Feb 26, 2014 at 4:55 PM, developer wang
> >> >> >>> <developer.pw@gmail.com>
> >> >> >>> wrote:
> >> >> >>> > Firstly, thank you very much for your reply.
> >> >> >>> >
> >> >> >>> > But in the log, I found "14/02/25 16:45:00 INFO
> >> >> >>> > graph.GraphJobRunner: 2918 vertices are loaded into
> >> >> >>> > localhost:60340".
> >> >> >>> > So it had finished the loading phase. Is this true?
> >> >> >>> >
> >> >> >>> > Another problem is this:
> >> >> >>> > I found that during loading, peers do not exchange vertices
> >> >> >>> > with each other as Hama 0.6.0 did.
> >> >> >>> > So how does Hama 0.6.3 solve the following problem: a peer loads
> >> >> >>> > a vertex which belongs to another peer? (For example, suppose
> >> >> >>> > there are 3 peers for this task and the partitioner is Hash;
> >> >> >>> > peer #1 loads vertex 2, but in 0.6.3 peer #1 does not send
> >> >> >>> > vertex 2 to peer #2.)
> >> >> >>> >
> >> >> >>> >
> >> >> >>> > 2014-02-26 15:46 GMT+08:00 Edward J. Yoon
> >> >> >>> > <edwardyoon@apache.org>:
> >> >> >>> >
> >> >> >>> >> > I tried PageRank with a small input of my own.
> >> >> >>> >>
> >> >> >>> >> Hi Wang,
> >> >> >>> >>
> >> >> >>> >> This error often occurs when there is a record conversion error.
> >> >> >>> >> So, you should check whether the vertex reader works correctly.
> >> >> >>> >>
> >> >> >>> >> And, I highly recommend that you use the latest TRUNK version [1]
> >> >> >>> >> if possible.
> >> >> >>> >>
> >> >> >>> >> 1. http://wiki.apache.org/hama/GettingStarted#Build_latest_version_from_source
> >> >> >>> >>
> >> >> >>> >> Thank you.
> >> >> >>> >>
> >> >> >>> >> On Wed, Feb 26, 2014 at 1:44 PM, developer wang
> >> >> >>> >> <developer.pw@gmail.com>
> >> >> >>> >> wrote:
> >> >> >>> >> > Hi, all.
> >> >> >>> >> >    I am Peng Wang, a student trying to use and learn Hama.
> >> >> >>> >> >
> >> >> >>> >> >    I cloned the development git repository of Hama.
> >> >> >>> >> >
> >> >> >>> >> >    I first tried the newest version among the tags, the tag
> >> >> >>> >> > 0.7.0-SNAPSHOT:
> >> >> >>> >> > commit bef419747695d15de8a1087f44028ee40571b5f9
> >> >> >>> >> > Author: Edward J. Yoon <edwardyoon@apache.org>
> >> >> >>> >> > Date:   Fri Mar 29 00:44:59 2013 +0000
> >> >> >>> >> >
> >> >> >>> >> >     [maven-release-plugin]  copy for tag 0.7.0-SNAPSHOT
> >> >> >>> >> >
> >> >> >>> >> >     git-svn-id: https://svn.apache.org/repos/asf/hama/tags/0.7.0-SNAPSHOT@1462366 13f79535-47bb-0310-9956-ffa450edef68
> >> >> >>> >> >
> >> >> >>> >> >   But the tag 0.6.3-RC3 is:
> >> >> >>> >> > commit c9526b1272c83d641332667ce5d81d7ccc94be06
> >> >> >>> >> > Author: Edward J. Yoon <edwardyoon@apache.org>
> >> >> >>> >> > Date:   Sun Oct 6 08:27:00 2013 +0000
> >> >> >>> >> >
> >> >> >>> >> >     [maven-release-plugin]  copy for tag 0.6.3-RC3
> >> >> >>> >> >
> >> >> >>> >> >     git-svn-id: https://svn.apache.org/repos/asf/hama/tags/0.6.3-RC3@1529594 13f79535-47bb-0310-9956-ffa450edef68
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> >    From the commit log, 0.7.0-SNAPSHOT is earlier than 0.6.3-RC3,
> >> >> >>> >> >    so I used 0.6.3-RC3 instead of 0.7.0-SNAPSHOT (but on the Hama
> >> >> >>> >> > website, 0.7.0-SNAPSHOT is the newest version).
> >> >> >>> >> >
> >> >> >>> >> >    Then I deployed Hama in Pseudo Distributed Mode on my desktop
> >> >> >>> >> > with 3 task runners.
> >> >> >>> >> >    I tried PageRank with a small input of my own.
> >> >> >>> >> >    But it fails. Its log is:
> >> >> >>> >> > java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 100128 vs. 1004
> >> >> >>> >> >         at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:306)
> >> >> >>> >> >         at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:254)
> >> >> >>> >> >         at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
> >> >> >>> >> >         at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:177)
> >> >> >>> >> >         at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146)
> >> >> >>> >> >         at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1246)
> >> >> >>> >> >
> >> >> >>> >> >    Could you tell me what the problem is in my situation?
> >> >> >>> >> >
> >> >> >>> >> >    I checked whether Hama had finished the loading phase, and I
> >> >> >>> >> > found "14/02/25 16:45:00 INFO graph.GraphJobRunner: 2918 vertices
> >> >> >>> >> > are loaded into localhost:60340" in the log.
> >> >> >>> >> >    So it had finished the loading phase.
> >> >> >>> >> >
> >> >> >>> >> >    After this, I read the source code and found that during
> >> >> >>> >> > loading, peers do not exchange vertices with each other as Hama
> >> >> >>> >> > 0.5.0 did.
> >> >> >>> >> >    So how does Hama 0.6.3 solve the following problem: a peer loads
> >> >> >>> >> > a vertex which belongs to another peer?
> >> >> >>> >> >
> >> >> >>> >> >    Could you tell me which branch or tag is a stable version?
> >> >> >>> >> >    And does it support fault tolerance for graph algorithms? If so,
> >> >> >>> >> > how can I get it?
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> --
> >> >> >>> >> Edward J. Yoon (@eddieyoon)
> >> >> >>> >> Chief Executive Officer
> >> >> >>> >> DataSayer, Inc.
