hama-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Why my distributed mode does not work?
Date Sat, 20 Apr 2013 12:58:08 GMT
P.S., please see our official wiki.
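
Since the repair option is gone, one way to get a complete graph is to add the missing leaf vertices yourself while preparing the input. Below is a minimal sketch in plain Java. It uses no Hama API, the class and method names are hypothetical, and it assumes Java 8 or later and the "<subject> <predicate> <object> ." line format from the original question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hama: builds a "complete" adjacency list
// in which every vertex referenced by an edge also exists as a vertex.
public class GraphCompleter {

    // Splits "<s> <p> <o> ." into {s, p, o}; returns null for blank or malformed lines.
    static String[] parseTriple(String line) {
        String trimmed = line.trim();
        if (!trimmed.endsWith(".")) {
            return null;
        }
        String body = trimmed.substring(0, trimmed.length() - 1).trim();
        String[] parts = body.split("\\s+", 3);
        return parts.length == 3 ? parts : null;
    }

    // Maps each vertex to the list of vertices it points to.
    // The predicate is dropped here; it could be kept as an edge label instead.
    static Map<String, List<String>> buildCompleteGraph(List<String> lines) {
        Map<String, List<String>> adjacency = new LinkedHashMap<>();
        for (String line : lines) {
            String[] triple = parseTriple(line);
            if (triple == null) {
                continue; // skip lines that are not triples
            }
            adjacency.computeIfAbsent(triple[0], k -> new ArrayList<>()).add(triple[2]);
            // The "repair" step: an object that never appears as a subject
            // still gets its own vertex with an empty edge list.
            adjacency.computeIfAbsent(triple[2], k -> new ArrayList<>());
        }
        return adjacency;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "<a> <knows> <b> .",
                "<b> <knows> <c> .");
        System.out.println(buildCompleteGraph(lines));
    }
}
```

Run as-is, the main method prints {<a>=[<b>], <b>=[<c>], <c>=[]}: the leaf <c> is only ever an object, yet it ends up as a vertex with no outgoing edges, so a lookup for it no longer returns null.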

On Sat, Apr 20, 2013 at 9:34 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> Yes, the repair function has been deleted.
>
>
> On Sat, Apr 20, 2013 at 7:26 PM, Lyu Xuedong <lxd.1990@gmail.com> wrote:
>> Hi, Edward,
>>
>> Thank you.
>> I updated the version, but now there is a new problem.
>>
>> In the /Apache Hama BSP Programming Model
>> (http://people.apache.org/~tjungblut/downloads/hamadocs/ApacheHamaBSPProgrammingmodel_06.pdf)/
>> there is a paragraph that describes 'Graph repair': "Hama requires a graph to be
>> completed before feeding it to an algorithm. By complete we mean that every
>> vertex that is referenced by an edge must somewhere be a vertex in the
>> graph. In many cases of leafs this is not always the case, therefore we have
>> added a repair functionality which is traversing the whole graph for leafs
>> and adding them to the vertex structure to prevent algorithms from breaking
>> with NullPointerExceptions when it does not find a referenced vertex. You
>> can turn this feature on by setting it in your configuration like this:
>> conf.setBoolean(GraphJobRunner.GRAPH_REPAIR, true);"
>>
>> I followed this guidance but got the error 'GRAPH_REPAIR cannot be resolved or
>> is not a field' in Eclipse. I read the source code later, and there really is
>> no variable named GRAPH_REPAIR in GraphJobRunner.java, which exists in
>> version 0.5.0. Is this function no longer supported? How
>> can I make a 'repaired graph'?
>> Thank you.
>>
>>
>> On 04/17/2013 12:51 PM, Edward J. Yoon wrote:
>>>
>>> Please use 0.6.1 and try your application with small data again.
>>>
>>> See also http://hama.apache.org/run_examples.html
>>>
>>> On Wed, Apr 17, 2013 at 10:27 AM, Lyu Xuedong <lxd.1990@gmail.com> wrote:
>>>>
>>>> hama: 0.6.0
>>>> hadoop : 1.0.4
>>>> JDK : 1.6
>>>> OS : ubuntu 12.04
>>>>
>>>>
>>>> On 04/17/2013 05:47 AM, Edward J. Yoon wrote:
>>>>>
>>>>> Your version?
>>>>>
>>>>> On Wed, Apr 17, 2013 at 12:07 AM, Lvxuedong <lxd.1990@gmail.com> wrote:
>>>>>>
>>>>>> Hi, Edward, thank you. But your suggestion does not seem to work. Do you
>>>>>> have any other advice?
>>>>>>
>>>>>> Is java.lang.NullPointerException related to heap size?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2013-4-16, at 22:13, "Edward J. Yoon" <edwardyoon@apache.org> wrote:
>>>>>>
>>>>>>> I guess you need to increase the JVM heap size of the child processes.
>>>>>>>
>>>>>>> - conf/hama-site.xml:
>>>>>>>
>>>>>>>    <property>
>>>>>>>      <name>bsp.child.java.opts</name>
>>>>>>>      <value>-Xmx2048m</value>
>>>>>>>    </property>
>>>>>>>
>>>>>>> On Tue, Apr 16, 2013 at 10:55 PM, Lyu Xuedong <lxd.1990@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> My project works well in Eclipse, but when I export it as a jar file
>>>>>>>> and submit it to a two-node Hama cluster, ERROR and FATAL messages
>>>>>>>> appear if the input file is over 64 MB.
>>>>>>>> I suspect that something in my Hadoop or Hama configuration files is
>>>>>>>> wrong, but a pi estimator runs normally on my cluster and
>>>>>>>> GroomServer$BSPPeerChild can be seen on each node. I debugged for a
>>>>>>>> whole day, but nothing improved.
>>>>>>>> My input file contains a large number of RDF triples: "<subject>
>>>>>>>> <predicate> <object> ." My task is to create vertices for subjects
>>>>>>>> and objects. Predicates are the subjects' edges.
>>>>>>>> What should I do?
>>>>>>>>
>>>>>>>> Terminal output:
>>>>>>>>
>>>>>>>> 13/04/16 21:13:36 INFO bgp.HamaBgpComplete: Job begain.
>>>>>>>> 13/04/16 21:13:37 INFO bsp.FileInputFormat: Total input paths to process : 2
>>>>>>>> 13/04/16 21:13:38 INFO bsp.BSPJobClient: Running job: job_201304161357_0015
>>>>>>>> 13/04/16 21:13:41 INFO bsp.BSPJobClient: Current supersteps number: 0
>>>>>>>> 13/04/16 21:13:47 INFO bsp.BSPJobClient: Current supersteps number: 2
>>>>>>>> 13/04/16 21:13:53 INFO bsp.BSPJobClient: Current supersteps number: 3
>>>>>>>> 13/04/16 21:13:59 INFO bsp.BSPJobClient: Current supersteps number: 4
>>>>>>>> 13/04/16 21:14:05 INFO bsp.BSPJobClient: Current supersteps number: 5
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:13:43 INFO sync.ZKSyncClient: Initializing ZK Sync Client
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:13:43 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At hadoop1/1.2.3.4:61002
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:13:43 INFO ipc.Server: Starting SocketReader
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:13:43 INFO ipc.Server: IPC Server Responder: starting
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:13:43 INFO ipc.Server: IPC Server handler 0 on 61002: starting
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:13:43 INFO message.HadoopMessageManagerImpl:  BSPPeer address:hadoop1 port:61002
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:13:43 INFO ipc.Server: IPC Server listener on 61002: starting
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:14:03 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
>>>>>>>> attempt_201304161357_0015_000000_0: java.lang.NullPointerException
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:14:04 INFO ipc.Server: Stopping server on 61002
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:14:04 INFO ipc.Server: IPC Server handler 0 on 61002: exiting
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:14:04 INFO ipc.Server: Stopping IPC Server listener on 61002
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:14:04 INFO ipc.Server: Stopping IPC Server Responder
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:14:04 INFO metrics.RpcInstrumentation: shut down
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:14:04 ERROR bsp.BSPTask: Shutting down ping service.
>>>>>>>> attempt_201304161357_0015_000000_0: 13/04/16 21:14:04 FATAL bsp.GroomServer: Error running child
>>>>>>>> attempt_201304161357_0015_000000_0: java.lang.NullPointerException
>>>>>>>> attempt_201304161357_0015_000000_0: java.lang.NullPointerException
>>>>>>>> 13/04/16 21:15:11 INFO bsp.BSPJobClient: Job failed.
>>>>>>>>
>>>>>>>> tasklogs:
>>>>>>>> attempt_201304161357_0015_000000_0.log
>>>>>>>> 13/04/16 21:13:43 INFO sync.ZKSyncClient: Initializing ZK Sync Client
>>>>>>>> 13/04/16 21:13:43 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At hadoop1/1.2.3.4:61002
>>>>>>>> 13/04/16 21:13:43 INFO ipc.Server: Starting SocketReader
>>>>>>>> 13/04/16 21:13:43 INFO ipc.Server: IPC Server Responder: starting
>>>>>>>> 13/04/16 21:13:43 INFO ipc.Server: IPC Server handler 0 on 61002: starting
>>>>>>>> 13/04/16 21:13:43 INFO message.HadoopMessageManagerImpl:  BSPPeer address:hadoop1 port:61002
>>>>>>>> 13/04/16 21:13:43 INFO ipc.Server: IPC Server listener on 61002: starting
>>>>>>>> 13/04/16 21:14:03 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
>>>>>>>> java.lang.NullPointerException
>>>>>>>> 13/04/16 21:14:04 INFO ipc.Server: Stopping server on 61002
>>>>>>>> 13/04/16 21:14:04 INFO ipc.Server: IPC Server handler 0 on 61002: exiting
>>>>>>>> 13/04/16 21:14:04 INFO ipc.Server: Stopping IPC Server listener on 61002
>>>>>>>> 13/04/16 21:14:04 INFO ipc.Server: Stopping IPC Server Responder
>>>>>>>> 13/04/16 21:14:04 INFO metrics.RpcInstrumentation: shut down
>>>>>>>> 13/04/16 21:14:04 ERROR bsp.BSPTask: Shutting down ping service.
>>>>>>>> 13/04/16 21:14:04 FATAL bsp.GroomServer: Error running child
>>>>>>>> java.lang.NullPointerException
>>>>>>>> java.lang.NullPointerException
>>>>>>>>
>>>>>>>> attempt_201304161357_0015_000001_0.log
>>>>>>>> 13/04/16 21:13:42 INFO sync.ZKSyncClient: Initializing ZK Sync Client
>>>>>>>> 13/04/16 21:13:42 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At hadoop1/1.2.3.4:61001
>>>>>>>> 13/04/16 21:13:42 ERROR sync.ZooKeeperSyncClientImpl: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /bsp/job_201304161357_0015/peers
>>>>>>>> 13/04/16 21:13:42 INFO ipc.Server: Starting SocketReader
>>>>>>>> 13/04/16 21:13:42 INFO ipc.Server: IPC Server Responder: starting
>>>>>>>> 13/04/16 21:13:42 INFO message.HadoopMessageManagerImpl:  BSPPeer address:hadoop1 port:61001
>>>>>>>> 13/04/16 21:13:42 INFO ipc.Server: IPC Server listener on 61001: starting
>>>>>>>> 13/04/16 21:13:42 INFO ipc.Server: IPC Server handler 0 on 61001: starting
>>>>>>>> 13/04/16 21:14:06 ERROR bsp.BSPPeerImpl: Error while sending messages
>>>>>>>> java.io.IOException: Call to hadoop1/1.2.3.4:61002 failed on local exception: java.io.EOFException
>>>>>>>>      at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
>>>>>>>>      at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>>      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>>      at $Proxy3.put(Unknown Source)
>>>>>>>>      at org.apache.hama.bsp.message.HadoopMessageManagerImpl.transfer(HadoopMessageManagerImpl.java:108)
>>>>>>>>      at org.apache.hama.bsp.BSPPeerImpl.sync(BSPPeerImpl.java:410)
>>>>>>>>      at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:118)
>>>>>>>>      at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:166)
>>>>>>>>      at org.apache.hama.bsp.BSPTask.run(BSPTask.java:143)
>>>>>>>>      at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1266)
>>>>>>>> Caused by: java.io.EOFException
>>>>>>>>      at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>>>>>>      at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
>>>>>>>>      at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards, Edward J. Yoon
>>>>>>> @eddieyoon
>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon
