hama-user mailing list archives

From Roman Shapovalov <shapova...@graphics.cs.msu.su>
Subject Re: Problem initializing pipes in HamaStreaming
Date Fri, 27 Sep 2013 10:55:52 GMT
Martin,

> then you don't have started hdfs?

I have not started it manually, but it has been active:

NameNode '0.0.0.0:8020' (active)
Started:Wed Sep 25 18:54:42 EDT 2013
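
One way to double-check that a service is really accepting connections is to probe its port directly. The following is a minimal sketch, not part of Hama or Hadoop: `port_is_open` is a hypothetical helper, 8020 is the NameNode port reported above, and 40000 is the bsp.master.address port from the hama-site.xml quoted further down.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Port numbers are taken from this thread; adjust them to your setup.
    for name, port in (("NameNode", 8020), ("BSPMaster", 40000)):
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(name, port, state)
```

A port that is not reachable only shows that nothing is listening there; it does not say why the daemon is down.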

> Your hdfs should contain the following files:

It does.

> Without the default file system in hama-site.xml, it will not work.

Well, at least Hama (without streaming) worked, using the local file system.
It seems Streaming could not find the Python files because it looked for
them in the local file system.
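
To see which file system a given hama-site.xml selects, one can read the fs.default.name property from it. This is a minimal sketch, assuming the Hadoop-style configuration/property layout shown in the quoted hama-site.xml below; `read_property` and `SAMPLE` are hypothetical names used only for illustration, and the sample mirrors a file where only bsp.master.address was added.

```python
import xml.etree.ElementTree as ET


def read_property(xml_text, name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Illustrative file content: bsp.master.address is set, fs.default.name is not.
SAMPLE = """<configuration>
  <property>
    <name>bsp.master.address</name>
    <value>localhost:40000</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    fs = read_property(SAMPLE, "fs.default.name")
    print("fs.default.name:", fs if fs is not None else "not set in this file")
```

If the property is missing from hama-site.xml, whatever default the installation ships with applies, which would be consistent with the local file system being used here.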

Roman

On Fri, Sep 27, 2013 at 6:30 AM, Martin Illecker <millecker@apache.org> wrote:
> Hi Roman,
>
> then you don't have started hdfs? (start-dfs.sh)
>
> Are you able to access the hdfs namenode?
> http://localhost:50070/dfshealth.jsp
>
> Your hdfs should contain the following files:
>
> $ hadoop fs -ls /tmp/PyStreaming/
> Found 8 items
> -rw-r--r--   279 2013-09-27 12:19 /tmp/PyStreaming/BSP.py
> -rw-r--r--   5159 2013-09-27 12:19 /tmp/PyStreaming/BSPPeer.py
> -rw-r--r--   379 2013-09-27 12:19 /tmp/PyStreaming/BSPRunner.py
> -rw-r--r--   970 2013-09-27 12:19 /tmp/PyStreaming/BinaryProtocol.py
> -rw-r--r--   299 2013-09-27 12:19 /tmp/PyStreaming/BspJobConfiguration.py
> -rw-r--r--   557 2013-09-27 12:19 /tmp/PyStreaming/HelloWorldBSP.py
> -rw-r--r--   5570 2013-09-27 12:19 /tmp/PyStreaming/KMeansBSP.py
> -rw-r--r--   326 2013-09-27 12:19 /tmp/PyStreaming/README
>
> Without the default file system in hama-site.xml, it will not work.
>
> Martin
>
>
> 2013/9/27 Roman Shapovalov <shapovalov@graphics.cs.msu.su>
>
>> Martin,
>>
>> if I set default file system to hdfs://localhost/, I get the connection
>> error:
>>
>> 13/09/27 14:04:11 INFO ipc.Client: Retrying connect to server:
>> localhost/127.0.0.1:40000. Already tried 0 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1
>> SECONDS)
>>
>> (and 10 times like that, then I get a java.net.ConnectException).
>>
>> I attach the hama-site.xml (as it was before adding the default fs
>> property). I had only added the bsp.master.address property to switch
>> to the pseudo-distributed mode.
>>
>> Roman
>>
>> On Fri, Sep 27, 2013 at 4:20 AM, Martin Illecker <martin@illecker.at>
>> wrote:
>> > Hi Roman!
>> >
>> > Did you set up the default filesystem in hama-site.xml?
>> >
>> > Please submit your hama-site.xml configuration.
>> >
>> > Martin
>> >
>> >
>> > hama-site.xml - pseudo-distributed mode
>> >
>> > <configuration>
>> >
>> >     <property>
>> >         <name>bsp.master.address</name>
>> >         <value>localhost:40000</value>
>> >         <description>The address of the bsp master server. Either the
>> >             literal string "local" or a host:port for distributed mode
>> >         </description>
>> >     </property>
>> >
>> >     <property>
>> >         <name>fs.default.name</name>
>> >         <value>hdfs://localhost/</value>
>> >         <description>
>> >             The name of the default file system. Either the literal string
>> >             "local" or a host:port for HDFS.
>> >         </description>
>> >     </property>
>> >
>> >     <property>
>> >         <name>hama.zookeeper.quorum</name>
>> >         <value>localhost</value>
>> >         <description>Comma separated list of servers in the ZooKeeper Quorum.
>> >             For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>> >             By default this is set to localhost for local and pseudo-distributed modes
>> >             of operation. For a fully-distributed setup, this should be set to a full
>> >             list of ZooKeeper quorum servers. If HAMA_MANAGES_ZK is set in hama-env.sh
>> >             this is the list of servers which we will start/stop zookeeper on.
>> >         </description>
>> >     </property>
>> >
>> > </configuration>
>> >
>> >
>> > On 27.09.2013 at 09:32, Roman Shapovalov <shapovalov@graphics.cs.msu.su> wrote:
>> >
>> >> Edward,
>> >>
>> >> Yes, I did. See the logs in my previous message.
>> >>
>> >> Roman
>> >>
>> >> On Fri, Sep 27, 2013 at 7:15 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>> >>> Have you tried to run in pseudo-distributed mode?
>> >>>
>> >>> On Fri, Sep 27, 2013 at 5:47 AM, Roman Shapovalov
>> >>> <shapovalov@graphics.cs.msu.su> wrote:
>> >>>> Martin,
>> >>>>
>> >>>> Thanks for such verbose instructions.
>> >>>>
>> >>>>> You can find all Hama configuration files in the *conf* folder.
>> >>>>
>> >>>> OK, I thought Edward meant Hadoop configs specifically.
>> >>>> I have only added the JAVA_HOME variable there; otherwise they are
>> >>>> the defaults.
>> >>>>
>> >>>>> You should also find task logs in your *temp* folder.
>> >>>>
>> >>>> I found the folder, but there were no .log files in the attempt*
>> >>>> folders (in both modes).
>> >>>>
>> >>>>> Normally you should find it in *hama/logs/tasklogs*.
>> >>>>
>> >>>> They appear in the pseudo-distributed mode only (which also fails).
>> >>>> See the attached file.
>> >>>>
>> >>>>> By the way do you have python3.2 installed? :-)
>> >>>>
>> >>>> Yes. "python" links to Python 2.6, but I pass "python3.2" as an
>> >>>> interpreter, which links to the correct version.
>> >>>>
>> >>>>
>> >>>> Roman
>> >>>>
>> >>>> On Thu, Sep 26, 2013 at 4:03 PM, Martin Illecker <millecker@apache.org> wrote:
>> >>>>> Hi Roman,
>> >>>>>
>> >>>>> if you are running Hama in local mode, it will not use HDFS anyway.
>> >>>>>
>> >>>>> You can find all Hama configuration files in the *conf* folder.
>> >>>>>
>> >>>>> $ ll hama/conf/
>> >>>>> total 56
>> >>>>> -rwxr-xr-x groomservers*
>> >>>>> -rwxr-xr-x hama-default.xml*
>> >>>>> -rwxr-xr-x hama-env.sh*
>> >>>>> -rwxr-xr-x hama-site.xml*
>> >>>>> -rwxr-xr-x log4j.properties*
>> >>>>>
>> >>>>> You should probably set up the Pseudo Distributed Mode [1] in hama-site.xml.
>> >>>>>
>> >>>>> But the task log would be very interesting.
>> >>>>>
>> >>>>> Normally you should find it in *hama/logs/tasklogs*.
>> >>>>> e.g., hama/logs/tasklogs/job_201309262134_0001/attempt_201309262134_0001_000000_0.log
>> >>>>>
>> >>>>> You should also find task logs in your *temp* folder.
>> >>>>> But this location will depend on your operating system.
>> >>>>> e.g., on OS X:
>> >>>>> /private/tmp/hadoop-YOURUSER/bsp/local/groomServer/attempt_201309262134_0001_000000_0/work/tasklogs/
>> >>>>>
>> >>>>> By the way do you have python3.2 installed? :-)
>> >>>>> $ python --version
>> >>>>> Python 3.2.5
>> >>>>> $ python3.2 --version
>> >>>>> Python 3.2.5
>> >>>>>
>> >>>>> May I ask which operating system you use?
>> >>>>>
>> >>>>> Martin
>> >>>>>
>> >>>>> [1] http://wiki.apache.org/hama/GettingStarted#Pseudo_Distributed_Mode
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> 2013/9/26 Roman Shapovalov <shapovalov@graphics.cs.msu.su>
>> >>>>>
>> >>>>>> Hi Edward,
>> >>>>>>
>> >>>>>> Could you please be more specific? (Sorry, I am new to this stuff)
>> >>>>>>
>> >>>>>> I run Hama in local mode. The logs/ directory is empty, and I did not
>> >>>>>> find any logs in HDFS either.
>> >>>>>>
>> >>>>>> And where can I find the Hadoop configuration?
>> >>>>>>
>> >>>>>> Thank you,
>> >>>>>> Roman
>> >>>>>>
>> >>>>>> On Thu, Sep 26, 2013 at 12:05 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> That's strange. Can you attach your namenode logs and hadoop configurations?
>> >>>>>>>
>> >>>>>>> On Thu, Sep 26, 2013 at 11:03 PM, Roman Shapovalov
>> >>>>>>> <shapovalov@graphics.cs.msu.su> wrote:
>> >>>>>>>> Hi again,
>> >>>>>>>>
>> >>>>>>>> I have updated both Hama (from the trunk) and Streaming (from Martin's
>> >>>>>>>> github), and checked that patches have been applied, but I keep
>> >>>>>>>> getting the same error (full log for local configuration is attached).
>> >>>>>>>>
>> >>>>>>>> Another thing may be relevant: I keep the default Hadoop libraries in
>> >>>>>>>> lib/. If I replace them as the tutorial says, some classes cannot be
>> >>>>>>>> found even if I run pure Hama (which works perfectly with default
>> >>>>>>>> libs). I don't know if it is important.
>> >>>>>>>>
>> >>>>>>>> Thanks,
>> >>>>>>>> Roman
>> >>>>>>>>
>> >>>>>>>> On Tue, Sep 24, 2013 at 9:22 AM, Martin Illecker <millecker@apache.org> wrote:
>> >>>>>>>>> Hi Roman,
>> >>>>>>>>>
>> >>>>>>>>> sorry for the inconvenience!
>> >>>>>>>>> The problem has been reported [1] and will be fixed shortly in the trunk.
>> >>>>>>>>>
>> >>>>>>>>> [1] https://issues.apache.org/jira/browse/HAMA-805
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> 2013/9/23 Edward J. Yoon <edwardyoon@apache.org>
>> >>>>>>>>>
>> >>>>>>>>>> This looks like a bug in DistCacheUtils.
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks for your report. I'll look at it tomorrow.
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Sep 23, 2013 at 11:52 PM, Roman Shapovalov
>> >>>>>>>>>> <shapovalov@graphics.cs.msu.su> wrote:
>> >>>>>>>>>>> Hello all,
>> >>>>>>>>>>>
>> >>>>>>>>>>> I am trying to use Hama Streaming.
>> >>>>>>>>>>> I have successfully installed Hama (the Pi example works).
>> >>>>>>>>>>> I am following this tutorial:
>> >>>>>>>>>>> http://wiki.apache.org/hama/HamaStreaming
>> >>>>>>>>>>>
>> >>>>>>>>>>> When I try to run the distributed HelloWorld in the local
>> >>>>>>>>>>> configuration, I get the following error:
>> >>>>>>>>>>>
>> >>>>>>>>>>> $ bin/hama pipes -streaming true -bspTasks 3 -interpreter python3.2
>> >>>>>>>>>>> -cachefiles /tmp/PyStreaming/*.py -output /tmp/pystream-out/ -program
>> >>>>>>>>>>> /tmp/PyStreaming/BSPRunner.py -programArgs HelloWorldBSP
>> >>>>>>>>>>>
>> >>>>>>>>>>> 13/09/23 18:03:50 INFO pipes.Submitter: Streaming enabled!
>> >>>>>>>>>>> 13/09/23 18:03:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> >>>>>>>>>>> 13/09/23 18:03:50 WARN bsp.BSPJobClient: No job jar file set.  User classes may not be found. See BSPJob#setJar(String) or check Your jar file.
>> >>>>>>>>>>> 13/09/23 18:03:50 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>> >>>>>>>>>>> 13/09/23 18:03:50 INFO bsp.LocalBSPRunner: Setting up a new barrier for 3 tasks!
>> >>>>>>>>>>> 13/09/23 18:03:50 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
>> >>>>>>>>>>> java.lang.NullPointerException
>> >>>>>>>>>>>    at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:44)
>> >>>>>>>>>>>    at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:255)
>> >>>>>>>>>>>    at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
>> >>>>>>>>>>>    at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
>> >>>>>>>>>>>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >>>>>>>>>>>    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >>>>>>>>>>>    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>> >>>>>>>>>>>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >>>>>>>>>>>    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >>>>>>>>>>>    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >>>>>>>>>>>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >>>>>>>>>>>    at java.lang.Thread.run(Thread.java:662)
>> >>>>>>>>>>> [output cropped]
>> >>>>>>>>>>>
>> >>>>>>>>>>> When I switch to the pseudo-distributed mode, the job fails too (after a
>> >>>>>>>>>>> minute of execution):
>> >>>>>>>>>>>
>> >>>>>>>>>>> 13/09/23 18:46:34 INFO pipes.Submitter: Streaming enabled!
>> >>>>>>>>>>> 13/09/23 18:46:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> >>>>>>>>>>> 13/09/23 18:46:34 WARN bsp.BSPJobClient: No job jar file set.  User classes may not be found. See BSPJob#setJar(String) or check Your jar file.
>> >>>>>>>>>>> 13/09/23 18:46:34 INFO bsp.BSPJobClient: Running job: job_201309231846_0001
>> >>>>>>>>>>> 13/09/23 18:47:40 INFO bsp.BSPJobClient: Job failed.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Task log contains errors:
>> >>>>>>>>>>>
>> >>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: Starting Socket Reader #1 for port 43475
>> >>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server Responder: starting
>> >>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server listener on 43475: starting
>> >>>>>>>>>>> 13/09/23 18:46:37 INFO message.HadoopMessageManagerImpl: BSPPeer address:localhost.localdomain port:43475
>> >>>>>>>>>>> 13/09/23 18:46:37 INFO ipc.Server: IPC Server handler 0 on 43475: starting
>> >>>>>>>>>>> 13/09/23 18:46:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> >>>>>>>>>>> 13/09/23 18:46:37 INFO sync.ZKSyncClient: Initializing ZK Sync Client
>> >>>>>>>>>>> 13/09/23 18:46:37 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At localhost.localdomain/127.0.0.1:43475
>> >>>>>>>>>>> 13/09/23 18:46:37 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
>> >>>>>>>>>>> java.lang.NullPointerException
>> >>>>>>>>>>>    at java.io.File.<init>(File.java:222)
>> >>>>>>>>>>>    at org.apache.hama.pipes.PipesApplication.setupCommand(PipesApplication.java:130)
>> >>>>>>>>>>>    at org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:257)
>> >>>>>>>>>>>    at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:44)
>> >>>>>>>>>>>    at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:176)
>> >>>>>>>>>>>    at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146)
>> >>>>>>>>>>>    at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1246)
>> >>>>>>>>>>> [output cropped]
>> >>>>>>>>>>>
>> >>>>>>>>>>> I use the latest trunk version of Hama, Python 3.2.5 and Hadoop
>> >>>>>>>>>>> 2.0.0-cdh4.1.1.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Please help me to figure out the problem.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks in advance,
>> >>>>>>>>>>> Roman
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>> Best Regards, Edward J. Yoon
>> >>>>>>>>>> @eddieyoon
>> >>>>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Best Regards, Edward J. Yoon
>> >>>>>>> @eddieyoon
>> >>>>>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Best Regards, Edward J. Yoon
>> >>> @eddieyoon
>> >
>>
