hadoop-common-user mailing list archives

From Ed Mazur <ma...@cs.umass.edu>
Subject Re: hadoop under cygwin issue
Date Wed, 03 Feb 2010 23:07:24 GMT
Brian,

It looks like you're confusing your local file system with HDFS. HDFS
sits on top of your file system and is where data for (non-standalone)
Hadoop jobs comes from. You can poll it with "fs -ls ...", so do
something like "hadoop fs -lsr /" to see everything in HDFS. This will
probably shed some light on why your first attempt failed.
/user/brian/input should be a directory with several xml files.
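
If it turns out the earlier "put" nested conf inside an existing input directory, something along these lines should give you a clean run (the paths are just an example -- adjust them to whatever -lsr shows):

bin/hadoop fs -rmr input            # remove the old input directory from HDFS
bin/hadoop fs -put conf input       # re-upload the local conf directory as input
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
bin/hadoop fs -cat output/*         # check the results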

Ed

On Wed, Feb 3, 2010 at 5:17 PM, Brian Wolf <brw314@gmail.com> wrote:
> Alex Kozlov wrote:
>>
>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>>
>> Your datanode is dead.  Look at the logs in the $HADOOP_HOME/logs directory
>> (or where your logs are) and check the errors.
>>
>> Alex K
>>
>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <brw314@gmail.com> wrote:
>>
>>
>
>
>
> Thanks for your help, Alex,
>
> I managed to get past that problem, but now I have a new one.
>
> However, when I try to run this example as stated on the quickstart webpage:
>
> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>
> I get this error:
> =============================================================
> java.io.IOException:       Not a file:
> hdfs://localhost:9000/user/brian/input/conf
> =========================================================
> so it seems to default to my home directory when looking for "input"; it
> apparently needs an absolute filepath. However, when I run it that way:
>
> $ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input
>  output 'dfs[a-z.]+'
>
> ==============================================================
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
> ==============================================================
> It still isn't happy, although this part -> /usr/local/hadoop-0.19.2/input
>  <- does exist.
>>>
>>> Aaron,
>>>
>>> Thanks for your help. I carefully went through the steps again a couple
>>> of times, and ran the following. First:
>>>
>>> bin/hadoop namenode -format
>>>
>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>
>>>
>>> then
>>>
>>>
>>> bin/start-dfs.sh
>>>
>>> and
>>>
>>> bin/start-all.sh
>>>
>>>
>>> and then
>>> bin/hadoop fs -put conf input
>>>
>>> now the output from this seemed cryptic:
>>>
>>>
>>> put: Target input/conf is a directory
>>>
>>> (??)
>>>
>>>  and when I tried
>>>
>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>
>>> It says something about 0 nodes
>>>
>>> (from log file)
>>>
>>> 2010-02-01 13:26:29,874 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create    src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar    dst=null    perm=brian:supergroup:rw-r--r--
>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9000, call addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar, DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException: File /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could only be replicated to 0 nodes, instead of 1
>>> java.io.IOException: File /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could only be replicated to 0 nodes, instead of 1
>>>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>  at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>
>>>
>>>
>>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>>
>>>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>
>>>
>>> and when I browse to http://localhost:50070/
>>>
>>>
>>>    Cluster Summary
>>>
>>> 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB / 992.31 MB (0%)
>>>
>>> Configured Capacity     :       0 KB
>>> DFS Used        :       0 KB
>>> Non DFS Used    :       0 KB
>>> DFS Remaining   :       0 KB
>>> DFS Used%       :       100 %
>>> DFS Remaining%  :       0 %
>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :       0
>>>
>>>
>>> so I'm still a bit in the dark, I guess.
>>>
>>> Thanks
>>> Brian
>>>
>>>
>>>
>>>
>>> Aaron Kimball wrote:
>>>
>>>
>>>>
>>>> Brian, it looks like you missed a step in the instructions. You'll need
>>>> to
>>>> format the hdfs filesystem instance before starting the NameNode server:
>>>>
>>>> You need to run:
>>>>
>>>> $ bin/hadoop namenode -format
>>>>
>>>> .. then you can do bin/start-dfs.sh
>>>> Hope this helps,
>>>> - Aaron
>>>>
>>>>
>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <brw314@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>>>> hadoop "quickstart" web page.
>>>>>
>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>
>>>>> This is from my hadoop-site.xml
>>>>>
>>>>> <configuration>
>>>>> <property>
>>>>> <name>hadoop.tmp.dir</name>
>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>fs.default.name</name>
>>>>> <value>hdfs://localhost:9000</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>mapred.job.tracker</name>
>>>>> <value>localhost:9001</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>> <value>-1</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>dfs.replication</name>
>>>>> <value>1</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>dfs.permissions</name>
>>>>> <value>false</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>webinterface.private.actions</name>
>>>>> <value>true</value>
>>>>> </property>
>>>>> </configuration>
>>>>>
>>>>> These are errors from my log files:
>>>>>
>>>>>
>>>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>> 2010-01-30 00:03:33,121 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:9000
>>>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>> 2010-01-30 00:03:33,181 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>>>>> 2010-01-30 00:03:34,603 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=brian,None,Administrators,Users
>>>>> 2010-01-30 00:03:34,603 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>>>>> 2010-01-30 00:03:34,603 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=false
>>>>> 2010-01-30 00:03:34,653 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
>>>>> 2010-01-30 00:03:34,653 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
>>>>> 2010-01-30 00:03:34,803 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>> 2010-01-30 00:03:34,813 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent state: storage directory does not exist or is not accessible.
>>>>>  at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>  at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>  at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> =========================================================
>>>>>
>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>> problem cleaning system directory: null
>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused: no further information
>>>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>  at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>> Brian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
