hadoop-mapreduce-user mailing list archives

From Mohammad Tariq <donta...@gmail.com>
Subject Re: Error putting files in the HDFS
Date Tue, 08 Oct 2013 20:08:52 GMT
You are welcome, Basu.

Not a problem. You can use *bin/hadoop fs -lsr /* to list all the HDFS
files and directories. See which files are no longer required and delete
them using *bin/hadoop fs -rm /path/to/the/file*.
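
For example, a quick session could look like this (the paths under /user/root
are only placeholders for whatever -lsr shows as unneeded; -rmr removes a
whole directory recursively):

    bin/hadoop fs -lsr /
    bin/hadoop fs -rm /user/root/some-old-file
    bin/hadoop fs -rmr /user/root/some-old-directory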

Warm Regards,
Tariq
cloudfront.blogspot.com


On Tue, Oct 8, 2013 at 11:59 PM, Basu,Indrashish <indrashish@ufl.edu> wrote:

>
>
>
>
> Hi Tariq,
>
> Thanks a lot for your help.
>
> Can you please let me know the path where I can check the old files in
> HDFS and remove them accordingly? I am sorry to bother you with these
> questions; I am absolutely new to Hadoop.
>
> Thanks again for your time and patience.
>
>
>
> Regards,
>
> Indrashish
>
>
>
>
>
> On Tue, 8 Oct 2013 23:51:30 +0530, Mohammad Tariq wrote:
>
> You don't have any more space left in your HDFS. Delete some old data or
> add additional storage.
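>
> If you want to confirm how much space is actually free, two quick checks
> would be the dfsadmin report you already ran, plus df on the local disk that
> backs /app/hadoop (assuming that is where your dfs.data.dir points, as your
> datanode logs suggest):
>
> bin/hadoop dfsadmin -report
> df -h /app/hadoop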
>
>  Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Tue, Oct 8, 2013 at 11:47 PM, Basu,Indrashish <indrashish@ufl.edu> wrote:
>
>>
>>
>> Hi ,
>>
>> Just to update on this: I have deleted all the old logs and files from
>> the /tmp and /app/hadoop directories and restarted all the nodes. I now
>> have 1 datanode available, as per the information below:
>>
>> Configured Capacity: 3665985536 (3.41 GB)
>> Present Capacity: 24576 (24 KB)
>>
>> DFS Remaining: 0 (0 KB)
>> DFS Used: 24576 (24 KB)
>> DFS Used%: 100%
>>
>> Under replicated blocks: 0
>> Blocks with corrupt replicas: 0
>> Missing blocks: 0
>>
>> -------------------------------------------------
>> Datanodes available: 1 (1 total, 0 dead)
>>
>> Name: 10.227.56.195:50010
>> Decommission Status : Normal
>> Configured Capacity: 3665985536 (3.41 GB)
>> DFS Used: 24576 (24 KB)
>> Non DFS Used: 3665960960 (3.41 GB)
>> DFS Remaining: 0(0 KB)
>> DFS Used%: 0%
>> DFS Remaining%: 0%
>> Last contact: Tue Oct 08 11:12:19 PDT 2013
>>
>>
>> However, when I tried putting the files back into HDFS, I got the
>> same error as stated earlier. Do I need to clear some space for HDFS?
>>
>> Regards,
>> Indrashish
>>
>>
>>
>> On Tue, 08 Oct 2013 14:01:19 -0400, Basu,Indrashish wrote:
>>
>>> Hi Jitendra,
>>>
>>> This is what I am getting in the datanode logs:
>>>
>>> 2013-10-07 11:27:41,960 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>>> /app/hadoop/tmp/dfs/data is not formatted.
>>> 2013-10-07 11:27:41,961 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>> 2013-10-07 11:27:42,094 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>> FSDatasetStatusMBean
>>> 2013-10-07 11:27:42,099 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at
>>> 50010
>>> 2013-10-07 11:27:42,107 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is
>>> 1048576 bytes/s
>>> 2013-10-07 11:27:42,369 INFO org.mortbay.log: Logging to
>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>> org.mortbay.log.Slf4jLog
>>> 2013-10-07 11:27:42,632 INFO org.apache.hadoop.http.HttpServer: Port
>>> returned by webServer.getConnectors()[0].getLocalPort() before open()
>>> is -1. Opening the listener on 50075
>>> 2013-10-07 11:27:42,633 INFO org.apache.hadoop.http.HttpServer:
>>> listener.getLocalPort() returned 50075
>>> webServer.getConnectors()[0].getLocalPort() returned 50075
>>> 2013-10-07 11:27:42,634 INFO org.apache.hadoop.http.HttpServer: Jetty
>>> bound to port 50075
>>> 2013-10-07 11:27:42,634 INFO org.mortbay.log: jetty-6.1.14
>>> 2013-10-07 11:31:29,821 INFO org.mortbay.log: Started
>>> SelectChannelConnector@0.0.0.0:50075
>>> 2013-10-07 11:31:29,843 INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics
>>> with processName=DataNode, sessionId=null
>>> 2013-10-07 11:31:29,912 INFO
>>> org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics
>>> with hostName=DataNode, port=50020
>>> 2013-10-07 11:31:29,922 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> Responder: starting
>>> 2013-10-07 11:31:29,922 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> listener on 50020: starting
>>> 2013-10-07 11:31:29,933 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 50020: starting
>>> 2013-10-07 11:31:29,933 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 50020: starting
>>> 2013-10-07 11:31:29,933 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 50020: starting
>>> 2013-10-07 11:31:29,934 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration =
>>> DatanodeRegistration(tegra-ubuntu:50010, storageID=, infoPort=50075,
>>> ipcPort=50020)
>>> 2013-10-07 11:31:29,971 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: New storage id
>>> DS-1027334635-127.0.1.1-50010-1381170689938 is assigned to data-node
>>> 10.227.56.195:50010
>>> 2013-10-07 11:31:29,973 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> DatanodeRegistration(10.227.56.195:50010,
>>> storageID=DS-1027334635-127.0.1.1-50010-1381170689938, infoPort=50075,
>>> ipcPort=50020)In DataNode.run, data = FSDataset
>>> {dirpath='/app/hadoop/tmp/dfs/data/current'}
>>> 2013-10-07 11:31:29,974 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: using
>>> BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
>>> 2013-10-07 11:31:30,032 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0
>>> blocks got processed in 19 msecs
>>> 2013-10-07 11:31:30,035 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic
>>> block scanner.
>>> 2013-10-07 11:41:42,222 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0
>>> blocks got processed in 20 msecs
>>> 2013-10-07 12:41:43,482 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0
>>> blocks got processed in 22 msecs
>>> 2013-10-07 13:41:44,755 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0
>>> blocks got processed in 13 msecs
>>>
>>>
>>> I restarted the datanode and made sure that it is up and running
>>> (by typing the jps command).
>>>
>>> Regards,
>>> Indrashish
>>>
>>> On Tue, 8 Oct 2013 23:25:25 +0530, Jitendra Yadav wrote:
>>>
>>>> As per your dfs report, the available DataNode count is ZERO in your
>>>> cluster.
>>>>
>>>> Please check your data node logs.
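>>>>
>>>> For example, something like the following (assuming the default log
>>>> directory under your Hadoop home; the user and hostname parts of the
>>>> file name will differ on your setup):
>>>>
>>>> tail -n 100 logs/hadoop-root-datanode-tegra-ubuntu.log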
>>>>
>>>> Regards
>>>> Jitendra
>>>>
>>>> On 10/8/13, Basu,Indrashish <indrashish@ufl.edu> wrote:
>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> My name is Indrashish Basu and I am a Master's student in the Department
>>>>> of Electrical and Computer Engineering.
>>>>>
>>>>> Currently I am doing my research project on a Hadoop implementation on an
>>>>> ARM processor, and I am facing an issue while trying to run sample Hadoop
>>>>> source code on it. Every time I try to put some files into HDFS, I get
>>>>> the error below.
>>>>>
>>>>>
>>>>> 13/10/07 11:31:29 WARN hdfs.DFSClient: DataStreamer Exception:
>>>>> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
>>>>> /user/root/bin/cpu-kmeans2D could only be replicated to 0 nodes, instead of 1
>>>>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
>>>>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>>>>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>>>>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>>>>>
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:739)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>>>     at com.sun.proxy.$Proxy0.addBlock(Unknown Source)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>>>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>>>>     at com.sun.proxy.$Proxy0.addBlock(Unknown Source)
>>>>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2904)
>>>>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2786)
>>>>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
>>>>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
>>>>>
>>>>> 13/10/07 11:31:29 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
>>>>> 13/10/07 11:31:29 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/root/bin/cpu-kmeans2D" - Aborting...
>>>>> put: java.io.IOException: File /user/root/bin/cpu-kmeans2D could only be replicated to 0 nodes, instead of 1
>>>>>
>>>>>
>>>>> I tried recreating the namenode and datanode by deleting all the old
>>>>> logs on the master and the slave nodes, as well as the folders under
>>>>> /app/hadoop/, after which I formatted the namenode and started the
>>>>> processes again (bin/start-all.sh), but still had no luck.
>>>>>
>>>>> I tried generating the admin report (pasted below) after doing the
>>>>> restart; it seems the data node is not getting started.
>>>>>
>>>>>
>>>>> root@tegra-ubuntu:~/hadoop-gpu-master/hadoop-gpu-0.20.1# bin/hadoop dfsadmin -report
>>>>> Configured Capacity: 0 (0 KB)
>>>>> Present Capacity: 0 (0 KB)
>>>>> DFS Remaining: 0 (0 KB)
>>>>> DFS Used: 0 (0 KB)
>>>>> DFS Used%: �%
>>>>> Under replicated blocks: 0
>>>>> Blocks with corrupt replicas: 0
>>>>> Missing blocks: 0
>>>>>
>>>>> -------------------------------------------------
>>>>> Datanodes available: 0 (0 total, 0 dead)
>>>>>
>>>>>
>>>>> I have tried the following methods to debug the process (the exact
>>>>> commands are collected after the list):
>>>>>
>>>>> 1) I logged in to the Hadoop home directory and removed all the old
>>>>> logs (rm -rf logs/*)
>>>>>
>>>>> 2) Next I deleted the contents of the /app/hadoop directory on all my
>>>>> slave and master nodes (rm -rf /app/hadoop/*)
>>>>>
>>>>> 3) I formatted the namenode (bin/hadoop namenode -format)
>>>>>
>>>>> 4) I started all the processes: first the namenode and datanode, and
>>>>> then map-reduce. I typed jps on the terminal to ensure that all the
>>>>> processes (NameNode, DataNode, JobTracker, TaskTracker) are up and
>>>>> running.
>>>>>
>>>>> 5) Having done this, I recreated the directories in DFS.
>>>>>
>>>>> However, still no luck with the process.
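>>>>>
>>>>> Put together, the sequence of commands I ran looked roughly like this
>>>>> (from the Hadoop home directory; stopping the daemons before
>>>>> reformatting is implied):
>>>>>
>>>>> bin/stop-all.sh
>>>>> rm -rf logs/*
>>>>> rm -rf /app/hadoop/*        # on every master and slave node
>>>>> bin/hadoop namenode -format
>>>>> bin/start-all.sh
>>>>> jps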
>>>>>
>>>>>
>>>>> Can you kindly assist regarding this? I am new to Hadoop and have
>>>>> no idea how to proceed with this.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> --
>>>>> Indrashish Basu
>>>>> Graduate Student
>>>>> Department of Electrical and Computer Engineering
>>>>> University of Florida
>>>>>
>>>>>
>> --
>> Indrashish Basu
>> Graduate Student
>> Department of Electrical and Computer Engineering
>> University of Florida
>>
>
> --
>
> Indrashish Basu
> Graduate Student
> Department of Electrical and Computer Engineering
> University of Florida
>
>
