hadoop-user mailing list archives

From Philippe Kernévez <pkerne...@octo.com>
Subject Re: Lots of warning messages and exception in namenode logs
Date Tue, 04 Jul 2017 09:19:18 GMT
Hi all,

>After setting *dfs.replication=2 *, I did a clean start of hdfs.
This should not have changed anything. The dfs.replication value is only used
for new files; existing files keep their own value.
https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
If you want to change the replication factor of existing files, you have to
use the setrep command:
https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#setrep
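
As a rough sketch of the setrep step, the Python snippet below builds the `hdfs dfs -setrep` command line and can optionally run it. The helper names `setrep_argv` and `apply_setrep` are hypothetical, and the path is only an example. The `-w` flag asks the shell to wait until the target replication is reached, which may take a long time on a large tree.

```python
import subprocess
from typing import List


def setrep_argv(path: str, replication: int, wait: bool = False) -> List[str]:
    """Build the argv for `hdfs dfs -setrep [-w] <replication> <path>`."""
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    argv = ["hdfs", "dfs", "-setrep"]
    if wait:
        argv.append("-w")  # block until the target replication is reached
    argv += [str(replication), path]
    return argv


def apply_setrep(path: str, replication: int, wait: bool = False) -> None:
    """Run the command; needs the `hdfs` client on PATH (not executed here)."""
    subprocess.run(setrep_argv(path, replication, wait), check=True)


if __name__ == "__main__":
    # On a directory, setrep applies recursively to every file beneath it.
    print(" ".join(setrep_argv("/user/hadoop", 2)))
```

Building the argument list separately from running it makes it easy to review the exact command before touching a live cluster.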

I think the real change was adding 2 more datanodes, which improved the
workload distribution.

Regards,
Philippe


On Thu, Jun 29, 2017 at 5:35 PM, Ravi Prakash <ravihadoop@gmail.com> wrote:

> Hi Omprakash!
>
> If both datanodes die at the same time, then yes, data will be lost. In
> that case, you should increase dfs.replication to 3 (so that there will be
> 3 copies). This obviously adversely affects the total amount of data you
> can store on HDFS.
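
To put a number on the two cases above, here is a small Python sketch. It assumes replicas sit on distinct datanodes picked uniformly at random and that the failed datanodes are an equally random subset; real HDFS placement is rack-aware, so treat this as an illustration only. The function name `block_loss_probability` is made up for the example.

```python
from fractions import Fraction
from math import comb


def block_loss_probability(datanodes: int, replication: int, failed: int) -> Fraction:
    """Chance that every replica of one block sits on the failed nodes.

    Assumption: replicas are on distinct datanodes chosen uniformly at
    random, and the failed nodes are a uniformly random subset.
    """
    if not 1 <= replication <= datanodes:
        raise ValueError("need 1 <= replication <= datanodes")
    if not 0 <= failed <= datanodes:
        raise ValueError("need 0 <= failed <= datanodes")
    if failed < replication:
        return Fraction(0)  # at least one replica is on a surviving node
    # Failure sets containing all replica holders, over all failure sets.
    return Fraction(comb(datanodes - replication, failed - replication),
                    comb(datanodes, failed))


if __name__ == "__main__":
    print(block_loss_probability(4, 2, 2))  # replication 2, two of four nodes lost
    print(block_loss_probability(4, 3, 2))  # replication 3, two of four nodes lost
```

With four datanodes, replication 2 leaves a one-in-six chance that a given block is on exactly the two nodes that fail, while replication 3 always leaves a surviving copy when only two nodes are lost.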
>
> However if only 1 datanode dies, the namenode notices that, and orders the
> remaining replica to be replicated. The rate at which it orders
> re-replication is determined by dfs.namenode.replication.work.multiplier.per.iteration
> and the number of nodes in your cluster. The more nodes you have in your
> cluster (some companies run 1000s of nodes in 1 cluster), the faster the
> lost replicas will be replicated. Let's say there were 2 million blocks on
> each datanode, and you configured only 2 blocks to be re-replicated per
> datanode heartbeat (usually 3 seconds). If there were 2 other datanodes, it
> would take 2000000 / 2 * 3 seconds to re-replicate data. Of course you can't
> crank up the number of blocks re-replicated too high, because there's only
> so much data that datanodes can transfer amongst themselves. You should
> calculate how many blocks you have, how much bandwidth is available between
> any two datanodes, how quickly you want replication (if your disks are only
> re-replicating, jobs may not make progress), and set that configuration
> accordingly. Depending on your datanode capacity it may take 1-2 days to
> re-replicate all the data.
>
> Also, I'd encourage you to read through more of the documentation
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
> and become familiar with the system. There can be a *huge* difference
> between a well-tuned Hadoop cluster and a poorly configured one.
>
> HTH
> Ravi
>
>
> On Thu, Jun 29, 2017 at 4:50 AM, omprakash <omprakashp@cdac.in> wrote:
>
>> Hi Sidharth,
>>
>>
>>
>> Thanks a lot for the clarification. Could you suggest parameters that can
>> improve re-replication in case of failure?
>>
>>
>>
>> Regards
>>
>> Om
>>
>>
>>
>> *From:* Sidharth Kumar [mailto:sidharthkumar2707@gmail.com]
>> *Sent:* 29 June 2017 16:06
>> *To:* omprakash <omprakashp@cdac.in>
>> *Cc:* Arpit Agarwal <aagarwal@hortonworks.com>;
>> common-user@hadoop.apache.org <user@hadoop.apache.org>; Ravi Prakash <
>> ravihadoop@gmail.com>
>>
>> *Subject:* RE: Lots of warning messages and exception in namenode logs
>>
>>
>>
>> Hi,
>>
>>
>>
>> No, as no copy of that file will exist. You can increase the
>> replication factor to 3 so that 3 copies are created; even if
>> 2 datanodes go down, you will still have one copy available, which the
>> namenode will replicate back to 3 in due course.
>>
>>
>> Warm Regards
>>
>> Sidharth Kumar | Mob: +91 8197 555 599 / 7892 192 367 |
>> LinkedIn: www.linkedin.com/in/sidharthkumar2792
>>
>>
>>
>>
>>
>>
>>
>>
>> On 29-Jun-2017 3:45 PM, "omprakash" <omprakashp@cdac.in> wrote:
>>
>> Hi Ravi,
>>
>>
>>
>> I have 5 nodes in the Hadoop cluster, all with the same configuration. After
>> setting *dfs.replication=2*, I did a clean start of HDFS.
>>
>>
>>
>> As per your suggestion, I added 2 more datanodes and cleaned all the data
>> and metadata. The performance of the cluster has improved dramatically. I
>> can see from the logs that the files are randomly replicated across four
>> datanodes (2 replicas of each file).
>>
>>
>>
>> But here my problem arises. I want redundant datanodes such that if any
>> two of the datanodes go down, I can still get the files from the other two.
>> In the above case, suppose block-xyz is stored on datanode1 and datanode2,
>> and some day these two datanodes go down. Will I still be able to access
>> block-xyz? This is what I am worried about.
>>
>>
>>
>>
>>
>> Regards
>>
>> Om
>>
>>
>>
>>
>>
>> *From:* Ravi Prakash [mailto:ravihadoop@gmail.com]
>> *Sent:* 27 June 2017 22:36
>> *To:* omprakash <omprakashp@cdac.in>
>> *Cc:* Arpit Agarwal <aagarwal@hortonworks.com>; user <
>> user@hadoop.apache.org>
>> *Subject:* Re: Lots of warning messages and exception in namenode logs
>>
>>
>>
>> Hi Omprakash!
>>
>> This is *not* ok. Please go through the datanode logs of the inactive
>> datanode and figure out why it's inactive. If you set dfs.replication to 2,
>> at least as many datanodes (and ideally a LOT more datanodes) should be
>> active and participating in the cluster.
>>
>> Do you have the hdfs-site.xml you posted to the mailing list on all the
>> nodes (including the Namenode)? Was the file containing block
>> *blk_1074074104_337394* created when you had the cluster misconfigured
>> to dfs.replication=3 ? You can determine which file the block belongs to
>> using this command:
>>
>> hdfs fsck -blockId blk_1074074104
>>
>> Once you have the file, you can set its replication using
>> hdfs dfs -setrep 2 <Filename>
>>
>> I'm guessing that you probably have a lot of files with this replication,
>> in which case you should set it on / (This would overwrite the replication
>> on all the files)
>>
>>
>>
>> If the data on this cluster is important I would be very worried about
>> the condition it's in.
>>
>> HTH
>>
>> Ravi
>>
>>
>>
>> On Mon, Jun 26, 2017 at 11:22 PM, omprakash <omprakashp@cdac.in> wrote:
>>
>> Hi all,
>>
>>
>>
>> I started HDFS in DEBUG mode. After examining the logs I found the entries
>> below, which say that the required replication factor is 3 (as opposed to
>> the specified *dfs.replication=2*).
>>
>>
>>
>> *DEBUG BlockStateChange: BLOCK* NameSystem.UnderReplicationBlock.add:
>> blk_1074074104_337394 has only 1 replicas and need 3 replicas so is added
>> to neededReplications at priority level 0*
>>
>>
>>
>> *P.S : I have 1 datanode active out of 2. *
>>
>>
>>
>> I can also see from the Namenode UI that the number of under-replicated
>> blocks is growing.
>>
>>
>>
>> Any idea? Or is this OK?
>>
>>
>>
>> regards
>>
>>
>>
>>
>>
>> *From:* omprakash [mailto:omprakashp@cdac.in]
>> *Sent:* 23 June 2017 11:02
>> *To:* 'Ravi Prakash' <ravihadoop@gmail.com>; 'Arpit Agarwal' <
>> aagarwal@hortonworks.com>
>> *Cc:* 'user' <user@hadoop.apache.org>
>> *Subject:* RE: Lots of warning messages and exception in namenode logs
>>
>>
>>
>> Hi Arpit,
>>
>>
>>
>> I will enable the settings as suggested and will post the results.
>>
>>
>>
>> I am just curious about setting the *Namenode RPC service port*. From the
>> *hdfs-site.xml* properties I checked, *dfs.namenode.rpc-address* is
>> already set, and its value is also the default for the RPC service port.
>> Does specifying a different port have any advantage over the default?
>>
>>
>>
>> Regarding the JvmPauseMonitor error, there are 5-6 instances of it in the
>> namenode logs. Here is one of them.
>>
>>
>>
>> How do I determine the heap size in such cases, given that I have 4 GB of
>> RAM on the namenode VM?
>>
>>
>>
>> *@Ravi* Since the files are very small, I have only configured a
>> VM with 20 GB of space. The additional disk is a simple SATA disk, not an SSD.
>>
>>
>>
>> As I can see from the Namenode UI, more than 50% of the blocks are
>> under-replicated. I now have 400K blocks, of which 200K are under-replicated.
>>
>> I will post the results again after changing the value of
>> *dfs.namenode.replication.work.multiplier.per.iteration*
>>
>>
>>
>>
>>
>> Thanks
>>
>> Om Prakash
>>
>>
>>
>> *From:* Ravi Prakash [mailto:ravihadoop@gmail.com]
>>
>> *Sent:* 22 June 2017 23:04
>> *To:* Arpit Agarwal <aagarwal@hortonworks.com>
>> *Cc:* omprakash <omprakashp@cdac.in>; user <user@hadoop.apache.org>
>>
>>
>> *Subject:* Re: Lots of warning messages and exception in namenode logs
>>
>>
>>
>> Hi Omprakash!
>>
>> How big are your disks? Just 20 GB? Just out of curiosity, are these SSDs?
>>
>> In addition to Arpit's reply, I'm also concerned with the number of
>> under-replicated blocks you have: Under replicated blocks: 141863
>>
>> When there are fewer replicas for a block than there are supposed to be
>> (in your case e.g. when there's 1 replica when there ought to be 2), the
>> namenode will order the datanodes to create more replicas. The rate at
>> which it does this is controlled by
>> dfs.namenode.replication.work.multiplier.per.iteration . Given you have
>> only 2 datanodes, you'll only be re-replicating 4 blocks every 3 seconds.
>> So, it will take quite a while to re-replicate all the blocks.
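
The estimate above can be sketched in Python as follows. The function name `rereplication_seconds` is hypothetical. The model (datanodes times multiplier blocks scheduled per 3-second interval, with a multiplier of 2) is taken from the figures in this message and ignores network and disk limits, so it is best read as a lower bound.

```python
def rereplication_seconds(blocks: int, datanodes: int,
                          multiplier: int = 2, interval_s: float = 3.0) -> float:
    """Rough lower bound on the time to schedule `blocks` re-replications.

    Assumption: each interval the namenode schedules about
    datanodes * multiplier blocks; transfer time is not modelled.
    """
    if blocks < 0 or datanodes < 1 or multiplier < 1:
        raise ValueError("blocks must be >= 0; datanodes and multiplier >= 1")
    per_interval = datanodes * multiplier
    intervals = -(-blocks // per_interval)  # ceiling division on integers
    return intervals * interval_s


if __name__ == "__main__":
    # 141863 under-replicated blocks and 2 datanodes, as reported in the thread.
    secs = rereplication_seconds(141863, datanodes=2)
    print(f"{secs:.0f} seconds, about {secs / 3600:.1f} hours")
```

Raising the multiplier or adding datanodes shortens the estimate proportionally, which matches the advice that larger clusters recover lost replicas faster.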
>>
>> Also, please note that you want files to be much bigger than 1 KB. Ideally
>> you'd have a couple of blocks (block size = 128 MB) for each file. You should
>> append to existing files when the writes are this small.
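
A rough Python sketch of why appending helps, assuming the 128 MB block size mentioned above and the roughly 1 KB, one-file-per-second workload described earlier in the thread. `blocks_needed` is a hypothetical helper; it only counts blocks and does not model namenode heap usage.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # bytes, the block size quoted in the message
RECORD_SIZE = 1024              # bytes, the approximate size of each small file


def blocks_needed(records: int, record_size: int = RECORD_SIZE,
                  block_size: int = BLOCK_SIZE, append: bool = True) -> int:
    """Number of blocks the namenode has to track for `records` writes."""
    if records < 0 or record_size < 1 or block_size < 1:
        raise ValueError("records must be >= 0; sizes must be positive")
    if not append:
        # One file per record: every file occupies at least one block entry.
        per_file = -(-record_size // block_size)  # ceiling division
        return records * per_file
    # Appending: records are packed back to back into full-size blocks.
    total_bytes = records * record_size
    return -(-total_bytes // block_size)


if __name__ == "__main__":
    one_day = 24 * 60 * 60  # one write per second for a day
    print(blocks_needed(one_day, append=False), "blocks as separate files")
    print(blocks_needed(one_day, append=True), "block(s) when appended")
```

Under these assumptions a full day of one-per-second writes still fits inside a single 128 MB block when appended, instead of creating one block entry per write.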
>>
>> Please do let us know how things turn out.
>>
>> Cheers,
>>
>> Ravi
>>
>>
>>
>> On Wed, Jun 21, 2017 at 11:23 PM, Arpit Agarwal <aagarwal@hortonworks.com>
>> wrote:
>>
>> Hi Omprakash,
>>
>>
>>
>> Your description suggests DataNodes cannot send timely reports to the
>> NameNode. You can check it by looking for ‘stale’ DataNodes in the NN web
>> UI when this situation is occurring. A few ideas:
>>
>>
>>
>>    - Try increasing the NameNode RPC handler count a bit (set
>>    dfs.namenode.handler.count to 20 in hdfs-site.xml).
>>    - Enable the NameNode service RPC port. This requires downtime and
>>    reformatting the ZKFC znode.
>>    - Search for JvmPauseMonitor messages in your service logs. If you
>>    see any, try increasing JVM heap for that service.
>>    - Enable debug logging as suggested here:
>>
>>
>>
>> *2017-06-21 12:11:30,626 WARN
>> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed
>> to place enough replicas, still in need of 1 to reach 2
>> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7,
>> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]},
>> newBlock=true) For more information, please enable DEBUG log level on
>> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and
>> org.apache.hadoop.net.NetworkTopology*
>>
>>
>>
>>
>>
>> *From: *omprakash <omprakashp@cdac.in>
>> *Date: *Wednesday, June 21, 2017 at 9:23 PM
>> *To: *'Ravi Prakash' <ravihadoop@gmail.com>
>> *Cc: *'user' <user@hadoop.apache.org>
>> *Subject: *RE: Lots of warning messages and exception in namenode logs
>>
>>
>>
>> Hi Ravi,
>>
>>
>>
>> Pasting my core-site and hdfs-site configurations below. I have kept the
>> configuration of my cluster to a bare minimum. The cluster started fine and
>> I was able to put a couple of hundred thousand files on HDFS, but when I
>> checked the logs there were errors/exceptions. After a restart the datanodes
>> work well for a few thousand files, but then the same problem appears again.
>> No idea what is wrong.
>>
>>
>>
>> *PS: I am pumping 1 file per second into HDFS, each approx. 1 KB in size*
>>
>>
>>
>> I thought it might be due to a space quota on the datanodes, but here is the
>> output of *hdfs dfsadmin -report*. It looks fine to me.
>>
>>
>>
>> $ hdfs dfsadmin -report
>>
>>
>>
>> Configured Capacity: 42005069824 (39.12 GB)
>>
>> Present Capacity: 38085839568 (35.47 GB)
>>
>> DFS Remaining: 34949058560 (32.55 GB)
>>
>> DFS Used: 3136781008 (2.92 GB)
>>
>> DFS Used%: 8.24%
>>
>> Under replicated blocks: 141863
>>
>> Blocks with corrupt replicas: 0
>>
>> Missing blocks: 0
>>
>> Missing blocks (with replication factor 1): 0
>>
>> Pending deletion blocks: 0
>>
>>
>>
>> -------------------------------------------------
>>
>> Live datanodes (2):
>>
>>
>>
>> Name: 192.168.9.174:50010 (node5)
>>
>> Hostname: node5
>>
>> Decommission Status : Normal
>>
>> Configured Capacity: 21002534912 (19.56 GB)
>>
>> DFS Used: 1764211024 (1.64 GB)
>>
>> Non DFS Used: 811509424 (773.92 MB)
>>
>> DFS Remaining: 17067913216 (15.90 GB)
>>
>> DFS Used%: 8.40%
>>
>> DFS Remaining%: 81.27%
>>
>> Configured Cache Capacity: 0 (0 B)
>>
>> Cache Used: 0 (0 B)
>>
>> Cache Remaining: 0 (0 B)
>>
>> Cache Used%: 100.00%
>>
>> Cache Remaining%: 0.00%
>>
>> Xceivers: 2
>>
>> Last contact: Wed Jun 21 14:38:17 IST 2017
>>
>>
>>
>>
>>
>> Name: 192.168.9.225:50010 (node4)
>>
>> Hostname: node5
>>
>> Decommission Status : Normal
>>
>> Configured Capacity: 21002534912 (19.56 GB)
>>
>> DFS Used: 1372569984 (1.28 GB)
>>
>> Non DFS Used: 658353792 (627.86 MB)
>>
>> DFS Remaining: 17881145344 (16.65 GB)
>>
>> DFS Used%: 6.54%
>>
>> DFS Remaining%: 85.14%
>>
>> Configured Cache Capacity: 0 (0 B)
>>
>> Cache Used: 0 (0 B)
>>
>> Cache Remaining: 0 (0 B)
>>
>> Cache Used%: 100.00%
>>
>> Cache Remaining%: 0.00%
>>
>> Xceivers: 1
>>
>> Last contact: Wed Jun 21 14:38:19 IST 2017
>>
>>
>>
>> *core-site.xml*
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <configuration>
>>
>> <property>
>>
>>   <name>fs.defaultFS</name>
>>
>>   <value>hdfs://hdfsCluster</value>
>>
>> </property>
>>
>> <property>
>>
>>   <name>dfs.journalnode.edits.dir</name>
>>
>>   <value>/mnt/hadoopData/hadoop/journal/node/local/data</value>
>>
>> </property>
>>
>> </configuration>
>>
>>
>>
>> *hdfs-site.xml*
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <configuration>
>>
>> <property>
>>
>> <name>dfs.replication</name>
>>
>> <value>2</value>
>>
>> </property>
>>
>> <property>
>>
>>   <name>dfs.name.dir</name>
>>
>>     <value>file:///mnt/hadoopData/hadoop/hdfs/namenode</value>
>>
>> </property>
>>
>> <property>
>>
>>   <name>dfs.data.dir</name>
>>
>>     <value>file:///mnt/hadoopData/hadoop/hdfs/datanode</value>
>>
>> </property>
>>
>> <property>
>>
>> <name>dfs.nameservices</name>
>>
>> <value>hdfsCluster</value>
>>
>> </property>
>>
>> <property>
>>
>>   <name>dfs.ha.namenodes.hdfsCluster</name>
>>
>>   <value>nn1,nn2</value>
>>
>> </property>
>>
>>
>>
>> <property>
>>
>>   <name>dfs.namenode.rpc-address.hdfsCluster.nn1</name>
>>
>>   <value>node1:8020</value>
>>
>> </property>
>>
>> <property>
>>
>>   <name>dfs.namenode.rpc-address.hdfsCluster.nn2</name>
>>
>>   <value>node22:8020</value>
>>
>> </property>
>>
>>
>>
>> <property>
>>
>>   <name>dfs.namenode.http-address.hdfsCluster.nn1</name>
>>
>>   <value>node1:50070</value>
>>
>> </property>
>>
>> <property>
>>
>>   <name>dfs.namenode.http-address.hdfsCluster.nn2</name>
>>
>>   <value>node2:50070</value>
>>
>> </property>
>>
>>
>>
>> <property>
>>
>>   <name>dfs.namenode.shared.edits.dir</name>
>>
>>   <value>qjournal://node1:8485;node2:8485;node3:8485;node4:8485;node5:8485/hdfsCluster</value>
>>
>> </property>
>>
>> <property>
>>
>>   <name>dfs.client.failover.proxy.provider.hdfsCluster</name>
>>
>>   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>>
>> </property>
>>
>> <property>
>>
>>    <name>ha.zookeeper.quorum</name>
>>
>>    <value>node1:2181,node2:2181,node3:2181,node4:2181,node5:2181</value>
>>
>> </property>
>>
>> <property>
>>
>> <name>dfs.ha.fencing.methods</name>
>>
>> <value>sshfence</value>
>>
>> </property>
>>
>> <property>
>>
>> <name>dfs.ha.fencing.ssh.private-key-files</name>
>>
>> <value>/home/hadoop/.ssh/id_rsa</value>
>>
>> </property>
>>
>> <property>
>>
>>    <name>dfs.ha.automatic-failover.enabled</name>
>>
>>    <value>true</value>
>>
>> </property>
>>
>> </configuration>
>>
>>
>>
>>
>>
>> *From:* Ravi Prakash [mailto:ravihadoop@gmail.com]
>> *Sent:* 22 June 2017 02:38
>> *To:* omprakash <omprakashp@cdac.in>
>> *Cc:* user <user@hadoop.apache.org>
>> *Subject:* Re: Lots of warning messages and exception in namenode logs
>>
>>
>>
>> Hi Omprakash!
>>
>> What is your default replication set to? What kind of disks do your
>> datanodes have? Were you able to start a cluster with a simple
>> configuration before you started tuning it?
>>
>> HDFS tries to create the default number of replicas for a block on
>> different datanodes. The Namenode tries to give a list of datanodes that
>> the client can write replicas of the block to. If the Namenode is not able
>> to construct a list with an adequate number of datanodes, you will see the
>> message you are seeing. This may mean that datanodes are unhealthy (failed
>> disks), full (disks have no more space), being decommissioned (HDFS will
>> not write replicas on decommissioning datanodes), or misconfigured (I'd
>> suggest turning on storage classes only after a simple configuration works).
>>
>> When a client that was trying to write a file was killed (e.g. if you
>> killed your MR job), after some time (hard limit expiring) the Namenode
>> will try to recover the file. In your case the namenode is also not able to
>> find enough datanodes for recovering the files.
>>
>>
>>
>> HTH
>>
>> Ravi
>>
>>
>>
>>
>>
>> On Tue, Jun 20, 2017 at 11:50 PM, omprakash <omprakashp@cdac.in> wrote:
>>
>> Hi,
>>
>>
>>
>> I am receiving lots of *warning messages in the namenode logs* on the ACTIVE
>> NN in my *HA Hadoop setup*. Below are the logs:
>>
>>
>>
>> *“2017-06-21 12:11:26,523 WARN
>> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough
>> replicas: expected size is 1 but only 0 storage types can be selected
>> (replication=2, selected=[], unavailable=[DISK], removed=[DISK],
>> policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[],
>> replicationFallbacks=[ARCHIVE]})*
>>
>> *2017-06-21 12:11:26,523 WARN
>> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed
>> to place enough replicas, still in need of 1 to reach 2
>> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7,
>> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]},
>> newBlock=true) All required storage types are unavailable:
>> unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7,
>> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}*
>>
>> *2017-06-21 12:11:26,523 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>> allocate blk_1073894332_153508, replicas=192.168.9.174:50010 for
>> /36962._COPYING_*
>>
>> *2017-06-21 12:11:26,810 INFO org.apache.hadoop.hdfs.StateChange: DIR*
>> completeFile: /36962._COPYING_ is closed by
>> DFSClient_NONMAPREDUCE_146762699_1*
>>
>> *2017-06-21 12:11:30,626 WARN
>> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed
>> to place enough replicas, still in need of 1 to reach 2
>> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7,
>> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]},
>> newBlock=true) For more information, please enable DEBUG log level on
>> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and
>> org.apache.hadoop.net.NetworkTopology*
>>
>> *2017-06-21 12:11:30,626 WARN
>> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough
>> replicas: expected size is 1 but only 0 storage types can be selected
>> (replication=2, selected=[], unavailable=[DISK], removed=[DISK],
>> policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[],
>> replicationFallbacks=[ARCHIVE]})”*
>>
>>
>>
>> I am also encountering exceptions in active namenode related to
>> LeaseManager
>>
>>
>>
>> *2017-06-21 12:13:16,706 INFO
>> org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease.  Holder:
>> DFSClient_NONMAPREDUCE_409197282_362092, pending creates: 1] has expired
>> hard limit*
>>
>> *2017-06-21 12:13:16,706 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease.
>> Holder: DFSClient_NONMAPREDUCE_409197282_362092, pending creates: 1],
>> src=/user/hadoop/2106201707/02d5adda-d90f-47cb-85d5-999a079f4d79*
>>
>> *2017-06-21 12:13:16,706 WARN org.apache.hadoop.hdfs.StateChange: DIR*
>> NameSystem.internalReleaseLease: Failed to release lease for file
>> /user/hadoop/2106201707/02d5adda-d90f-47cb-85d5-999a079f4d79.
>> Committed blocks are waiting to be minimally replicated. Try again later.*
>>
>> *2017-06-21 12:13:16,706 ERROR
>> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Cannot release the
>> path /user/hadoop/2106201707/02d5adda-d90f-47cb-85d5-999a079f4d79
>> in the lease [Lease.  Holder: DFSClient_NONMAPREDUCE_409197282_362092,
>> pending creates: 1]*
>>
>> *org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR*
>> NameSystem.internalReleaseLease: Failed to release lease for file
>> /user/hadoop/2106201707/02d5adda-d90f-47cb-85d5-999a079f4d79.
>> Committed blocks are waiting to be minimally replicated. Try again later.*
>>
>> *        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3200)*
>>
>> *        at org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:383)*
>>
>> *        at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:329)*
>>
>> *        at java.lang.Thread.run(Thread.java:745)*
>>
>>
>>
>> I have checked the two datanodes. Both are running and have enough space
>> for new data.
>>
>>
>>
>> *PS: I have 2 namenodes and 2 datanodes in the Hadoop HA setup. HA is
>> set up using the Quorum Journal Manager and a ZooKeeper server.*
>>
>>
>>
>> Any idea why these errors occur?
>>
>>
>>
>> *Regards*
>>
>> *Omprakash Paliwal*
>>
>> HPC-Medical and Bioinformatics Applications Group
>>
>> Centre for Development of Advanced Computing (C-DAC)
>>
>> Pune University campus,
>>
>> PUNE-411007
>>
>> Maharashtra, India
>>
>> email:omprakashp@cdac.in
>>
>> Contact : +91-20-25704231
>>
>>
>>
>>
>> ------------------------------------------------------------
>> -------------------------------------------------------------------
>> [ C-DAC is on Social-Media too. Kindly follow us at:
>> Facebook: https://www.facebook.com/CDACINDIA & Twitter: @cdacindia ]
>>
>> This e-mail is for the sole use of the intended recipient(s) and may
>> contain confidential and privileged information. If you are not the
>> intended recipient, please contact the sender by reply e-mail and destroy
>> all copies and the original message. Any unauthorized review, use,
>> disclosure, dissemination, forwarding, printing or copying of this email
>> is strictly prohibited and appropriate legal action will be taken.
>> ------------------------------------------------------------
>> -------------------------------------------------------------------
>>
>
>


-- 
Philippe Kernévez



Technical Director (Switzerland),
pkernevez@octo.com
+41 79 888 33 32

Find OCTO on OCTO Talk: http://blog.octo.com
OCTO Technology http://www.octo.ch
