hadoop-user mailing list archives

From "mohit.kaushik" <mohit.kaus...@orkash.com>
Subject Re: Hadoop logs show "Not scanning suspicious block"
Date Thu, 06 Aug 2015 07:12:39 GMT
It is in production. What might cause a namespace ID mapping problem? One 
of my datanodes was down for some time. I cannot afford a 
format at this point.
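
For reference, an ID mismatch can be confirmed without a format by 
comparing the VERSION files on the namenode and the restarted datanode (a 
minimal check; the paths below are examples only, substitute your own 
dfs.namenode.name.dir and dfs.datanode.data.dir):

    # on the namenode: note the clusterID (and namespaceID) values
    cat /hadoop/name/current/VERSION
    # on the restarted datanode: the IDs should match the namenode's
    cat /hadoop/data/current/VERSION

If only the restarted datanode disagrees, correcting that one node's 
VERSION file (or clearing just that node's data directory and letting the 
blocks re-replicate) avoids touching the namenode.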

Thanks
Mohit Kaushik
On 08/06/2015 12:32 PM, Chinnappan Chandrasekaran wrote:
>
> Hi
>
> This might be a namespace ID mapping problem. You should remove the 
> .../data/..current/VERSION file from the data location (if multi-node, 
> remove it on every node) and format your namenode.
>
> Note: if it is production, please think carefully before doing this.
>
> Regards
>
> chiranchandra
>
> *From:* mohit.kaushik [mailto:mohit.kaushik@orkash.com]
> *Sent:* Thursday, 6 August, 2015 2:59 PM
> *To:* user@hadoop.apache.org
> *Subject:* Hadoop logs show "Not scanning suspicious block"
>
>
>
> -------- Forwarded Message --------
>
> *Subject:* Re: Problem during compacting a table
>
> *Date:* Wed, 05 Aug 2015 11:24:28 -0400
>
> *From:* Josh Elser <josh.elser@gmail.com>
>
> *Reply-To:* user@accumulo.apache.org
>
> *To:* user@accumulo.apache.org
>
> I'm not really sure what that error message means without doing more
> digging. Copying your email to user@hadoop.apache.org might shed some
> light on what the error means if you want to try that.
>
> mohit.kaushik wrote:
> > These errors appear in the logs of the Hadoop namenode and slaves...
> >
> > *Namenode log*
> > 2015-08-05 12:05:14,518 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment
> > at 391508
> > 2015-08-05 12:05:14,664 INFO BlockStateChange: BLOCK* ask
> > 192.168.10.121:50010 to replicate blk_1073780327_39560 to datanode(s)
> > 192.168.10.122:50010
> > 2015-08-05 12:05:14,664 INFO BlockStateChange: BLOCK* ask
> > 192.168.10.121:50010 to replicate blk_1073780379_39612 to datanode(s)
> > 192.168.10.122:50010
> > 2015-08-05 12:05:24,621 INFO BlockStateChange: BLOCK* addStoredBlock:
> > blockMap updated: 192.168.10.122:50010 is added to blk_1073782847_42080
> > size 134217728
> > 2015-08-05 12:05:26,665 INFO BlockStateChange: BLOCK* ask
> > 192.168.10.121:50010 to replicate blk_1073780611_39844 to datanode(s)
> > 192.168.10.122:50010
> > 2015-08-05 12:05:27,232 INFO BlockStateChange: BLOCK* addStoredBlock:
> > blockMap updated: 192.168.10.122:50010 is added to blk_1073793941_53178
> > size 134217728
> > 2015-08-05 12:05:27,950 INFO BlockStateChange: BLOCK* addStoredBlock:
> > blockMap updated: 192.168.10.122:50010 is added to blk_1073783859_43092
> > size 134217728
> > 2015-08-05 12:05:28,798 INFO BlockStateChange: BLOCK* addStoredBlock:
> > blockMap updated: 192.168.10.122:50010 is added to blk_1073793387_52620
> > size 22496
> > 2015-08-05 12:05:29,666 INFO BlockStateChange: BLOCK* ask
> > 192.168.10.123:50010 to replicate blk_1073780678_39911 to datanode(s)
> > 192.168.10.121:50010
> > 2015-08-05 12:05:29,666 INFO BlockStateChange: BLOCK* ask
> > 192.168.10.121:50010 to replicate blk_1073780682_39915 to datanode(s)
> > 192.168.10.122:50010
> > 2015-08-05 12:05:32,002 INFO BlockStateChange: BLOCK* addStoredBlock:
> > blockMap updated: 192.168.10.122:50010 is added to
> > blk_1073796582_55826{UCState=UNDER_CONSTRUCTION, truncateBlock=null,
> > primaryNodeIndex=-1,
> > replicas=[ReplicaUC[[DISK]DS-896dada5-52c0-4a69-beed-dfbc5d437fc6:NORMAL:192.168.10.123:50010|RBW],
> > ReplicaUC[[DISK]DS-dd6d6a25-122f-4958-a20b-4ccb82f49f11:NORMAL:192.168.10.121:50010|RBW],
> > ReplicaUC[[DISK]DS-188489f9-89d3-40bd-9d20-9db358d644c9:NORMAL:192.168.10.122:50010|RBW]]}
> > size 0
> > 2015-08-05 12:05:32,072 INFO BlockStateChange: BLOCK* addStoredBlock:
> > blockMap updated: 192.168.10.121:50010 is added to
> > blk_1073796582_55826{UCState=UNDER_CONSTRUCTION, truncateBlock=null,
> > primaryNodeIndex=-1,
> > replicas=[ReplicaUC[[DISK]DS-896dada5-52c0-4a69-beed-dfbc5d437fc6:NORMAL:192.168.10.123:50010|RBW],
> > ReplicaUC[[DISK]DS-dd6d6a25-122f-4958-a20b-4ccb82f49f11:NORMAL:192.168.10.121:50010|RBW],
> > ReplicaUC[[DISK]DS-188489f9-89d3-40bd-9d20-9db358d644c9:NORMAL:192.168.10.122:50010|RBW]]}
> > size 0
> > 2015-08-05 12:05:32,129 INFO BlockStateChange: BLOCK* addStoredBlock:
> > blockMap updated: 192.168.10.123:50010 is added to
> > blk_1073796582_55826{UCState=UNDER_CONSTRUCTION, truncateBlock=null,
> > primaryNodeIndex=-1,
> > replicas=[ReplicaUC[[DISK]DS-896dada5-52c0-4a69-beed-dfbc5d437fc6:NORMAL:192.168.10.123:50010|RBW],
> > ReplicaUC[[DISK]DS-dd6d6a25-122f-4958-a20b-4ccb82f49f11:NORMAL:192.168.10.121:50010|RBW],
> > ReplicaUC[[DISK]DS-188489f9-89d3-40bd-9d20-9db358d644c9:NORMAL:192.168.10.122:50010|RBW]]}
> > size 0 ... and more
> >
> > *Slave log* (too many such lines)
> > k_1073794728_53972 on DS-896dada5-52c0-4a69-beed-dfbc5d437fc6, because
> > the block scanner is disabled.
> > 2015-08-05 11:50:30,438 INFO
> > org.apache.hadoop.hdfs.server.datanode.BlockScanner: Not scanning
> > suspicious block
> > BP-2102462487-192.168.10.124-1436956492274:blk_1073794738_53982 on
> > DS-896dada5-52c0-4a69-beed-dfbc5d437fc6, because the block scanner is
> > disabled.
> > 2015-08-05 11:50:31,024 INFO
> > org.apache.hadoop.hdfs.server.datanode.BlockScanner: Not scanning
> > suspicious block
> > BP-2102462487-192.168.10.124-1436956492274:blk_1073794728_53972 on
> > DS-896dada5-52c0-4a69-beed-dfbc5d437fc6, because the block scanner is
> > disabled.
> > 2015-08-05 11:50:31,027 INFO
> > org.apache.hadoop.hdfs.server.datanode.BlockScanner: Not scanning
> > suspicious block
> > BP-2102462487-192.168.10.124-1436956492274:blk_1073794738_53982 on
> > DS-896dada5-52c0-4a69-beed-dfbc5d437fc6, because the block scanner is
> > disabled.
> > 2015-08-05 11:50:31,095 INFO
> > org.apache.hadoop.hdfs.server.datanode.BlockScanner: Not scanning
> > suspicious block
> > BP-2102462487-192.168.10.124-1436956492274:blk_1073794740_53984 on
> > DS-896dada5-52c0-4a69-beed-dfbc5d437fc6, because the block scanner is
> > disabled.
> > 2015-08-05 11:50:31,105 INFO
> > org.apache.hadoop.hdfs.server.datanode.BlockScanner: Not scanning
> > suspicious block
> > BP-2102462487-192.168.10.124-1436956492274:blk_1073794740_53984 on
> > DS-896dada5-52c0-4a69-beed-dfbc5d437fc6, because the block scanner is
> > disabled.
> > 2015-08-05 11:50:31,136 INFO
> > org.apache.hadoop.hdfs.server.datanode.BlockScanner: Not scanning
> > suspicious block
> > BP-2102462487-192.168.10.124-1436956492274:blk_1073794740_53984 on
> > DS-896dada5-52c0-4a69-beed-dfbc5d437fc6, because the block scanner is
> > disabled.
> > 2015-08-05 11:50:31,136 INFO
> > org.apache.hadoop.hdfs.server.datanode.BlockScanner: Not scanning
> > suspicious block
> > BP-2102462487-192.168.10.124-1436956492274:blk_1073794740_53984 on
> > DS-896dada5-52c0-4a69-beed-dfbc5d437fc6, because the block scanner is
> > disabled.
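> >
> > (Side note: in Hadoop 2.7 the volume scanner introduced by HDFS-7430
> > is disabled whenever its throughput throttle is 0, which is what these
> > INFO lines report. A quick check, assuming that implementation:
> >
> >     hdfs getconf -confKey dfs.block.scanner.volume.bytes.per.second
> >
> > 0 means disabled; setting it to a positive value, e.g. the default
> > 1048576 (1 MB/s), in hdfs-site.xml and restarting the datanodes
> > re-enables periodic block verification.)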
> >
> >
> > I am using locality groups, so I *need* to compact tables. Please
> > explain how I can get rid of the suspicious blocks.
> >
> > Thanks
> >
> > On 08/05/2015 10:53 AM, mohit.kaushik wrote:
> >> Yes, one of my datanodes was down because its disk was detached for
> >> some time, and the tserver on that node was lost, but it's up and
> >> running again.
> >>
> >> fsck shows that the file system is healthy, but there are many
> >> messages reporting under-replicated blocks: my replication factor is
> >> 3, yet it shows the required count is 5.
> >>
> >> /user/root/.Trash/Current/accumulo/tables/+r/root_tablet/delete+A0000d29.rf+F0000d28.rf:
> >> Under replicated
> >> BP-2102462487-192.168.10.124-1436956492274:blk_1073796198_55442.
> >> Target Replicas is 5 but found 3 replica(s).
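> >>
> >> (Files in the Trash keep the replication they were created with, and
> >> Accumulo typically writes its root tablet files with a higher factor,
> >> so these messages are often harmless. If you want them gone, a sketch
> >> using the Trash path from the fsck output:
> >>
> >>     hdfs fsck /user/root/.Trash -files -blocks | grep -c "Under replicated"
> >>     hadoop fs -setrep -R -w 3 /user/root/.Trash
> >>
> >> -setrep resets the target replica count, and -w waits until
> >> replication settles.)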
> >>
> >> Thanks & Regards
> >> Mohit Kaushik
> >>
> >> On 08/04/2015 09:18 PM, John Vines wrote:
> >>> It looks like an HDFS issue. Did a datanode go down? Did you turn
> >>> replication down to 1? The combination of those two would definitely
> >>> cause the problems you're seeing, as the latter disables any sort of
> >>> robustness in the underlying filesystem.
> >>>
> >>> On Tue, Aug 4, 2015 at 8:10 AM mohit.kaushik
> >>> <mohit.kaushik@orkash.com> wrote:
> >>>
> >>>     On 08/04/2015 05:35 PM, mohit.kaushik wrote:
> >>>>     Hello All,
> >>>>
> >>>>     I am using Apache Accumulo-1.6.3 with Apache Hadoop-2.7.0 on a
> >>>>     3-node cluster. When I give the compact command from the shell,
> >>>>     it gives the following warning.
> >>>>
> >>>>     root@orkash testScan> compact -w
> >>>>     2015-08-04 17:10:52,702 [Shell.audit] INFO : root@orkash
> >>>>     testScan> compact -w
> >>>>     2015-08-04 17:10:52,706 [shell.Shell] INFO : Compacting table ...
> >>>>     2015-08-04 17:12:53,986 [impl.ThriftTransportPool] *WARN :
> >>>>     Thread "shell" stuck on IO  to orkash4:9999 (0) for at least
> >>>>     120034 ms*
> >>>>
> >>>>
> >>>>     The tablet servers show a problem regarding a data block, which
> >>>>     looks like HDFS-8659
> >>>>     <https://issues.apache.org/jira/browse/HDFS-8659>
> >>>>
> >>>>     2015-08-04 15:00:27,825 [hdfs.DFSClient] WARN : Failed to
> >>>>     connect to /192.168.10.121:50010
> >>>>     for block, add to deadNodes and continue. java.io.IOException:
> >>>>     Got error, status message opReadBlock
> >>>>     BP-2102462487-192.168.10.124-1436956492274:blk_1073780678_39911
> >>>>     received exception
> >>>>     org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException:
> >>>>     Replica not found for
> >>>>     BP-2102462487-192.168.10.124-1436956492274:blk_1073780678_39911,
> >>>>     for OP_READ_BLOCK, self=/192.168.10.121:38752,
> >>>>     remote=/192.168.10.121:50010, for file
> >>>>     /accumulo/tables/h/t-000016s/F000016t.rf, for pool
> >>>>     BP-2102462487-192.168.10.124-1436956492274 block 1073780678_39911
> >>>>     java.io.IOException: Got error, status message opReadBlock
> >>>>     BP-2102462487-192.168.10.124-1436956492274:blk_1073780678_39911
> >>>>     received exception
> >>>>     org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException:
> >>>>     Replica not found for
> >>>>     BP-2102462487-192.168.10.124-1436956492274:blk_1073780678_39911,
> >>>>     for OP_READ_BLOCK, self=/192.168.10.121:38752,
> >>>>     remote=/192.168.10.121:50010, for file
> >>>>     /accumulo/tables/h/t-000016s/F000016t.rf, for pool
> >>>>     BP-2102462487-192.168.10.124-1436956492274 block 1073780678_39911
> >>>>         at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
> >>>>         at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:456)
> >>>>         at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:424)
> >>>>         at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:814)
> >>>>         at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:693)
> >>>>         at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:352)
> >>>>         at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618)
> >>>>         at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
> >>>>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
> >>>>         at java.io.DataInputStream.read(DataInputStream.java:149)
> >>>>         at org.apache.accumulo.core.file.rfile.bcfile.BoundedRangeFileInputStream$1.run(BoundedRangeFileInputStream.java:104)
> >>>>         at org.apache.accumulo.core.file.rfile.bcfile.BoundedRangeFileInputStream$1.run(BoundedRangeFileInputStream.java:100)
> >>>>         at java.security.AccessController.doPrivileged(Native Method)
> >>>>         at org.apache.accumulo.core.file.rfile.bcfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:100)
> >>>>         at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:159)
> >>>>         at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:143)
> >>>>         at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
> >>>>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> >>>>         at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
> >>>>         at java.io.FilterInputStream.read(FilterInputStream.java:83)
> >>>>         at java.io.DataInputStream.readInt(DataInputStream.java:387)
> >>>>         at org.apache.accumulo.core.file.rfile.MultiLevelIndex$IndexBlock.readFields(MultiLevelIndex.java:269)
> >>>>         at org.apache.accumulo.core.file.rfile.MultiLevelIndex$Reader.getIndexBlock(MultiLevelIndex.java:724)
> >>>>         at org.apache.accumulo.core.file.rfile.MultiLevelIndex$Reader.access$100(MultiLevelIndex.java:497)
> >>>>         at org.apache.accumulo.core.file.rfile.MultiLevelIndex$Reader$Node.getNext(MultiLevelIndex.java:587)
> >>>>         at org.apache.accumulo.core.file.rfile.MultiLevelIndex$Reader$Node.getNextNode(MultiLevelIndex.java:593)
> >>>>         at org.apache.accumulo.core.file.rfile.MultiLevelIndex$Reader$IndexIterator.getNextNode(MultiLevelIndex.java:616)
> >>>>         at org.apache.accumulo.core.file.rfile.MultiLevelIndex$Reader$IndexIterator.next(MultiLevelIndex.java:659)
> >>>>         at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._next(RFile.java:559)
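> >>>>
> >>>>     One way to see whether HDFS can still serve the file named in
> >>>>     that exception (the path is taken from the log above):
> >>>>
> >>>>         hdfs fsck /accumulo/tables/h/t-000016s/F000016t.rf -files -blocks -locations
> >>>>
> >>>>     If the block is reported healthy on the other datanodes, the
> >>>>     client should recover by skipping the stale replica on
> >>>>     192.168.10.121, as the "add to deadNodes and continue" line
> >>>>     suggests.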
> >>>>
> >>>>     Regards
> >>>>     Mohit Kaushik
> >>>>
> >>>>
> >>>     And compaction never completes.
> >>>


-- 

*Mohit Kaushik*
Software Engineer
A Square, Plot No. 278, Udyog Vihar, Phase 2, Gurgaon 122016, India
*Tel:* +91 (124) 4969352 | *Fax:* +91 (124) 4033553


