From: "Kartashov, Andy"
To: user@hadoop.apache.org
Subject: RE: steps to fix data block corruption after server failure
Date: Thu, 20 Dec 2012 17:28:56 +0000

Tadas,

One time I remember disconnecting a bunch of DataNodes from my dev cluster instead of using the required, more elegant "exclude" approach.
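
For what it is worth, the "exclude" route I should have taken looks roughly like this on a 1.x-era cluster (the file path and hostname below are only examples; the exclude file is whatever dfs.hosts.exclude in hdfs-site.xml points to):

  # hdfs-site.xml on the NameNode: point dfs.hosts.exclude at a local file, e.g.
  #   <property>
  #     <name>dfs.hosts.exclude</name>
  #     <value>/etc/hadoop/conf/dfs.exclude</value>
  #   </property>

  # list the DataNodes to retire, one hostname per line
  echo "dn05.example.com" >> /etc/hadoop/conf/dfs.exclude

  # tell the NameNode to re-read its include/exclude lists
  hadoop dfsadmin -refreshNodes

  # watch the report until the nodes move from "Decommission in progress" to "Decommissioned"
  hadoop dfsadmin -report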

The next thing I learned was that my FS was corrupted. I did not care about my data (I could re-import it again), but my NN metadata was messed up, so what worked for me was to -delete those corrupted files using the "hadoop fsck" command.
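
In case it helps, the sequence I ran was roughly the following (I pointed fsck at "/", but you can run it against just the affected subtree):

  # health summary; the report ends with HEALTHY or CORRUPT and counts the missing/corrupt blocks
  hadoop fsck /

  # show which files own the bad blocks before deciding to drop them
  hadoop fsck / -files -blocks -locations

  # only once the data is confirmed expendable: permanently remove the corrupted files
  hadoop fsck / -delete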

Nonetheless, I would like to join you on the question: if neither the -move nor the -delete option works, is re-formatting the NN the only way to resolve corrupted metadata inside the Hadoop FS?
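
My understanding, worth double-checking against the fsck usage text for your version, is that -move does not repair anything either; it only salvages whatever blocks are still readable:

  # move what remains of the corrupted files into /lost+found instead of deleting them
  hadoop fsck / -move

  # re-formatting really is the last resort: it wipes the entire namespace, not just the bad files
  # hadoop namenode -format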

From: Tadas Makčinskas [mailto:Tadas.Makcinskas@bdc.lt]
Sent: Thursday, December 20, 2012 9:42 AM
To: user@hadoop.apache.org
Subject: steps to fix data block corruption after server failure

We have a situation here. Some of our servers went away for a while. As we attached them back to the cluster, it appeared that as a result we have multiple Missing/Corrupt blocks and some Mis-replicated blocks.
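
(One approach that gets suggested for mis-replicated or under-replicated blocks, sketched here assuming the normal replication factor is 3 and /data is the affected subtree, is to nudge the NameNode into re-placing blocks by temporarily raising the replication factor:)

  # create extra, properly placed replicas; -w waits until replication completes
  hadoop fs -setrep -w 4 /data

  # drop back to the normal factor; the NameNode prunes the badly placed copies
  hadoop fs -setrep -w 3 /data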

Can't figure out how to solve the issue of restoring the system to a normal working state. Can't figure out either a nice way to remove those corrupted files or a way to restore them. All of these files are in the following folders:

   /user/<user>/.Trash
   /user/<user>/.staging
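
(Since both locations normally hold only transient data, trash contents and job staging files, one option, assuming nothing in them still matters, is simply to remove them outright and let them be recreated:)

  # bypass the trash so the corrupted files are not just moved around again
  hadoop fs -rmr -skipTrash /user/<user>/.Trash
  hadoop fs -rmr -skipTrash /user/<user>/.staging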

What steps would be advised in order to solve our issue?

Thanks, Tadas
