From: "Kartashov, Andy" <Andy.Kartashov@mpac.ca>
To: user@hadoop.apache.org
Subject: RE: a question on NameNode
Date: Mon, 19 Nov 2012 15:14:44 +0000

Thank you, Kai. One more question, please.

 

Does MapReduce run tasks on redundant (replicated) blocks?

 

Say you have only 1 block of data, replicated 3 times, with one copy on each of three DataNodes: block 1 on DN1, replica #1 on DN2, and replica #2 on DN3.

 

Will MR attempt:

 

a.  to start 3 Map tasks (one per replica) and execute them all

b.  to start 3 Map tasks (one per replica) and drop the other two as soon as one of the three completes successfully

c.  to start only 1 Map task (for just one replica, ignoring the others) and attempt to start another task (on one of the other replicas) if and only if the initially running task (say, on DN1) fails (see the sketch below)
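
For what it's worth, a quick way to check is to ask the InputFormat for its splits: as far as I know, FileInputFormat creates one split (and hence one map task) per HDFS block, not per replica, and the hosts of all replicas are attached to that single split only as locality hints; if an attempt fails, the framework reschedules another attempt of the same task, preferably on a node holding another replica. A minimal sketch, assuming a Hadoop 2.x-era client on the classpath and a hypothetical input path /user/andy/A.txt:

    import java.util.Arrays;
    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    public class SplitInspector {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration());
            // Hypothetical path; replace with a real file in your cluster.
            FileInputFormat.addInputPath(job, new Path("/user/andy/A.txt"));
            // One InputSplit (and so one map task) is created per block,
            // regardless of how many replicas of that block exist.
            List<InputSplit> splits = new TextInputFormat().getSplits(job);
            for (InputSplit split : splits) {
                // getLocations() lists the DataNodes holding replicas of the
                // block -- these are scheduling hints, not extra tasks.
                System.out.println(split + " on " + Arrays.toString(split.getLocations()));
            }
        }
    }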

 

Thanks,

 

From: Kai Voigt [mailto:k@123.org]
Sent: Monday, November 19, 2012 10:01 AM
To: user@hadoop.apache.org
Subject: Re: a question on NameNode

 

 

On 19.11.2012 at 15:43, "Kartashov, Andy" <Andy.Kartashov@mpac.ca> wrote:



So, what if DN2 is down, i.e. it is not sending any block reports? Then the NN (I guess) will figure out that it has 2 blocks (3 and 4) that have no home and that (without replication) it has no way of reconstructing the file A.txt. It must report an error then.
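
A minimal sketch of how to inspect this, assuming a reachable cluster and the same hypothetical path /user/andy/A.txt, is to ask the NameNode which DataNodes it currently knows for each block of the file; its answer comes from the block map it builds out of the DataNodes' block reports:

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocations {
        public static void main(String[] args) throws Exception {
            // Uses whatever core-site.xml/hdfs-site.xml is on the classpath.
            FileSystem fs = FileSystem.get(new Configuration());
            Path file = new Path("/user/andy/A.txt");   // hypothetical path
            FileStatus status = fs.getFileStatus(file);
            // The NameNode answers from its in-memory block map, which is
            // populated by the DataNodes' block reports.
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (int i = 0; i < blocks.length; i++) {
                System.out.println("block " + i + " -> " + Arrays.toString(blocks[i].getHosts()));
            }
        }
    }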

 

One major feature of HDFS is its redundancy. Blocks are stored more than once (three times by default), so chances are good that another DataNode will have that block and report it during the safe mode phase. So the file will be accessible.
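
A minimal sketch of checking (and raising) a file's replication factor through the FileSystem API, again with a hypothetical path; note that dfs.replication (default 3) only sets the default for newly written files, while setReplication() changes the target for an existing file:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationCheck {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path file = new Path("/user/andy/A.txt");   // hypothetical path
            short current = fs.getFileStatus(file).getReplication();
            System.out.println(file + " is currently replicated " + current + "x");
            // Ask the NameNode to keep one more copy; re-replication
            // happens in the background.
            fs.setReplication(file, (short) (current + 1));
        }
    }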

 

Kai

 

-- 

Kai Voigt

 



 

NOTICE: This e-mail message and any attachments are confidential, subject to copyright and may be privileged. Any unauthorized use, copying or disclosure is prohibited. If you are not the intended recipient, please delete and contact the sender immediately. Please consider the environment before printing this e-mail.