Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
MIME-Version: 1.0
References: 
 <CABFJMzP-Ana7u4TFuHq7T7b8RNU1MazXu6ZvisLT4a6nkmxBdQ@mail.gmail.com>
 <CAMs9kVhaeOW70eTjdqn_eVWHpRQFFBQLn6NH6_Cw5bW_NJx-1A@mail.gmail.com>
 <CABFJMzPK4Be2ty8f6cN-2bVigWi_z6a8Aj3zDEMyB41gxO_BSQ@mail.gmail.com>
 <CAMs9kViuRY2TODXgTSDZ+1ik_yxVR6Jni8_nxmFN=d-VHzaHTg@mail.gmail.com>
 <CABFJMzNPEMeT_FEAr_MpoSPjTYMS18HZJwUnw6y9KMkuzF8xTQ@mail.gmail.com>
In-Reply-To: 
 <CABFJMzNPEMeT_FEAr_MpoSPjTYMS18HZJwUnw6y9KMkuzF8xTQ@mail.gmail.com>
From: Marco Reis <ma@marcoreis.net>
Date: Thu, 24 Mar 2016 13:03:23 +0000
Message-ID: 
 <CAGfP+eX_tE8dMmOT-7ikvbLwXPD3XdpgC8TnizWFRcWLcTvUQQ@mail.gmail.com>
Subject: unsubscribe
Cc: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=001a1143e542e2f1a6052ecb1241

--001a1143e542e2f1a6052ecb1241
Content-Type: text/plain; charset=UTF-8

On Thu, Mar 24, 2016 at 5:16 AM Chathuri Wimalasena <kamalasini@gmail.com>
wrote:

> Hi Ravi,
>
> Thank you for all the information. Our application indexes Twitter data
> into HBase and then runs data analytics on top of it. That's why the
> HDFS data is very important to us. We cannot tolerate any data loss
> during the upgrade. Do you remember how long it took you to upgrade from
> 2.4.1 to 2.7.1?
>
> Thanks,
> Chathuri
>
> On Wed, Mar 23, 2016 at 7:09 PM, Ravi Prakash <ravihadoop@gmail.com>
> wrote:
>
>> Hi Chathuri!
>>
>> Technically there is a rollback option during upgrade. I don't know how
>> well it has been tested, but the idea is that old metadata is not deleted
>> until the cluster administrator runs $ hdfs dfsadmin -finalizeUpgrade. I'm
>> fairly confident that the HDFS upgrade will work smoothly. We have upgraded
>> quite a few Hadoop-2.4.1 clusters to Hadoop-2.7.1 successfully (never
>> having to roll back). It's your applications that work on top of HDFS and
>> YARN that I'd be concerned about.
>>
>> HTH
>> Ravi
>>
>> On Wed, Mar 23, 2016 at 2:22 PM, Chathuri Wimalasena <
>> kamalasini@gmail.com> wrote:
>>
>>> Thanks for the information, Ravi. Is there a way I can back up the data
>>> before the upgrade? I was thinking about this approach:
>>>
>>> Copy the current Hadoop directories to a new set of directories.
>>> Point Hadoop to this new set.
>>> Start the migration with the backup set.
>>>
>>> Please let me know if people have done this upgrade successfully. I
>>> believe many things can go wrong in a lengthy upgrade like this. The data
>>> in the cluster is very important.
>>> Thanks,
>>> Chathuri
>>>
>>> On Wed, Mar 23, 2016 at 4:37 PM, Ravi Prakash <ravihadoop@gmail.com>
>>> wrote:
>>>
>>>> Hi Chathuri!
>>>>
>>>>    - When we upgrade, does it change the namenode data structures and
>>>>    data nodes? I assume it only changes the name node...
>>>>
>>>> It changes the NN as well as DN layout. As a matter of fact, this
>>>> upgrade will take a long time on Datanodes as well because of
>>>> https://issues.apache.org/jira/browse/HDFS-6482
>>>>
>>>>    - What are the risks with this upgrade?
>>>>
>>>> What Hadoop applications do you run on top of your cluster? The hope is
>>>> that everything continues working smoothly for the most part, but
>>>> inevitably some backward-incompatible changes creep in.
>>>>
>>>>    - Is there a place where I can review the changes made to file
>>>>    system from 2.5.1 to 2.7.2?
>>>>
>>>> The release notes: http://hadoop.apache.org/releases.html
>>>> You'd have to accumulate the changes across all the intervening versions.
>>>>
>>>> Practically, I'd try running your application on your upgraded test
>>>> cluster.
>>>>
>>>> HTH
>>>>
>>>> Ravi
>>>>
>>>> On Wed, Mar 23, 2016 at 12:17 PM, Chathuri Wimalasena <
>>>> kamalasini@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We have a Hadoop production deployment with 1 name node and 10 data
>>>>> nodes, holding more than 20 TB of data in HDFS. We are currently using
>>>>> Hadoop 2.5.1 and want to upgrade to the latest Hadoop version, 2.7.2.
>>>>>
>>>>> I followed this guide (
>>>>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html)
>>>>> and upgraded a single-node system running in pseudo-distributed mode; it
>>>>> went without any issues. But that system did not hold nearly as much data
>>>>> as the production system.
>>>>>
>>>>> Since this is a production system, I'm reluctant to do this upgrade. I
>>>>> would like to hear what other people have done in these cases and about
>>>>> their experiences. Here are a few questions I have:
>>>>>
>>>>>    - When we upgrade, does it change the namenode data structures and
>>>>>    data nodes? I assume it only changes the name node...
>>>>>    - What are the risks with this upgrade?
>>>>>    - Is there a place where I can review the changes made to file
>>>>>    system from 2.5.1 to 2.7.2?
>>>>>
>>>>> I would really appreciate it if you could share your experiences.
>>>>>
>>>>> Thanks in advance,
>>>>> Chathuri
>>>>>
>>>>
>>>>
>>>
>>
>
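
For reference, the backup and rollback points discussed in the thread above
can be sketched as a small dry-run planner. This is only an illustrative
sketch under stated assumptions, not a tested upgrade procedure: it prints
commands instead of running them, the function names are hypothetical, and
/backup/nn-meta is a made-up local directory. Note that -fetchImage copies
only the NameNode's fsimage (metadata), not the block data on the DataNodes,
and that -rollback assumes the cluster has been stopped and the old release
put back in place.

```python
"""Dry-run planner for an HDFS upgrade with a rollback window (illustrative)."""
import shlex


def metadata_backup_steps(backup_dir):
    """Checkpoint the namespace and copy the latest fsimage to backup_dir.

    backup_dir is a hypothetical local directory on the admin host.
    """
    return [
        ["hdfs", "dfsadmin", "-safemode", "enter"],
        ["hdfs", "dfsadmin", "-saveNamespace"],
        ["hdfs", "dfsadmin", "-fetchImage", backup_dir],
        ["hdfs", "dfsadmin", "-safemode", "leave"],
    ]


def post_upgrade_steps(applications_verified):
    """Finalize only once the applications are verified; otherwise roll back.

    Until -finalizeUpgrade is issued, the pre-upgrade metadata is kept,
    which is what makes a rollback possible.
    """
    if applications_verified:
        return [["hdfs", "dfsadmin", "-finalizeUpgrade"]]
    # Assumption: cluster stopped and the old release reinstalled first.
    return [["hdfs", "namenode", "-rollback"]]


def render(steps):
    """Format the planned commands as shell-safe lines, one per command."""
    return "\n".join(shlex.join(cmd) for cmd in steps)


if __name__ == "__main__":
    print(render(metadata_backup_steps("/backup/nn-meta")))
    print(render(post_upgrade_steps(applications_verified=False)))
```

Keeping the finalize step behind an explicit "applications verified" flag
mirrors the advice in the thread: the HDFS upgrade itself is expected to go
smoothly, so the decision to give up the rollback option should wait until
the applications on top of HDFS and YARN have been checked.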

--001a1143e542e2f1a6052ecb1241--