Subject: Re: Best way to migrate PB scale data between live cluster?
Date: Tue, 12 Apr 2016 11:14:37 +0100
From: cs user
To: raymond
Cc: user@hadoop.apache.org

Hi there,

At some point in the near future we are also going to require exactly what
you describe. We had hoped to use distcp.

You mentioned:

1. it does not handle data deletes

distcp has a -delete flag which says: "Delete the files existing in the dst
but not in src". Does this not help with handling deleted data?

I believe there is an issue if data is removed during a distcp run: at the
start of the run it captures all the files it needs to sync, and if some of
those files are deleted while the run is in progress, it may fail with
errors. Is there a way to ignore these errors and have distcp retry on the
next run?

I'd be interested in how you eventually accomplish the syncing between the
two clusters, because we need to solve the very same problem :-)

Perhaps others on the mailing list have experience with this?

Thanks!

On Tue, Apr 12, 2016 at 10:44 AM, raymond <rgbbones@163.com> wrote:
> Hi
>
> We have a Hadoop cluster holding several PB of data, and we need to
> migrate it to a new cluster in another datacenter for larger volume
> capacity. We estimate that the data copy itself might take nearly a month
> to finish, so we are looking for a sound solution. The requirements are:
>
> 1. We cannot bring down the old cluster for such a long time (of course);
> a couple of hours of downtime is acceptable.
> 2. We need to mirror the data: not only copy the new data, but also
> delete, on the new cluster, data that was deleted on the old cluster
> during the migration period.
> 3. We don't have much space left on the old cluster, say 30% free.
>
> Regarding distcp: although it might be the easiest way,
>
> 1. it does not handle data deletes;
> 2. it handles a newly appended file by comparing file sizes and
> overwriting the whole file (which might waste a lot of bandwidth);
> 3. its per-file error handling is weak;
> 4. load control is difficult (we still have a heavy workload on the old
> cluster); you can only try to split the work manually into pieces small
> enough to achieve the flow-control goal.
>
> In one word: for a long-running mirroring job, it won't do well by
> itself.
>
> Some possible approaches:
>
> We can:
>
> 1. Do some wrapper work around distcp to make it work better (error
> handling, result checking, extra code to sync deleted files, etc.).
> 2. Utilize the snapshot mechanism to better identify files that need to
> be copied, deleted, or renamed.
>
> Or:
>
> 1. Forget about distcp. Use the FSIMAGE and edit log as a change-history
> source, and write our own code to replay the operations, handling each
> file one by one (better per-file error handling could be achieved), but
> this might need a lot of dev work.
>
> Btw, the closest thing I could find is Facebook's migration of a 30PB
> Hive warehouse:
>
> https://www.facebook.com/notes/facebook-engineering/moving-an-elephant-large-scale-hadoop-data-migration-at-facebook/10150246275318920/
>
> They modified distcp to do an initial bulk load (to better handle large
> files and very small files, for load balancing I guess), plus a
> replication system (not much detail on this part) to mirror the changes.
>
> But it is not clear how they handle the shortcomings of distcp I
> mentioned above, or whether they utilize the snapshot mechanism.
>
> So, does anyone have experience with this kind of work? What do you think
> might be the best approach for our case? Is there any ready-made work we
> can utilize? Has any work been done around the snapshot mechanism to ease
> data migration?
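[Editor's note: the -delete and load-control points above can be combined in
a single distcp invocation. A minimal sketch; the cluster hostnames, ports,
paths, and throttle values are placeholders, not taken from the thread:]

```shell
# Mirror /data to the new cluster:
#   -update    copy only files that are new or changed on the source
#   -delete    remove files on the target that no longer exist on the
#              source (only valid together with -update or -overwrite)
#   -m         cap the number of concurrent map tasks
#   -bandwidth cap per-map bandwidth in MB/s, to limit load on the
#              old cluster
hadoop distcp \
  -update -delete \
  -m 20 -bandwidth 10 \
  hdfs://old-cluster:8020/data \
  hdfs://new-cluster:8020/data
```

This still rescans both namespaces each run and retries failed files only on
the next pass, so it addresses points 1 and 4 but not points 2 and 3.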
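[Editor's note: on the snapshot idea, distcp in Hadoop 2.7+ can copy only
the delta between two HDFS snapshots, which also propagates deletes and
renames. A sketch under the assumption that the target already holds a copy
consistent with snapshot s1 and has not been modified since; names and
paths are placeholders:]

```shell
# One-time: enable snapshots on the source directory and take a baseline.
hdfs dfsadmin -allowSnapshot /data
hdfs dfs -createSnapshot /data s1

# ... do the initial bulk copy of /data (as of s1) to the new cluster,
#     then snapshot the target as s1 too ...

# Each sync round: take a new snapshot and ship only the difference.
hdfs dfs -createSnapshot /data s2
hdfs snapshotDiff /data s1 s2    # optional: inspect the change set first
hadoop distcp -update -diff s1 s2 \
  hdfs://old-cluster:8020/data \
  hdfs://new-cluster:8020/data
```

Because the copy set is computed from the snapshot diff rather than a full
namespace scan, this avoids recopying unchanged files and handles deletes,
which may cover much of what the wrapper in option 1 would otherwise do.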