hadoop-common-user mailing list archives

From Adam Faris <afa...@linkedin.com>
Subject Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3
Date Mon, 07 May 2012 14:37:00 GMT
Hi Austin,

I don't know about using CDH3, but we use distcp for moving data between different versions
of Apache grids, and several things come to mind.

1) you should use the -i flag to ignore checksum differences on the blocks.  I'm not 100%
sure, but I want to say hftp doesn't support checksums on the blocks as they go across the wire.

2) you should read from hftp but write to hdfs.  Also make sure to check your port numbers.
  For example, I can read from hftp on port 50070 and write to hdfs on port 9000.  You'll find
the hftp port in hdfs-site.xml and the hdfs port in core-site.xml on Apache releases.
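
A hedged sketch of where those two settings live (property names as in the 0.20-era Apache configs; the hostnames and ports below are just the example values from the command at the end of this mail, so adjust them to your clusters):

```xml
<!-- hdfs-site.xml: the namenode HTTP address that the hftp:// source URL uses -->
<property>
  <name>dfs.http.address</name>
  <value>mynamenode.grid.one:50070</value>
</property>

<!-- core-site.xml: the filesystem URI that the hdfs:// destination URL uses -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://mynamenode.grid.two:9000</value>
</property>
```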

3) Do you have security (kerberos) enabled on 0.20.205? Does CDH3 support security?  If security
is enabled on 0.20.205 and CDH3 does not support security, you will need to disable security
on 0.20.205.  This is because you are unable to write from a secure grid to an unsecured grid.

4) use the -m flag to limit your mappers so you don't DDOS your network backbone.   

5) why isn't your vendor helping you with the data migration? :)

Otherwise something like this should get you going.

hadoop distcp -i -ppgu -log /tmp/mylog -m 20 hftp://mynamenode.grid.one:50070/path/to/my/src/data \
hdfs://mynamenode.grid.two:9000/path/to/my/dst

-- Adam

On May 7, 2012, at 4:29 AM, Nitin Pawar wrote:

> things to check
> 
> 1) when you launch distcp jobs, all the datanodes of the older hdfs are live and
> connected
> 2) when you launch distcp, no data is being written/moved/deleted in hdfs
> 3) you can use the -log option to log errors into a directory, and use -i to
> ignore errors
> 
> also you can try using distcp with the hdfs protocol instead of hftp  ... for
> more you can refer to
> https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/d0d99ad9f1554edd
> 
> 
> 
> if it failed, there should be some error
> On Mon, May 7, 2012 at 4:44 PM, Austin Chungath <austincv@gmail.com> wrote:
> 
>> ok that was a lame mistake.
>> $ hadoop distcp hftp://localhost:50070/tmp hftp://localhost:60070/tmp_copy
>> I had spelled hdfs instead of "hftp"
>> 
>> $ hadoop distcp hftp://localhost:50070/docs/index.html hftp://localhost:60070/user/hadoop
>> 12/05/07 16:38:09 INFO tools.DistCp: srcPaths=[hftp://localhost:50070/docs/index.html]
>> 12/05/07 16:38:09 INFO tools.DistCp: destPath=hftp://localhost:60070/user/hadoop
>> With failures, global counters are inaccurate; consider running with -i
>> Copy failed: java.io.IOException: Not supported
>> at org.apache.hadoop.hdfs.HftpFileSystem.delete(HftpFileSystem.java:457)
>> at org.apache.hadoop.tools.DistCp.fullyDelete(DistCp.java:963)
>> at org.apache.hadoop.tools.DistCp.copy(DistCp.java:672)
>> at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>> at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>> 
>> Any idea why this error is coming?
>> I am copying one file from 0.20.205 (/docs/index.html ) to cdh3u3
>> (/user/hadoop)
>> 
>> Thanks & Regards,
>> Austin
>> 
>> On Mon, May 7, 2012 at 3:57 PM, Austin Chungath <austincv@gmail.com>
>> wrote:
>> 
>>> Thanks,
>>> 
>>> So I decided to try and move using distcp.
>>> 
>>> $ hadoop distcp hdfs://localhost:54310/tmp hdfs://localhost:8021/tmp_copy
>>> 12/05/07 14:57:38 INFO tools.DistCp: srcPaths=[hdfs://localhost:54310/tmp]
>>> 12/05/07 14:57:38 INFO tools.DistCp: destPath=hdfs://localhost:8021/tmp_copy
>>> With failures, global counters are inaccurate; consider running with -i
>>> Copy failed: org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol
>>> org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch.
>>> (client = 63, server = 61)
>>> 
>>> I found that we can do distcp like above only if both are of the same
>>> hadoop version.
>>> so I tried:
>>> 
>>> $ hadoop distcp hftp://localhost:50070/tmp hdfs://localhost:60070/tmp_copy
>>> 12/05/07 15:02:44 INFO tools.DistCp: srcPaths=[hftp://localhost:50070/tmp]
>>> 12/05/07 15:02:44 INFO tools.DistCp: destPath=hdfs://localhost:60070/tmp_copy
>>> 
>>> But this process seemed to hang at this stage. What might I be doing
>>> wrong?
>>> 
>>> hftp://<dfs.http.address>/<path>
>>> hftp://localhost:50070 is dfs.http.address of 0.20.205
>>> hdfs://localhost:60070 is dfs.http.address of cdh3u3
>>> 
>>> Thanks and regards,
>>> Austin
>>> 
>>> 
>>> On Fri, May 4, 2012 at 4:30 AM, Michel Segel <michael_segel@hotmail.com>
>>> wrote:
>>> 
>>>> Ok... So riddle me this...
>>>> I currently have a replication factor of 3.
>>>> I reset it to two.
>>>> 
>>>> What do you have to do to get the replication factor of 3 down to 2?
>>>> Do I just try to rebalance the nodes?
>>>> 
>>>> The point is that you are looking at a very small cluster.
>>>> You may want to start the new cluster with a replication factor of 2 and
>>>> then when the data is moved over, increase it to a factor of 3. Or maybe
>>>> not.
>>>> 
>>>> I do a distcp to copy the data, and after each distcp, I do an fsck for a
>>>> sanity check and then remove the files I copied. As I gain more room, I
>>>> can then slowly drop nodes, do an fsck, rebalance, and then repeat.
>>>> 
>>>> Even though this is a dev cluster, the OP wants to retain the data.
>>>> 
>>>> There are other options depending on the amount and size of new hardware.
>>>> I mean, make one machine a RAID 5 machine, copy data to it, clearing off
>>>> the cluster.
>>>> 
>>>> If 8TB was the amount of disk used, that would be about 2.67 TB of actual
>>>> data. Let's say 3TB. Going RAID 5, how much disk is that? So you could
>>>> fit it on one machine, depending on hardware, or maybe 2 machines... Now
>>>> you can rebuild the initial cluster and then move the data back. Then
>>>> rebuild those machines. Lots of options... ;-)
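
Mike's back-of-the-envelope numbers can be checked with a quick sketch (the 8 TB of raw usage and replication factor 3 are from the thread; the per-machine disk count and disk size are hypothetical placeholders, not from the thread):

```shell
# 8 TB of raw HDFS usage at replication factor 3 -> actual data volume
USED_RAW_TB=8
REPLICATION=3
# integer math scaled by 100: 800 / 3 = 266, i.e. ~2.66 TB of actual data
ACTUAL_TB_X100=$(( USED_RAW_TB * 100 / REPLICATION ))

# RAID 5 usable space on one machine: (number of disks - 1) * disk size
DISKS=4      # hypothetical: 4 disks in the machine
DISK_TB=1    # hypothetical: 1 TB per disk
RAID5_USABLE_TB=$(( (DISKS - 1) * DISK_TB ))

echo "actual data (TB x100): ${ACTUAL_TB_X100}, RAID 5 usable: ${RAID5_USABLE_TB} TB"
```

So with those placeholder disks, one RAID 5 machine holds about 3 TB, which is roughly the actual data volume Mike estimates.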
>>>> 
>>>> Sent from a remote device. Please excuse any typos...
>>>> 
>>>> Mike Segel
>>>> 
>>>> On May 3, 2012, at 11:26 AM, Suresh Srinivas <suresh@hortonworks.com>
>>>> wrote:
>>>> 
>>>>> This probably is a more relevant question for the CDH mailing lists. That
>>>>> said, what Edward is suggesting seems reasonable. Reduce the replication
>>>>> factor, decommission some of the nodes, create a new cluster with those
>>>>> nodes, and do distcp.
>>>>> 
>>>>> Could you share with us the reasons you want to migrate from Apache 205?
>>>>> 
>>>>> Regards,
>>>>> Suresh
>>>>> 
>>>>> On Thu, May 3, 2012 at 8:25 AM, Edward Capriolo <edlinuxguru@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Honestly that is a hassle; going from 205 to cdh3u3 is probably more
>>>>>> of a cross-grade than an upgrade or downgrade. I would just stick it
>>>>>> out. But yes, like Michael said, two clusters on the same gear and
>>>>>> distcp. If you are using RF=3 you could also lower your replication to
>>>>>> RF=2 ('hadoop dfs -setrep -R 2 /') to clear headroom as you are moving
>>>>>> stuff.
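
The headroom gained by dropping the replication factor is easy to estimate: raw HDFS usage scales linearly with the replication factor, so going from RF=3 to RF=2 frees a third of it. A quick sketch using the 8 TB raw-usage figure from earlier in the thread:

```shell
USED_RAW_TB=8   # raw HDFS usage at RF=3, as stated earlier in the thread
# new raw usage after lowering replication to 2: old * 2/3
# (scaled by 100 for integer math: 1600 / 3 = 533, i.e. ~5.33 TB)
NEW_RAW_TB_X100=$(( USED_RAW_TB * 2 * 100 / 3 ))
# headroom freed: old - new, i.e. ~2.67 TB
FREED_TB_X100=$(( USED_RAW_TB * 100 - NEW_RAW_TB_X100 ))

echo "new raw usage (TB x100): ${NEW_RAW_TB_X100}, freed (TB x100): ${FREED_TB_X100}"
```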
>>>>>> 
>>>>>> 
>>>>>> On Thu, May 3, 2012 at 7:25 AM, Michel Segel <michael_segel@hotmail.com>
>>>>>> wrote:
>>>>>>> Ok... When you get your new hardware...
>>>>>>> 
>>>>>>> Set up one server as your new NN, JT, SN.
>>>>>>> Set up the others as a DN.
>>>>>>> (Cloudera CDH3u3)
>>>>>>> 
>>>>>>> On your existing cluster...
>>>>>>> Remove your old log files, temp files on HDFS, anything you don't need.
>>>>>>> This should give you some more space.
>>>>>>> Start copying some of the directories/files to the new cluster.
>>>>>>> As you gain space, decommission a node, rebalance, add the node to the
>>>>>>> new cluster...
>>>>>>> 
>>>>>>> It's a slow process.
>>>>>>> 
>>>>>>> Should I remind you to make sure you up your bandwidth setting, and to
>>>>>>> clean up the hdfs directories when you repurpose the nodes?
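
For the bandwidth setting Mike mentions, a hedged sketch (the property name is from the 0.20-era hdfs-default.xml, so verify it against your release; the 10 MB/s value is an arbitrary example):

```xml
<!-- hdfs-site.xml: max bandwidth each datanode may use for rebalancing,
     in bytes per second (here ~10 MB/s) -->
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>10485760</value>
</property>
```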
>>>>>>> 
>>>>>>> Does this make sense?
>>>>>>> 
>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>> 
>>>>>>> Mike Segel
>>>>>>> 
>>>>>>> On May 3, 2012, at 5:46 AM, Austin Chungath <austincv@gmail.com>
>>>> wrote:
>>>>>>> 
>>>>>>>> Yeah I know :-)
>>>>>>>> and this is not a production cluster ;-) and yes there is more
>>>>>>>> hardware coming :-)
>>>>>>>> 
>>>>>>>> On Thu, May 3, 2012 at 4:10 PM, Michel Segel <michael_segel@hotmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Well, you've kind of painted yourself into a corner...
>>>>>>>>> Not sure why you didn't get a response from the Cloudera lists, but
>>>>>>>>> it's a generic question...
>>>>>>>>> 
>>>>>>>>> 8 out of 10 TB. Are you talking effective storage or actual disks?
>>>>>>>>> And please tell me you've already ordered more hardware.. Right?
>>>>>>>>> 
>>>>>>>>> And please tell me this isn't your production cluster...
>>>>>>>>> 
>>>>>>>>> (Strong hint to Strata and Cloudera... You really want to accept my
>>>>>>>>> upcoming proposal talk... ;-)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>>>> 
>>>>>>>>> Mike Segel
>>>>>>>>> 
>>>>>>>>> On May 3, 2012, at 5:25 AM, Austin Chungath <austincv@gmail.com>
>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Yes. This was first posted on the Cloudera mailing list. There were
>>>>>>>>>> no responses.
>>>>>>>>>> 
>>>>>>>>>> But this is not related to Cloudera as such.
>>>>>>>>>> 
>>>>>>>>>> cdh3 is based on apache hadoop 0.20 as the base. My data is in
>>>>>>>>>> apache hadoop 0.20.205.
>>>>>>>>>> 
>>>>>>>>>> There is an upgrade namenode option when we are migrating to a
>>>>>>>>>> higher version, say from 0.20 to 0.20.205,
>>>>>>>>>> but here I am downgrading from 0.20.205 to 0.20 (cdh3).
>>>>>>>>>> Is this possible?
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi <prash1784@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Seems like a matter of upgrade. I am not a Cloudera user, so I
>>>>>>>>>>> would not know much, but you might find some help moving this to
>>>>>>>>>>> the Cloudera mailing list.
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, May 3, 2012 at 2:51 AM, Austin Chungath <austincv@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> There is only one cluster. I am not copying between clusters.
>>>>>>>>>>>> 
>>>>>>>>>>>> Say I have a cluster running apache 0.20.205 with 10 TB of storage
>>>>>>>>>>>> capacity and about 8 TB of data.
>>>>>>>>>>>> Now how can I migrate the same cluster to use cdh3 and keep that
>>>>>>>>>>>> same 8 TB of data?
>>>>>>>>>>>> 
>>>>>>>>>>>> I can't copy 8 TB of data using distcp because I have only 2 TB of
>>>>>>>>>>>> free space.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar <nitinpawar432@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> you can actually look at the distcp
>>>>>>>>>>>>> 
>>>>>>>>>>>>> http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
>>>>>>>>>>>>> 
>>>>>>>>>>>>> but this means that you have two different sets of clusters
>>>>>>>>>>>>> available to do the migration
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Thu, May 3, 2012 at 12:51 PM, Austin Chungath <austincv@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the suggestions,
>>>>>>>>>>>>>> My concern is that I can't actually copyToLocal from the dfs
>>>>>>>>>>>>>> because the data is huge.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Say if my hadoop was 0.20 and I am upgrading to 0.20.205, I can
>>>>>>>>>>>>>> do a namenode upgrade. I don't have to copy data out of dfs.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> But here I have Apache hadoop 0.20.205 and I want to use CDH3
>>>>>>>>>>>>>> now, which is based on 0.20.
>>>>>>>>>>>>>> Now it is actually a downgrade, as 0.20.205's namenode info has
>>>>>>>>>>>>>> to be used by 0.20's namenode.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Any idea how I can achieve what I am trying to do?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, May 3, 2012 at 12:23 PM, Nitin Pawar <nitinpawar432@gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> i can think of following options
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 1) write a simple get and put code which gets the data from
>>>>>>>>>>>>>>> DFS and loads it into dfs
>>>>>>>>>>>>>>> 2) see if distcp between both versions is compatible
>>>>>>>>>>>>>>> 3) this is what I had done (and my data was hardly a few
>>>>>>>>>>>>>>> hundred GB) .. did a dfs -copyToLocal and then in the new grid
>>>>>>>>>>>>>>> did a copyFromLocal
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Thu, May 3, 2012 at 11:41 AM, Austin Chungath <austincv@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>> I am migrating from Apache hadoop 0.20.205 to CDH3u3.
>>>>>>>>>>>>>>>> I don't want to lose the data that is in the HDFS of Apache
>>>>>>>>>>>>>>>> hadoop 0.20.205.
>>>>>>>>>>>>>>>> How do I migrate to CDH3u3 but keep the data that I have on
>>>>>>>>>>>>>>>> 0.20.205?
>>>>>>>>>>>>>>>> What are the best practices/techniques to do this?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks & Regards,
>>>>>>>>>>>>>>>> Austin
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Nitin Pawar

