hbase-user mailing list archives

From Jonathan Hsieh <...@cloudera.com>
Subject Re: CopyTable to remote cluster runs OK but doesn't copy anything
Date Wed, 07 Dec 2011 18:08:32 GMT
Jorn,

I recently ran into this problem.  CopyTable is actually copying the data
back into the same instance of the table, most likely because an HBase
client in the MR job is picking up its settings from a zoo.cfg file.

Have you added `hbase classpath` to your hadoop-env.sh file?  Can you check
whether zoo.cfg (possibly as /etc/zookeeper/* in CDH) is on the classpath of
the task trackers?
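
One rough way to run that check, as a sketch only: the helper name
find_zoo_cfg is made up for illustration, and it only looks at plain
directory entries on the classpath (a zoo.cfg packed inside a jar would not
be found this way).

```shell
# Hypothetical helper: print the path of every zoo.cfg that sits directly in
# a directory entry of a colon-separated classpath string.
find_zoo_cfg() {
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
    # Only plain directories expose loose files such as zoo.cfg as resources.
    if [ -d "$entry" ] && [ -f "$entry/zoo.cfg" ]; then
      printf '%s\n' "$entry/zoo.cfg"
    fi
  done
}

# Usage on a task tracker host with HBase installed:
#   find_zoo_cfg "$(hbase classpath)"
```

No output means no directory on that classpath holds a zoo.cfg.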

If it is, you may want to remove it from there and then add the ZK settings
to your hbase-site.xml file.
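
As an illustration of which ZK settings are meant, the sketch below prints
the corresponding <property> elements to paste inside <configuration> in
hbase-site.xml. The helper name zk_property and the host names are
placeholders, not values from this thread; the two property names are the
standard HBase ones for the quorum and client port.

```shell
# Hypothetical helper: print one hbase-site.xml <property> element
# for the given name ($1) and value ($2).
zk_property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Placeholder values: substitute the source cluster's own quorum hosts and port.
zk_property hbase.zookeeper.quorum "zk-a.example.com,zk-b.example.com,zk-c.example.com"
zk_property hbase.zookeeper.property.clientPort 2181
```

These are the same first two fields that --peer.adr takes for the remote
cluster (quorum:port:znode parent).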

Jon.

On Wed, Dec 7, 2011 at 9:31 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> It would most likely be this bug:
> https://issues.apache.org/jira/browse/HBASE-4614
>
> On Wed, Dec 7, 2011 at 12:27 AM, Jorn Argelo - Ephorus
> <Jorn.Argelo@ephorus.com> wrote:
> > Hi all,
> >
> >
> >
> > I'm trying to copy a table from one cluster to another cluster but this
> > does not seem to do what I expect it to do. The Map/Reduce job runs
> > successfully as you can see below, but it's not actually copying
> > anything to the remote cluster. It almost looks as if it's not parsing
> > the --peer.adr option and just copies the data inside the same cluster.
> > At least, the "WARN mapred.JobClient: Use GenericOptionsParser for
> > parsing the arguments. Applications should implement Tool for the same"
> > warning would suggest that.
> >
> >
> >
> > Both clusters are running CDH3U1 and are both fully distributed,
> > although hbase-test1 is a single physical server running all components
> > for a fully distributed setup. The source cluster where I am running the
> > job from is a small 10 node cluster. Note that on hbase-test1 the target
> > table already exists with the same column families as in the source
> > cluster.
> >
> >
> >
> > Does anybody have any idea what I'm doing wrong? Or maybe I found a bug?
> > There's another guy at stackoverflow reporting the same issue
> > (http://stackoverflow.com/questions/7952213/how-to-copy-a-table-from-one-hbase-cluster-to-another-cluster)
> > but nobody responded on that.
> >
> >
> >
> > Thanks,
> >
> > Jorn
> >
> >
> >
> >
> >
> > $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable
> > --peer.adr=hbase-test1:2181:/hbase chunk
> >
> > 11/12/07 08:52:24 WARN mapred.JobClient: Use GenericOptionsParser for
> > parsing the arguments. Applications should implement Tool for the same.
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:zookeeper.version=3.3.3-cdh3u1--1, built on 07/18/2011 16:48
> > GMT
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:host.name=namenode1
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.version=1.6.0_26
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.vendor=Sun Microsystems Inc.
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.26/jre
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.class.path=<lots of jars, snipped out to prevent spam>
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.library.path=/usr/lib/hbase/bin/../lib/native/Linux-amd
> > 64-64:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.io.tmpdir=/tmp
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.compiler=<NA>
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:os.name=Linux
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:os.arch=amd64
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:os.version=2.6.32-33-server
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:user.name=mapred
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:user.home=/usr/lib/hadoop-0.20
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:user.dir=/usr/lib/hadoop-0.20
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Initiating client
> > connection, connectString=hbase-test1:2181 sessionTimeout=10000
> > watcher=hconnection
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Opening socket connection
> > to server hbase-test1/10.30.10.10:2181
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Socket connection
> > established to hbase-test1/10.30.10.10:2181, initiating session
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Session establishment
> > complete on server hbase-test1/10.30.10.10:2181, sessionid =
> > 0x134126b2d250040, negotiated timeout = 10000
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Lookedup root
> > region location,
> > connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnection
> > Implementation@105691e; hsa=hbase-test1:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > .META.,,1.1028785192 is hbase-test1:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at
> > row=chunk,,00000000000000 for max=10 rows
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,,1323181597686.f527a21a31a39559a2f4cbd034d286a7. is
> > hbase-test1:60020
> >
> > 11/12/07 08:52:25 INFO mapreduce.TableOutputFormat: Created table
> > instance for chunk
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Initiating client
> > connection, connectString=z01:2181,zk02:2181,zk03:2181
> > sessionTimeout=10000 watcher=hconnection
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Opening socket connection
> > to server zk02/10.30.4.93:2181
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Socket connection
> > established to zk02/10.30.4.93:2181, initiating session
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Session establishment
> > complete on server zzk02/10.30.4.93:2181, sessionid = 0x233922a0b320c81,
> > negotiated timeout = 10000
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Lookedup root
> > region location,
> > connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnection
> > Implementation@3c3a1834; hsa=datanode1:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > .META.,,1.1028785192 is datanode1:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at
> > row=chunk,,00000000000000 for max=10 rows
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,,1323179990451.a243a485325744b9eedd8da2106712b6. is
> > datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,array_for_lithiumion_battery_anode_material,1323179990451.6161771e
> > cadc7a45acd28afbfca88a09. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,cytotoxic_elements_were_leached_from_the,1323179991855.5cf99f0425d
> > 9e0e9fd41ac9645d65b93. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,generation_cephalosporin_just_before_biopsy,1323179991855.db0d0df2
> > c2e076bf01c75ae6ac200436. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,linked_with_the_2008_great_wenchuan,1323179964329.32ee9e359e50582b
> > f1b419396c9aa8ad. is datanode2:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,pca_with_gabor_decomposition_offered_several,1323179964329.ed0b5be
> > c77229df3b6a5a08c117db355. is datanode2:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,see_34\xE2\x80\x93_43however_when_many_partial,1323179993570.cf10c
> > 9b3a05b1b26437c674af2b61cfc. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location for
> > chunk,the_wheelsground_contact_between_vehicle_tire,1323179993570.ee5baf
> > fa6cdce4ced50c6a9c10beca75. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at
> > row=chunk,,00000000000000 for max=2147483647 rows
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
> > -> 0 -> datanode3:,array_for_lithiumion_battery_anode_material
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
> > -> 1 ->
> > datanode3:array_for_lithiumion_battery_anode_material,cytotoxic_elements
> > _were_leached_from_the
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
> > -> 2 ->
> > datanode3:cytotoxic_elements_were_leached_from_the,generation_cephalospo
> > rin_just_before_biopsy
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
> > -> 3 ->
> > datanode3:generation_cephalosporin_just_before_biopsy,linked_with_the_20
> > 08_great_wenchuan
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
> > -> 4 ->
> > datanode2:linked_with_the_2008_great_wenchuan,pca_with_gabor_decompositi
> > on_offered_several
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
> > -> 5 ->
> > datanode2:pca_with_gabor_decomposition_offered_several,see_34\xE2\x80\x9
> > 3_43however_when_many_partial
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
> > -> 6 ->
> > datanode3:see_34\xE2\x80\x93_43however_when_many_partial,the_wheelsgroun
> > d_contact_between_vehicle_tire
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
> > -> 7 -> datanode3:the_wheelsground_contact_between_vehicle_tire,
> >
> > 11/12/07 08:52:25 INFO mapred.JobClient: Running job:
> > job_201111021158_0026
> >
> > 11/12/07 08:52:26 INFO mapred.JobClient:  map 0% reduce 0%
> >
> > 11/12/07 08:56:22 INFO mapred.JobClient:  map 12% reduce 0%
> >
> > 11/12/07 08:56:53 INFO mapred.JobClient:  map 25% reduce 0%
> >
> > 11/12/07 08:59:05 INFO mapred.JobClient:  map 37% reduce 0%
> >
> > 11/12/07 08:59:51 INFO mapred.JobClient:  map 50% reduce 0%
> >
> > 11/12/07 09:00:31 INFO mapred.JobClient:  map 62% reduce 0%
> >
> > 11/12/07 09:00:35 INFO mapred.JobClient:  map 75% reduce 0%
> >
> > 11/12/07 09:00:43 INFO mapred.JobClient:  map 87% reduce 0%
> >
> > 11/12/07 09:01:02 INFO mapred.JobClient:  map 100% reduce 0%
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient: Job complete:
> > job_201111021158_0026
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient: Counters: 13
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   Job Counters
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=3306288
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all
> > reduces waiting after reserving slots (ms)=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all
> > maps waiting after reserving slots (ms)=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Rack-local map tasks=8
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Launched map tasks=13
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Data-local map tasks=5
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   FileSystemCounters
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     HDFS_BYTES_READ=1254
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=523502
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   Map-Reduce Framework
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Map input records=26892941
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Spilled Records=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Map output records=26892941
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1254
> >
> >
> >
> >
> >
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com
