hbase-user mailing list archives

From "Jorn Argelo - Ephorus" <Jorn.Arg...@ephorus.com>
Subject RE: CopyTable to remote cluster runs OK but doesn't copy anything
Date Thu, 08 Dec 2011 12:40:12 GMT
Hi all,

To follow up on this:
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication exhibits
exactly the same behaviour as CopyTable.

Jorn

-----Original message-----
From: Jorn Argelo - Ephorus [mailto:Jorn.Argelo@ephorus.com]
Sent: Thursday, December 8, 2011 9:59
To: user@hbase.apache.org
Subject: RE: CopyTable to remote cluster runs OK but doesn't copy anything

Hi Jon / J-D,

Yeah, I had a bunch of additional stuff in my classpath which we needed
for other M/R jobs:
/etc/zookeeper:/etc/hadoop-0.20/conf:/usr/lib/hadoop-0.20/*:/usr/lib/hadoop-0.20/lib/*:/usr/lib/zookeeper/*:/usr/lib/zookeeper/lib/*

I tried just removing /etc/zookeeper from the classpath but then I still
had the same result. After removing that whole line from the classpath I
ended up with a working CopyTable. I could see that the MapReduce job
was now caching jars in /tmp which it didn't do before.
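A quick way to sanity-check this is to expand the classpath entry by entry and rebuild it without the ZooKeeper config directory. A minimal sketch, using the classpath string quoted above (the variable names are just for illustration):

```shell
# The problematic classpath from hadoop-env.sh (reproduced from above)
CP='/etc/zookeeper:/etc/hadoop-0.20/conf:/usr/lib/hadoop-0.20/*:/usr/lib/hadoop-0.20/lib/*:/usr/lib/zookeeper/*:/usr/lib/zookeeper/lib/*'

# List the entries one per line and flag anything that could carry a zoo.cfg
echo "$CP" | tr ':' '\n' | grep -n 'zookeeper'

# Rebuild the classpath without the /etc/zookeeper config directory;
# the jar directories are kept, only the config dir is dropped
FILTERED=$(echo "$CP" | tr ':' '\n' | grep -v '^/etc/zookeeper$' | paste -sd: -)
echo "$FILTERED"
```

This only removes the config directory that supplies zoo.cfg; the ZooKeeper jars themselves stay on the classpath.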

Maybe it's worthwhile to add this info to HBASE-4614? Let me know if
there's any way I can assist with testing.

Thanks a lot for your support.

Jorn

-----Original message-----
From: Jonathan Hsieh [mailto:jon@cloudera.com]
Sent: Wednesday, December 7, 2011 19:09
To: user@hbase.apache.org
Subject: Re: CopyTable to remote cluster runs OK but doesn't copy anything

Jorn,

I recently ran into this problem.  CopyTable is actually copying the
data back into the same table on the source cluster, likely because an
HBase client in the MR job is picking up its settings from a zoo.cfg
file.

Have you added `hbase classpath` to your hadoop-env.sh file?  Can you
check if zoo.cfg (possibly as /etc/zookeeper/* in CDH) is in the
classpath of the task trackers?

If it is, you may want to remove it from there and then add the ZK
settings to your hbase-site.xml file.
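For reference, a minimal hbase-site.xml fragment pinning the ZK settings might look like this (the quorum hostnames are placeholders, not a confirmed setup):

```xml
<configuration>
  <!-- ZooKeeper ensemble the HBase client should talk to; takes the
       place of whatever a stray zoo.cfg on the classpath would supply -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk01,zk02,zk03</value>
  </property>
  <!-- Client port of the ensemble (2181 is the default) -->
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```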

Jon.

On Wed, Dec 7, 2011 at 9:31 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> It would most likely be this bug:
> https://issues.apache.org/jira/browse/HBASE-4614
>
> On Wed, Dec 7, 2011 at 12:27 AM, Jorn Argelo - Ephorus
> <Jorn.Argelo@ephorus.com> wrote:
> > Hi all,
> >
> >
> >
> > I'm trying to copy a table from one cluster to another cluster but this
> > does not seem to do what I expect it to do. The Map/Reduce job runs
> > successfully as you can see below, but it's not actually copying
> > anything to the remote cluster. It almost looks as if it's not parsing
> > the --peer.adr option and just copies the data inside the same cluster.
> > At least, the "WARN mapred.JobClient: Use GenericOptionsParser for
> > parsing the arguments. Applications should implement Tool for the same"
> > warning would suggest that.
> >
> >
> >
> > Both clusters are running CDH3U1 and are both fully distributed,
> > although hbase-test1 is a single physical server running all components
> > for a fully distributed setup. The source cluster where I am running the
> > job from is a small 10 node cluster. Note that on hbase-test1 the target
> > table already exists with the same column families as in the source
> > cluster.
> >
> >
> >
> > Does anybody have any idea what I'm doing wrong? Or maybe I found a bug?
> > There's another guy at stackoverflow reporting the same issue
> > (http://stackoverflow.com/questions/7952213/how-to-copy-a-table-from-one-hbase-cluster-to-another-cluster)
> > but nobody responded on that.
> >
> >
> >
> > Thanks,
> >
> > Jorn
> >
> >
> >
> >
> >
> > $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=hbase-test1:2181:/hbase chunk
> >
> > 11/12/07 08:52:24 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-cdh3u1--1, built on 07/18/2011 16:48 GMT
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:host.name=namenode1
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_26
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.26/jre
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:java.class.path=<lots of jars, snipped out to prevent spam>
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/lib/hbase/bin/../lib/native/Linux-amd64-64:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.32-33-server
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:user.name=mapred
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:user.home=/usr/lib/hadoop-0.20
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client environment:user.dir=/usr/lib/hadoop-0.20
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=hbase-test1:2181 sessionTimeout=10000 watcher=hconnection
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Opening socket connection to server hbase-test1/10.30.10.10:2181
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Socket connection established to hbase-test1/10.30.10.10:2181, initiating session
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Session establishment complete on server hbase-test1/10.30.10.10:2181, sessionid = 0x134126b2d250040, negotiated timeout = 10000
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@105691e; hsa=hbase-test1:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is hbase-test1:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at row=chunk,,00000000000000 for max=10 rows
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,,1323181597686.f527a21a31a39559a2f4cbd034d286a7. is hbase-test1:60020
> >
> > 11/12/07 08:52:25 INFO mapreduce.TableOutputFormat: Created table instance for chunk
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=z01:2181,zk02:2181,zk03:2181 sessionTimeout=10000 watcher=hconnection
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Opening socket connection to server zk02/10.30.4.93:2181
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Socket connection established to zk02/10.30.4.93:2181, initiating session
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Session establishment complete on server zk02/10.30.4.93:2181, sessionid = 0x233922a0b320c81, negotiated timeout = 10000
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@3c3a1834; hsa=datanode1:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is datanode1:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at row=chunk,,00000000000000 for max=10 rows
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,,1323179990451.a243a485325744b9eedd8da2106712b6. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,array_for_lithiumion_battery_anode_material,1323179990451.6161771ecadc7a45acd28afbfca88a09. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,cytotoxic_elements_were_leached_from_the,1323179991855.5cf99f0425d9e0e9fd41ac9645d65b93. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,generation_cephalosporin_just_before_biopsy,1323179991855.db0d0df2c2e076bf01c75ae6ac200436. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,linked_with_the_2008_great_wenchuan,1323179964329.32ee9e359e50582bf1b419396c9aa8ad. is datanode2:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,pca_with_gabor_decomposition_offered_several,1323179964329.ed0b5bec77229df3b6a5a08c117db355. is datanode2:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,see_34\xE2\x80\x93_43however_when_many_partial,1323179993570.cf10c9b3a05b1b26437c674af2b61cfc. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for chunk,the_wheelsground_contact_between_vehicle_tire,1323179993570.ee5baffa6cdce4ced50c6a9c10beca75. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at row=chunk,,00000000000000 for max=2147483647 rows
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 0 -> datanode3:,array_for_lithiumion_battery_anode_material
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 1 -> datanode3:array_for_lithiumion_battery_anode_material,cytotoxic_elements_were_leached_from_the
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 2 -> datanode3:cytotoxic_elements_were_leached_from_the,generation_cephalosporin_just_before_biopsy
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 3 -> datanode3:generation_cephalosporin_just_before_biopsy,linked_with_the_2008_great_wenchuan
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 4 -> datanode2:linked_with_the_2008_great_wenchuan,pca_with_gabor_decomposition_offered_several
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 5 -> datanode2:pca_with_gabor_decomposition_offered_several,see_34\xE2\x80\x93_43however_when_many_partial
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 6 -> datanode3:see_34\xE2\x80\x93_43however_when_many_partial,the_wheelsground_contact_between_vehicle_tire
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 7 -> datanode3:the_wheelsground_contact_between_vehicle_tire,
> >
> > 11/12/07 08:52:25 INFO mapred.JobClient: Running job: job_201111021158_0026
> >
> > 11/12/07 08:52:26 INFO mapred.JobClient:  map 0% reduce 0%
> >
> > 11/12/07 08:56:22 INFO mapred.JobClient:  map 12% reduce 0%
> >
> > 11/12/07 08:56:53 INFO mapred.JobClient:  map 25% reduce 0%
> >
> > 11/12/07 08:59:05 INFO mapred.JobClient:  map 37% reduce 0%
> >
> > 11/12/07 08:59:51 INFO mapred.JobClient:  map 50% reduce 0%
> >
> > 11/12/07 09:00:31 INFO mapred.JobClient:  map 62% reduce 0%
> >
> > 11/12/07 09:00:35 INFO mapred.JobClient:  map 75% reduce 0%
> >
> > 11/12/07 09:00:43 INFO mapred.JobClient:  map 87% reduce 0%
> >
> > 11/12/07 09:01:02 INFO mapred.JobClient:  map 100% reduce 0%
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient: Job complete: job_201111021158_0026
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient: Counters: 13
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   Job Counters
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=3306288
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Rack-local map tasks=8
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Launched map tasks=13
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Data-local map tasks=5
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   FileSystemCounters
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     HDFS_BYTES_READ=1254
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=523502
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   Map-Reduce Framework
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Map input records=26892941
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Spilled Records=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Map output records=26892941
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1254
> >
> >
> >
> >
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com
