hbase-user mailing list archives

From Matteo Bertozzi <theo.berto...@gmail.com>
Subject Re: ExportSnapshot very slow. bug?
Date Fri, 08 Nov 2013 19:06:55 GMT
The first copy doesn't resolve the links, so you're only copying empty files.
The actual data copy happens only in "step 2", with the MR job.
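
As a rough illustration of why that first copy moves no data, here is a minimal Python sketch (not HBase code). It assumes the snapshot directory holds only zero-length placeholder files standing in for the links; the directory layout and file names are made up for the example. A plain recursive copy reproduces the tree but transfers zero bytes:

```python
import os
import shutil
import tempfile


def tree_size(root):
    """Return (file_count, total_bytes) for every file under root."""
    count = total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            count += 1
            total += os.path.getsize(os.path.join(dirpath, name))
    return count, total


def demo():
    base = tempfile.mkdtemp()
    try:
        # Hypothetical snapshot layout: empty files that only name the
        # real store files, without containing any of their data.
        src_root = os.path.join(base, "fs1", "snapshot")
        cf_dir = os.path.join(src_root, "region-a", "cf")
        os.makedirs(cf_dir)
        for name in ("hfile-1", "hfile-2", "hfile-3"):
            open(os.path.join(cf_dir, name), "w").close()

        # "Step 1": a recursive copy that does not resolve the links.
        dst_root = os.path.join(base, "fs2", "snapshot")
        shutil.copytree(src_root, dst_root)
        return tree_size(dst_root)
    finally:
        shutil.rmtree(base)


print(demo())  # (3, 0): three files copied, zero bytes of data
```

Only a later step that resolves each placeholder to the file it names would have real bytes to move.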

Matteo



On Fri, Nov 8, 2013 at 10:54 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:

> Hello all.  I'm trying out the ExportSnapshot tool and it is extremely
> slow.  I took a look at the code and I think I know why.
>
>
> https://github.com/cloudera/hbase/blob/cdh4-0.94.6_4.4.0/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#L635
>
> In step 1 it is for some reason copying from fs1 to fs2.  This basically
> means that in a single-threaded process we are copying an entire HBase table to
> another cluster.  I can understand wanting to copy from fs1 to fs1 (i.e.
> different path on same fs), so as to dereference all the soft links of the
> snapshots.  But why between filesystems?
>
> In step 2 you finally do the MR job, which makes much more sense, but as
> far as I can tell all of the files would already exist, as FileUtil.copy
> just does a recursive copy of all paths in a tree.
>
> Am I missing something?  I appreciate any input.
>
> - Bryan
>
