hbase-issues mailing list archives

From "Xiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-16959) Export snapshot to local file system of a single node
Date Fri, 28 Oct 2016 03:18:58 GMT

     [ https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Xiang Li updated HBASE-16959:
-----------------------------
    Description: 
ExportSnapshot allows users to specify "file://" in "copy-to".
Because the implementation uses map jobs, it works as follows:
(1) The manifest of the snapshot (.hbase-snapshot) is exported to the local file system of
the HBase client node where the command is issued.
(2) The data of the snapshot (archive) is exported to the local file systems of the nodes
where the map tasks run, so the data ends up spread across the cluster.

*That causes 2 problems we have met so far:*
(1) The last step, which verifies the snapshot integrity, fails because not all of the data
can be found on the HBase client node where the command was issued. "-no-target-verify" can
be used to suppress the verification, but that is not a good solution.
(2) When the HBase client node (where the command is issued) is also a YARN NodeManager, and
it happens to run one of the map tasks (which write snapshot data), the "copy-to" directory
is first created by user=hbase when the manifest is written, and then user=yarn (if not
otherwise configured) tries to write data into it. If the directory permissions are not set
carefully, say umask = 022 with both hbase and yarn in the hadoop group, the "copy-to"
directory is created without group write permission (777 & ~022 = 755, i.e. rwxr-xr-x), so
user=yarn cannot write data into the directory created by user=hbase. We get the following
exception:
{code}
Error: java.io.IOException: Mkdirs failed to create file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info
(exists=false, cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_000004)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275)
	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193)
	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{code}
We can control the permissions to work around this, but that is not a good solution either.
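The permission arithmetic above can be checked directly. This is a minimal sketch (not ExportSnapshot code) showing how a umask of 022 strips group write permission from a directory requested with mode 777, which is why user=yarn, sharing only a group with user=hbase, cannot create files under the "copy-to" directory:

```java
// Sketch: how umask 022 yields the group-unwritable "copy-to" directory.
public class UmaskDemo {
    public static void main(String[] args) {
        int requested = 0777;                 // mode requested on mkdir
        int umask = 0022;                     // typical default umask
        int effective = requested & ~umask;   // bits cleared by the umask

        System.out.printf("effective mode = %o%n", effective);  // prints 755

        // Group write is the 020 bit; with mode 755 (rwxr-xr-x) it is unset,
        // so a different user in the same group cannot create files inside.
        boolean groupCanWrite = (effective & 0020) != 0;
        System.out.println("group can write: " + groupCanWrite); // prints false
    }
}
```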

*Proposal*
When exporting to "file://", add a reduce phase that aggregates all of the "distributed"
snapshot data onto the HBase client node where the command is issued, so that the data ends
up together with the manifest of the snapshot. That would resolve the verification problem
described in (1).
For problem (2), there is no solution so far.
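The aggregation step the proposal describes can be illustrated conceptually, outside Hadoop and with hypothetical paths: each map task leaves its portion of the archive in a scratch directory, and a single aggregation pass then gathers every portion under one target directory next to the manifest, so verification on the client node sees a complete snapshot. All class and path names below are assumptions for illustration, not ExportSnapshot internals:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Conceptual sketch of the proposed reduce-side aggregation (hypothetical
// code, not from ExportSnapshot): merge per-mapper output directories into
// a single local target so manifest and data live on one node.
public class SnapshotAggregator {

    // Copy every file from each mapper's scratch directory into targetDir,
    // preserving relative paths (e.g. archive/data/default/...).
    static void aggregate(List<Path> mapperDirs, Path targetDir) throws IOException {
        for (Path src : mapperDirs) {
            try (var files = Files.walk(src)) {
                for (Path p : (Iterable<Path>) files::iterator) {
                    if (Files.isRegularFile(p)) {
                        Path dest = targetDir.resolve(src.relativize(p));
                        Files.createDirectories(dest.getParent());
                        Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch dirs standing in for per-node map output.
        Path a = Files.createTempDirectory("mapper-a");
        Path b = Files.createTempDirectory("mapper-b");
        Files.createDirectories(a.resolve("archive/data/default"));
        Files.writeString(a.resolve("archive/data/default/hfile-1"), "region data 1");
        Files.createDirectories(b.resolve("archive/data/default"));
        Files.writeString(b.resolve("archive/data/default/hfile-2"), "region data 2");

        Path target = Files.createTempDirectory("snap_export");
        aggregate(List.of(a, b), target);

        // After aggregation both files sit under one local directory tree.
        System.out.println(Files.exists(target.resolve("archive/data/default/hfile-1")));
        System.out.println(Files.exists(target.resolve("archive/data/default/hfile-2")));
    }
}
```

Note this only illustrates the end state; in a real MapReduce job the reducer runs on some NodeManager chosen by YARN, not necessarily the client node, which is part of why problem (2) remains open.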



> Export snapshot to local file system of a single node
> -----------------------------------------------------
>
>                 Key: HBASE-16959
>                 URL: https://issues.apache.org/jira/browse/HBASE-16959
>             Project: HBase
>          Issue Type: New Feature
>          Components: snapshots
>            Reporter: Xiang Li
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
