hbase-issues mailing list archives

From "Matteo Bertozzi (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-7253) Compaction Tool
Date Mon, 03 Dec 2012 20:39:58 GMT

     [ https://issues.apache.org/jira/browse/HBASE-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-7253:
-----------------------------------

    Release Note: 
The CompactionTool works at the file-system level, so the table should be disabled before running it.

The compaction process uses the same hbase-site.xml configuration properties as the server,
such as "hbase.hstore.compactionThreshold" and related settings.
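
As a rough sketch of overriding such a property for a single run: the earlier usage text for this tool lists a [-D<property=value>] option, so the command below assumes -D is still accepted. The threshold value 5 is purely illustrative, and the command is only printed, not executed.

```shell
#!/bin/sh
# Assumption: CompactionTool still accepts -D<property=value>, as its earlier
# usage text states. The value 5 is an illustrative choice, not a recommendation.
THRESHOLD=5
TARGET="hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951/cf1"

# Build the command line as a string so it can be reviewed before running it.
CMD="bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -Dhbase.hstore.compactionThreshold=$THRESHOLD $TARGET"

printf '%s\n' "$CMD"
```

Properties not passed this way are expected to come from the hbase-site.xml on the classpath.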

You can compact the whole table, a single region, or a single family.
The input to the CompactionTool is a file-system path.

You can run the compaction as a MapReduce job or as a local process.
Each family can be compacted in parallel if you use the -mapred option.

To compact family "cf1" of region "e450da04b1a10099b618bec031e0f951" in "TestTable":
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951/cf1

To compact all the families of region "e450da04b1a10099b618bec031e0f951":
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951

To compact all regions and families of the table:
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred hdfs:///hbase/TestTable
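
The three invocations above differ only in how deep the path goes (table, region, family). A minimal sketch of that pattern follows. The helper names compaction_path and compaction_cmd and the HBASE_ROOT variable are hypothetical, not part of HBase, and the script only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical helpers, not part of HBase. HBASE_ROOT is an assumption that
# matches the hdfs:///hbase prefix used in the examples above.
HBASE_ROOT="${HBASE_ROOT:-hdfs:///hbase}"

# compaction_path TABLE [REGION [FAMILY]]
# Prints the fs path for a whole table, one region, or one family.
compaction_path() {
  path="$HBASE_ROOT/$1"
  if [ -n "$2" ]; then path="$path/$2"; fi
  if [ -n "$3" ]; then path="$path/$3"; fi
  printf '%s\n' "$path"
}

# compaction_cmd [-mapred] TABLE [REGION [FAMILY]]
# Prints the CompactionTool command line; it does not run anything.
compaction_cmd() {
  opts=""
  if [ "$1" = "-mapred" ]; then opts=" -mapred"; shift; fi
  printf 'bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool%s %s\n' \
    "$opts" "$(compaction_path "$@")"
}

# Remember to disable the table first (HBase shell: disable 'TestTable').
compaction_cmd TestTable e450da04b1a10099b618bec031e0f951 cf1
compaction_cmd -mapred TestTable
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without a cluster; the printed line can be pasted once the table is disabled.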

  was:
Tool to run compactions external to hbase:

Usage: java " + this.getClass().getName() +  [-compactOnce] [-mapred] [-D<property=value>]*
files...
Options:
 mapred         Use MapReduce to run compaction.
 compactOnce    Execute just one compaction step. (default: while needed)
Note: -D properties will be applied to the conf used.
For example:
 To preserve input files, pass -D"+CONF_COMPLETE_COMPACTION+"=false"
 To stop delete of compacted file, pass -D"+CONF_DELETE_COMPACTED+"=false"
 To set tmp dir, pass -D"+CONF_TMP_DIR+"=ALTERNATE_DIR"

Examples:
 To compact the full 'TestTable' using MapReduce:
 $ bin/hbase " + this.getClass().getName() + " -mapred hdfs:///hbase/TestTable"
 To compact column family 'x' of the table 'TestTable' region 'abc':
 $ bin/hbase " + this.getClass().getName() + " hdfs:///hbase/TestTable/abc/x"

    Hadoop Flags:   (was: Reviewed)
    
> Compaction Tool
> ---------------
>
>                 Key: HBASE-7253
>                 URL: https://issues.apache.org/jira/browse/HBASE-7253
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>    Affects Versions: 0.96.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>             Fix For: 0.96.0
>
>         Attachments: HBASE-7253-v0.patch, HBASE-7253-v1.patch
>
>
> In HBASE-5616, as part of the compaction code refactor, a CompactionTool was added,
> but there are some issues:
> * The tool is under test/
> * mockito is required, so the "test" scope should be removed from the pom.xml; otherwise
> the tool doesn't start
> * The mock used by the tool mocks HRegion.getRegionInfo(), but some code (Store)
> uses HRegion.regionInfo directly (HStore.java#L2021, HStore.java#L1389, HStore.java#L1402),
> and you end up with an NPE in the tool.
> * The mocked Store uses a dummy family, and the compacted files don't get the same family
> properties specified (compression, encoding, ...)
> * At the end of the compaction (CompactionTool.java#L155), on by default, the compaction file
> is removed (note that the compacted ones are already removed inside store.compact()),
> and you end up with an empty dir if you compact everything.
> I've fixed some stuff and added support to:
>  * Run the compaction as a MR Job
>  * Specify a Table (compact each region/family)
>  * Specify a Region (compact each family)
>  * Specify a Family (as before)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
