hive-issues mailing list archives

From "Wei Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-13354) Add ability to specify Compaction options per table and per request
Date Tue, 24 May 2016 05:26:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297708#comment-15297708 ]

Wei Zheng commented on HIVE-13354:
----------------------------------

Thanks [~ekoifman] for the review.
1. I moved the setConf later to make it clearer.
2. You're right. "ready for cleaning" was caused by the SQL failure in CompactionTxnHandler. After
fixing the mismatched "?" placeholders, I got a "succeeded" response.
3. "size4" is an artifact of the jobConf serialization scheme: each string is prefixed with its
length, so the "4" is the length of the value "8192". The complete output of
job.get("hive.compactor.table.props") is this:
{code}
11:9:totalSize4:207617:orc.compress.size4:819253:compactorthreshold.hive.compactor.delta.pct.threshold3:0.57:numRows1:711:rawDataSize1:021:COLUMN_STATS_ACCURATE22:{"BASIC_STATS":"true"}53:compactorthreshold.hive.compactor.delta.num.threshold1:48:numFiles1:421:transient_lastDdlTime10:146403755713:transactional4:true33:compactor.mapreduce.map.memory.mb4:2048
{code}
4. Deprecated the old compact() signature.
5. Fixed the mismatched number of value entries in the insert statement.
6. Removed cc_tblproperties from purgeCompactionHistory().
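
For reference, the string in item 3 can be read back by hand. The sketch below is inferred only from the output shown above, not taken from the Hive source: it assumes the layout is an entry count followed by ":", then each key and each value written as its length, ":", and the text itself. The function name decode_length_prefixed_map is hypothetical.

```python
def decode_length_prefixed_map(s):
    """Decode '<count>:' followed by count pairs of '<len>:<text>' strings.

    The layout is an assumption inferred from the sample output in this
    message; it is not copied from Hive's own serialization code.
    """
    pos = 0

    def read_int():
        # Read the decimal number that ends at the next ':'.
        nonlocal pos
        colon = s.index(":", pos)
        n = int(s[pos:colon])
        pos = colon + 1
        return n

    def read_str():
        # Read a length prefix, then exactly that many characters.
        nonlocal pos
        n = read_int()
        text = s[pos:pos + n]
        if len(text) != n:
            raise ValueError("string ended before the declared length")
        pos += n
        return text

    count = read_int()
    result = {}
    for _ in range(count):
        key = read_str()
        result[key] = read_str()
    return result


# The exact value of job.get("hive.compactor.table.props") quoted above.
sample = (
    '11:9:totalSize4:207617:orc.compress.size4:819253:'
    'compactorthreshold.hive.compactor.delta.pct.threshold3:0.57:numRows1:7'
    '11:rawDataSize1:021:COLUMN_STATS_ACCURATE22:{"BASIC_STATS":"true"}'
    '53:compactorthreshold.hive.compactor.delta.num.threshold1:48:numFiles1:4'
    '21:transient_lastDdlTime10:146403755713:transactional4:true'
    '33:compactor.mapreduce.map.memory.mb4:2048'
)

props = decode_length_prefixed_map(sample)
for key, value in props.items():
    print(key, "=", value)
```

Read this way, "orc.compress.size4:8192" is the key orc.compress.size followed by the 4-character value 8192, which is where the apparent "size4" comes from.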

> Add ability to specify Compaction options per table and per request
> -------------------------------------------------------------------
>
>                 Key: HIVE-13354
>                 URL: https://issues.apache.org/jira/browse/HIVE-13354
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Eugene Koifman
>            Assignee: Wei Zheng
>              Labels: TODOC2.1
>         Attachments: HIVE-13354.1.patch, HIVE-13354.1.withoutSchemaChange.patch, HIVE-13354.2.patch
>
>
> Currently there are a few options that determine when automatic compaction is triggered.
> They are specified once for the warehouse.
> This doesn't make sense - some tables may be more important and need to be compacted more
> often.
> We should allow specifying these on a per-table basis.
> Also, compaction is an MR job launched from within the metastore. There is currently
> no way to control job parameters (like memory, for example) except to specify them in
> hive-site.xml for the metastore, which means they are site-wide.
> We should add a way to specify these per table (perhaps even per compaction if launched
> via ALTER TABLE).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
