hive-dev mailing list archives

From "Roshan Naik (JIRA)" <>
Subject [jira] [Created] (HIVE-5143) Streaming - Compaction of partitions
Date Fri, 23 Aug 2013 01:09:51 GMT
Roshan Naik created HIVE-5143:

             Summary: Streaming - Compaction of partitions
                 Key: HIVE-5143
             Project: Hive
          Issue Type: Sub-task
            Reporter: Roshan Naik
            Assignee: Roshan Naik

Task is to support compaction of partitions.

Rationale: Streaming partitions are composed of a large number of small files (each commit
is one file). Since compaction can be an expensive operation (e.g. converting the partition
to a single ORC file), we do not compact the streaming partition at the time of rolling it into
a standard partition. This allows rolling to be quick and atomic.

Compaction will be performed at a later time. The streaming partition is converted as is (typically
with many small files) into a standard partition. This new standard partition will then be queued
up for compaction by a separate job.

This decouples the compaction feature from streaming support and makes it generally
available for any partition.
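
A minimal sketch of the flow described above, assuming an in-memory model rather than real Hive code. The names Partition, roll, compact_next and compaction_queue are hypothetical and are not Hive APIs; each "file" is just a list of rows. It only illustrates that rolling leaves the small files untouched and that a separate step merges them later.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical in-memory stand-in for a table partition."""
    name: str
    files: list = field(default_factory=list)  # each file is a list of rows
    streaming: bool = True


# Partitions waiting for the separate compaction job (hypothetical queue).
compaction_queue = deque()


def roll(partition):
    """Convert a streaming partition to a standard one as is, without merging files."""
    partition.streaming = False         # cheap metadata change; files are untouched
    compaction_queue.append(partition)  # compaction is deferred to a later job


def compact_next():
    """Separate job: merge all files of the next queued partition into one file."""
    if not compaction_queue:
        return None
    partition = compaction_queue.popleft()
    merged = [row for f in partition.files for row in f]  # keeps commit order
    partition.files = [merged]
    return partition


# Three commits produced three small files.
p = Partition("dt=2013-08-23", files=[["a"], ["b", "c"], ["d"]])
roll(p)          # quick: p is now a standard partition with three small files
compact_next()   # later: p holds a single merged file
```

Because compact_next only looks at the queue, any standard partition could be enqueued for compaction, not just one that started out as a streaming partition.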
