hive-dev mailing list archives

From Sachin Pasalkar <>
Subject How to enable compaction for table with external data?
Date Mon, 14 Sep 2015 10:03:31 GMT


We are writing ORC files directly from a Storm topology instead of using Hive streaming (due
to a performance issue with our data). However, we still want to compact the data, so we set
the "NO_AUTO_COMPACTION"="false" option on the table we created to read the data (1.6 GB
scattered across multiple small ORC files). Does "NO_AUTO_COMPACTION" mean that data will not
be compacted while Hive streaming is used? If not, why did it not compact our data into one file?
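
To make the setting concrete, here is a minimal sketch of the kind of statement that sets this table property. The table name `events_orc` and the helper class are hypothetical, used only for illustration. The code only builds the HiveQL string and does not connect to Hive.

```java
// Hypothetical helper: builds the HiveQL that sets NO_AUTO_COMPACTION on a table.
// The table name passed in is an assumption for illustration only.
class CompactionProperty {

    // Returns an ALTER TABLE statement that sets the NO_AUTO_COMPACTION table property.
    // disabled = false leaves automatic compaction allowed; true turns it off for the table.
    static String setNoAutoCompaction(String table, boolean disabled) {
        return "ALTER TABLE " + table
                + " SET TBLPROPERTIES (\"NO_AUTO_COMPACTION\"=\"" + disabled + "\")";
    }

    public static void main(String[] args) {
        // Print the statement for a hypothetical table.
        System.out.println(setNoAutoCompaction("events_orc", false));
    }
}
```

The resulting string could be run through whatever Hive client is already in use.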

We also tried triggering compaction manually from Java code using the compact API of
org.apache.hadoop.hive.metastore.txn.TxnHandler. When we run SHOW COMPACTIONS, it shows that
compaction has started, but the data is still not compacted. I don't want to run the manual
commands from the command line.

Is there any way?

PS: We are writing all files in one directory only.

