hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alan Gates <alanfga...@gmail.com>
Subject Re: Hive compaction didn't launch
Date Thu, 28 Jul 2016 16:06:59 GMT
Hive is doing the right thing there, as it cannot compact the deltas into a base file while
there are still open transactions in the delta.  Storm should be committing on some frequency
even if it doesn’t have enough data to commit.

Alan.

> On Jul 28, 2016, at 05:36, Igor Kuzmenko <f1sherox@gmail.com> wrote:
> 
> I made some research on that issue.
> The problem is in ValidCompactorTxnList::isTxnRangeValid method.
> 
> Here's code:
> @Override
> public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
>   if (highWatermark < minTxnId) {
>     return RangeResponse.NONE;
>   } else if (minOpenTxn < 0) {
>     return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
>   } else {
>     return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
>   }
> }
> 
> In my case this method returned RangeResponce.NONE for most of delta files. With this
value delta file doesn't include in compaction.
> 
> Last 'else' bock compare minOpenTxn to maxTxnId and if maxTxnId bigger return RangeResponce.NONE,
thats a problem for me, because of using Storm Hive Bolt. Hive Bolt gets transaction and maintain
it open with heartbeat until there's data to commit.
> 
> So if i get transaction and maintain it open all compactions will stop. Is it incorrect
Hive behavior, or Storm should close transaction?
> 
> 
> 
> 
> On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1sherox@gmail.com> wrote:
> Thanks for reply, Alan. My guess with Storm was wrong. Today I get same behavior with
running Storm topology. 
> Anyway, I'd like to know, how can I check that transaction batch was closed correctly?
> 
> On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfgates@gmail.com> wrote:
> I don’t know the details of how the storm application that streams into Hive works,
but this sounds like the transaction batches weren’t getting closed.  Compaction can’t
happen until those batches are closed.  Do you know how you had storm configured?  Also, you
might ask separately on the storm list to see if people have seen this issue before.
> 
> Alan.
> 
> > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1sherox@gmail.com> wrote:
> >
> > One more thing. I'm using Apache Storm to stream data in Hive. And when I turned
off Storm topology compactions started to work properly.
> >
> > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1sherox@gmail.com> wrote:
> > I'm using Hive 1.2.1 transactional table. Inserting data in it via Hive Streaming
API. After some time i expect compaction to start but it didn't happen:
> >
> > Here's part of log, which shows that compactor initiator thread doesn't see any
delta files:
> > 2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator (Initiator.java:run(89))
- Checking to see if we should compact default.data_aaa.dt=20160726
> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils (AcidUtils.java:getAcidState(432))
- in directory hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
base = null deltas = 0
> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator (Initiator.java:determineCompactionType(271))
- delta size: 0 base size: 0 threshold: 0.1 will major compact: false
> >
> > But in that directory there's actually 23 files:
> >
> > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
> > Found 23 items
> > -rw-r--r--   3 storm hdfs          4 2016-07-26 17:20 /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:22 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:23 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:25 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:26 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:27 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:29 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:30 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:32 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:33 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:34 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:36 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:37 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:39 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:40 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:41 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:43 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:44 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:46 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:47 /apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:48 /apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50 /apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50 /apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955
> >
> > Full log here.
> >
> > What could go wrong?
> >
> 
> 
> 


Mime
View raw message