hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <>
Subject [jira] [Commented] (HIVE-11444) ACID Compactor should generate stats/alerts
Date Tue, 04 Apr 2017 22:51:41 GMT


Eugene Koifman commented on HIVE-11444:

More generally, raise an alert:
1. if there are too many open txns
2. if there are too many aborted txns - most likely a misconfigured streaming ingest client. Need to include client info in the alert.
3. if there are a lot of entries in TXN_COMPONENTS - means the compactor is not keeping up

In extreme cases both can cause the amount of metadata to slow down the metastore operations
(TxnHandler/CompactionTxnHandler) and use very large amounts of RAM (ValidTxnList).
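
A minimal sketch of the three checks above, for illustration only. The class names, the snapshot fields, and the threshold values are all hypothetical and are not part of Hive; how the counts would actually be read from the metastore is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical snapshot of transaction metadata counts; not a Hive class. */
final class TxnMetadataSnapshot {
    final long openTxns;
    final long abortedTxns;
    final long txnComponentsRows;
    final String abortingClient; // assumed: identifier of the client behind the aborts

    TxnMetadataSnapshot(long openTxns, long abortedTxns,
                        long txnComponentsRows, String abortingClient) {
        this.openTxns = openTxns;
        this.abortedTxns = abortedTxns;
        this.txnComponentsRows = txnComponentsRows;
        this.abortingClient = abortingClient;
    }
}

/** Hypothetical checker; thresholds are supplied by the caller and are illustrative. */
final class TxnAlertChecker {
    private final long maxOpenTxns;
    private final long maxAbortedTxns;
    private final long maxTxnComponentsRows;

    TxnAlertChecker(long maxOpenTxns, long maxAbortedTxns, long maxTxnComponentsRows) {
        this.maxOpenTxns = maxOpenTxns;
        this.maxAbortedTxns = maxAbortedTxns;
        this.maxTxnComponentsRows = maxTxnComponentsRows;
    }

    /** Returns one message per violated condition, in the order 1, 2, 3 listed above. */
    List<String> check(TxnMetadataSnapshot s) {
        List<String> alerts = new ArrayList<>();
        if (s.openTxns > maxOpenTxns) {
            alerts.add("Too many open txns: " + s.openTxns);
        }
        if (s.abortedTxns > maxAbortedTxns) {
            // Condition 2 asks for client info so the misconfigured client can be found.
            alerts.add("Too many aborted txns: " + s.abortedTxns
                    + " (client: " + s.abortingClient + ")");
        }
        if (s.txnComponentsRows > maxTxnComponentsRows) {
            alerts.add("TXN_COMPONENTS has " + s.txnComponentsRows
                    + " entries; compactor may not be keeping up");
        }
        return alerts;
    }
}
```

A caller would build a snapshot on each periodic run and log whatever `check` returns at warn level; suitable thresholds would have to be chosen per deployment.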

> ACID Compactor should generate stats/alerts
> -------------------------------------------
>                 Key: HIVE-11444
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
> Compaction should generate stats about number of files it reads, min/max/avg size etc.
> It should also generate alerts if it looks like the system is not configured correctly.
> For example, if there are lots of delta files with very small files, it's a good sign
> that Streaming API is configured with batches that are too small.
> Simplest idea is to add another periodic task to AcidHouseKeeperService to
>         //periodically do select count(*), min(txnid), max(txnid), type from txns group by type.
>         //1. dump that to log file at info
>         //2. could also keep counts for last 10min, hour, 6 hours, 24 hours, etc
>         //2.2 if a large increase is detected - issue alert (at least to the log for now) at warn/error
> Should also alert if there is ACID activity but no compactions running.
> One way to do this is to add logic to TxnHandler to periodically check contents of COMPACTION_QUEUE
> table and keep a simple histogram of compactions over last few hours.
> Similarly can run a periodic check of transactions started (or committed/aborted) and
> keep a simple histogram.  Then the 2 can be used to detect that there is ACID write activity
> but no compaction activity.
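
The two-histogram idea in the description could be sketched roughly as follows. This is an illustration under assumptions, not Hive code: `RollingHistogram`, `CompactionActivityMonitor`, the bucket count, and the `minTxnsForAlert` threshold are all hypothetical, and the per-interval counts are assumed to come from whatever periodic check reads the transaction and COMPACTION_QUEUE tables.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Fixed-size histogram of per-interval event counts; hypothetical helper. */
final class RollingHistogram {
    private final int maxBuckets;
    private final Deque<Long> buckets = new ArrayDeque<>();

    RollingHistogram(int maxBuckets) {
        if (maxBuckets <= 0) {
            throw new IllegalArgumentException("maxBuckets must be positive");
        }
        this.maxBuckets = maxBuckets;
    }

    /** Adds the count for the latest interval, evicting the oldest bucket when full. */
    void record(long count) {
        if (buckets.size() == maxBuckets) {
            buckets.removeFirst();
        }
        buckets.addLast(count);
    }

    /** Sum of all counts currently retained in the window. */
    long total() {
        long sum = 0;
        for (long c : buckets) {
            sum += c;
        }
        return sum;
    }
}

/** Combines a transaction histogram and a compaction histogram; hypothetical. */
final class CompactionActivityMonitor {
    private final RollingHistogram txns;
    private final RollingHistogram compactions;
    private final long minTxnsForAlert;

    CompactionActivityMonitor(int buckets, long minTxnsForAlert) {
        this.txns = new RollingHistogram(buckets);
        this.compactions = new RollingHistogram(buckets);
        this.minTxnsForAlert = minTxnsForAlert;
    }

    /** Called once per check interval with the counts observed in that interval. */
    void recordInterval(long txnCount, long compactionCount) {
        txns.record(txnCount);
        compactions.record(compactionCount);
    }

    /** True when the window shows enough write activity but no compactions at all. */
    boolean writesWithoutCompaction() {
        return txns.total() >= minTxnsForAlert && compactions.total() == 0;
    }
}
```

With one bucket per hour and a handful of buckets, `writesWithoutCompaction()` returning true would correspond to "ACID write activity but no compaction activity" over the last few hours. Requiring zero compactions is the simplest possible rule; a ratio between the two totals would be an alternative.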
