nifi-issues mailing list archives

From "Peter Turcsanyi (Jira)" <>
Subject [jira] [Updated] (NIFI-7740) Add Records Per Transaction and Transactions Per Batch to PutHive3Streaming
Date Tue, 01 Sep 2020 15:32:00 GMT


Peter Turcsanyi updated NIFI-7740:
    Fix Version/s: 1.13.0
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> Add Records Per Transaction and Transactions Per Batch to PutHive3Streaming
> ---------------------------------------------------------------------------
>                 Key: NIFI-7740
>                 URL:
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>            Priority: Major
>             Fix For: 1.13.0
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
> The original PutHiveStreaming processor (for Hive 1.2.x) exposed properties for tuning
the number of records in an individual Hive Streaming transaction, as well as the number of
transactions to be batched together (for performance).
> These properties should also be exposed in the PutHive3Streaming processor so that its
performance can be tuned. The default values should preserve the current behavior: a setting
of zero for Records Per Transaction puts all records into a single transaction, and a setting
of 1 for Transactions Per Batch results in a single transaction in each batch.
> For large files, Records Per Transaction should be set to something more manageable,
such as 100K, and Transactions Per Batch to something such as 10. As a rule, the product
of the two numbers should be larger than the largest expected number of records in the flow
file(s). This ensures that any failed transaction batch causes a full rollback. The documentation
for these properties should include this recommendation.
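
The arithmetic behind the sizing rule above can be sketched as follows. This is only a simplified model of the two settings as the description states them, not the processor's implementation; the function name `plan_batches` and its return shape are hypothetical, and treating a product equal to the record count as still fitting in one batch is an assumption of this model.

```python
import math


def plan_batches(total_records, records_per_transaction=0, transactions_per_batch=1):
    """Model how a flow file's records split into transactions and batches.

    Hypothetical helper for illustration only. Returns a tuple
    (transactions, batches, fits_in_one_batch).
    """
    if total_records < 0 or records_per_transaction < 0 or transactions_per_batch < 1:
        raise ValueError("counts must be non-negative and transactions_per_batch >= 1")

    if records_per_transaction == 0:
        # Zero means "all records in a single transaction" (the default).
        transactions = 1 if total_records > 0 else 0
    else:
        transactions = math.ceil(total_records / records_per_transaction)

    batches = math.ceil(transactions / transactions_per_batch)

    # If everything fits in one batch, a failure in that batch covers the
    # whole flow file, which is the condition the sizing rule aims for.
    return transactions, batches, batches <= 1


if __name__ == "__main__":
    # Defaults: one transaction, one batch.
    print(plan_batches(1_000_000))
    # Suggested large-file settings: 100K records x 10 transactions.
    print(plan_batches(1_000_000, 100_000, 10))
    # A file larger than the product spills into several batches.
    print(plan_batches(2_500_000, 100_000, 10))
```

In this model, a flow file of 2,500,000 records with the suggested settings spans three batches, so a failure in a later batch would not cover records handled by an earlier one. That is the case the rule is meant to avoid.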

This message was sent by Atlassian Jira