crunch-dev mailing list archives

From "Micah Whitacre (JIRA)" <>
Subject [jira] [Updated] (CRUNCH-609) KafkaSource could skip data with slow broker response
Date Tue, 05 Jul 2016 21:06:11 GMT


Micah Whitacre updated CRUNCH-609:
    Attachment: CRUNCH-609d.patch

OK, last patch, I promise.  It fixes a bug in the poll-time backoff: the previous patch
multiplied the backoff too many times.  After talking with colleagues, the backoff isn't
really necessary b/c the Kafka Consumer will already be retrieving in a background thread,
so poll should eventually return regardless of how long we sleep.  So I simplified that
part of the code.

> KafkaSource could skip data with slow broker response
> -----------------------------------------------------
>                 Key: CRUNCH-609
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>          Components: IO
>            Reporter: Micah Whitacre
>            Assignee: Micah Whitacre
>         Attachments: CRUNCH-609.patch, CRUNCH-609b.patch, CRUNCH-609c.patch, CRUNCH-609d.patch
> In the case of consumer.poll timing out, the KafkaRecordReader could return with
> no values in the records iterator, which then causes the KafkaRecordReader to exit
> prematurely[1]. To fix this we probably need to better track the start/end offsets
> and track progress against them rather than relying on the null value.
> Additionally, if we wanted to be smarter we could do some "backoff" for the timeout, or
> just have the consumer specify a larger value.
> [1] -
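
For illustration only, here is a minimal sketch of the idea described in the issue:
track progress by offset and treat an empty poll as "try again" rather than "end of
data". This is not the code from the attached patches. The names OffsetTrackedReader,
Poller, readRange, scripted, and maxEmptyPolls are hypothetical, and the Poller
interface is a stand-in for the real consumer so the example runs without Kafka.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; not the code from the CRUNCH-609 patches. */
class OffsetTrackedReader {

    /** Stand-in for a consumer: returns whatever records are ready, possibly none. */
    interface Poller {
        List<String> poll(long fromOffset);
    }

    /**
     * Reads offsets [start, end). An empty poll does not end the read; the loop
     * only stops once the end offset is reached. As a safety valve (an assumption
     * of this sketch), it fails after maxEmptyPolls consecutive empty polls.
     */
    static List<String> readRange(Poller poller, long start, long end, int maxEmptyPolls) {
        List<String> out = new ArrayList<>();
        long next = start;      // next offset we still need
        int emptyPolls = 0;     // consecutive polls that returned nothing
        while (next < end) {
            List<String> batch = poller.poll(next);
            if (batch.isEmpty()) {
                if (++emptyPolls >= maxEmptyPolls) {
                    throw new IllegalStateException(
                        "no data at offset " + next + " after " + emptyPolls + " empty polls");
                }
                continue;       // slow broker: retry instead of exiting early
            }
            emptyPolls = 0;
            for (String record : batch) {
                if (next >= end) {
                    break;      // ignore records past the requested range
                }
                out.add(record);
                next++;
            }
        }
        return out;
    }

    /** Simulated slow broker: replays canned poll results, then returns empty lists. */
    static Poller scripted(List<List<String>> responses) {
        Iterator<List<String>> it = responses.iterator();
        return fromOffset -> it.hasNext() ? it.next() : Collections.<String>emptyList();
    }
}
```

With a scripted poller that returns two records, then nothing, then one more record,
readRange(poller, 0, 3, 3) keeps polling through the empty response and returns all
three records, whereas a reader that stops on the first empty iterator would return
only two.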

This message was sent by Atlassian JIRA
