beam-user mailing list archives

From: Thomas Groh <tg...@google.com>
Subject: Re: KafkaIO Questions
Date: Sat, 11 Jun 2016 00:46:15 GMT
If we're reading from an unbounded source and it reports its watermark as
BoundedWindow#TIMESTAMP_MAX_VALUE, the InProcessRunner won't reinvoke the
source; a call to start() returning false by itself just means that we
should call into the reader again later, and the output watermark should
still be held by the source.
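
Roughly, the contract looks like this (an illustrative sketch of a runner-side
polling loop, not the actual InProcessRunner code; printing stands in for
emitting the element downstream):

import org.apache.beam.sdk.io.UnboundedSource;
import org.apache.beam.sdk.transforms.windowing.BoundedWindow;

// Illustrative only: how start()/advance() and the watermark are meant to
// interact when a runner drains an unbounded reader.
static <T> void pollReader(UnboundedSource.UnboundedReader<T> reader) throws Exception {
  boolean available = reader.start();
  while (true) {
    if (available) {
      // A real runner would hand the element and its timestamp downstream here.
      System.out.println(reader.getCurrent() + " @ " + reader.getCurrentTimestamp());
    }
    // Only a watermark at TIMESTAMP_MAX_VALUE means the source is finished;
    // false from start()/advance() just means "no record yet, poll again later".
    if (!reader.getWatermark().isBefore(BoundedWindow.TIMESTAMP_MAX_VALUE)) {
      break;
    }
    available = reader.advance();
  }
  reader.close();
}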

On Fri, Jun 10, 2016 at 4:44 PM, Raghu Angadi <rangadi@google.com> wrote:

> It looks like InProcessPipelineRunner instantiates the source, calls
> start() on it, and immediately closes it. In this case start() returns
> false and the runner seems to think the source is done (which is incorrect?)
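>
> In pseudo-code, the sequence we observed looks roughly like this (an
> illustrative sketch against the UnboundedSource API, not the actual runner
> internals; source and options are assumed to be in scope):
>
>   UnboundedSource.UnboundedReader<?> reader = source.createReader(options, null);
>   boolean hasRecord = reader.start();  // returns false: no Kafka record available yet
>   reader.close();                      // runner closes the reader right away and
>                                        // appears to treat the source as finished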
>
> On Fri, Jun 10, 2016 at 4:24 PM, Jesse Anderson <jesse@smokinghand.com>
> wrote:
>
>> Raghu and I spent some time on a hangout looking at this issue. It looks
>> like there is an issue with unbounded collections from KafkaIO
>> on InProcessPipelineRunner.
>>
>> We changed the code to use a bounded collection with withMaxNumRecords and
>> used DirectPipelineRunner. That worked and processed the messages.
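>>
>> For reference, the bounded variant looked roughly like this (a minimal
>> sketch: the broker address and record count are placeholders, args comes
>> from main(), and the exact KafkaIO builder methods may differ between Beam
>> versions):
>>
>> import org.apache.beam.sdk.Pipeline;
>> import org.apache.beam.sdk.io.kafka.KafkaIO;
>> import org.apache.beam.sdk.options.PipelineOptions;
>> import org.apache.beam.sdk.options.PipelineOptionsFactory;
>> import com.google.common.collect.ImmutableList;
>>
>> // Runner is chosen via options, e.g. --runner=DirectPipelineRunner or
>> // --runner=InProcessPipelineRunner on the command line.
>> PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
>> Pipeline p = Pipeline.create(options);
>>
>> p.apply(KafkaIO.read()
>>     .withBootstrapServers("localhost:9092")      // placeholder broker address
>>     .withTopics(ImmutableList.of("eventsim"))    // topic from the log output below
>>     .withMaxNumRecords(1000)                     // bounds the otherwise unbounded read
>>     .withoutMetadata());
>>
>> p.run();
>>
>> Dropping the withMaxNumRecords(...) line gives the unbounded case that
>> fails on InProcessPipelineRunner.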
>>
>> Next, we used InProcessPipelineRunner with a bounded collection. That
>> worked and processed the messages.
>>
>> We changed it back to an unbounded collection
>> using InProcessPipelineRunner. That didn't work and continued to output
>> error messages similar to the ones I've shown on the thread.
>>
>> Thanks,
>>
>> Jesse
>>
>>
>> On Wed, Jun 8, 2016 at 7:12 PM Jesse Anderson <jesse@smokinghand.com>
>> wrote:
>>
>>> I tried a 0.9.0 broker and got the same error. Not sure if it makes a
>>> difference, but I'm using Confluent Platform 2.0 and 3.0 for this testing.
>>>
>>> On Wed, Jun 8, 2016 at 5:20 PM Jesse Anderson <jesse@smokinghand.com>
>>> wrote:
>>>
>>>> Still open to screensharing and resolving over a hangout.
>>>>
>>>> On Wed, Jun 8, 2016 at 5:19 PM Raghu Angadi <rangadi@google.com> wrote:
>>>>
>>>>> On Wed, Jun 8, 2016 at 1:56 PM, Jesse Anderson <jesse@smokinghand.com>
>>>>> wrote:
>>>>>
>>>>>> [pool-2-thread-1] INFO org.apache.beam.sdk.io.kafka.KafkaIO -
>>>>>> Reader-0: resuming eventsim-0 at default offset
>>>>>>
>>>>> [...]
>>>>>>
>>>>>> [pool-2-thread-1] INFO org.apache.kafka.common.utils.AppInfoParser -
>>>>>> Kafka commitId : 23c69d62a0cabf06
>>>>>> [pool-2-thread-1] WARN org.apache.beam.sdk.io.kafka.KafkaIO -
>>>>>> Reader-0: getWatermark() : no records have been read yet.
>>>>>> [pool-77-thread-1] INFO org.apache.beam.sdk.io.kafka.KafkaIO -
>>>>>> Reader-0: Returning from consumer pool loop
>>>>>> [pool-78-thread-1] WARN org.apache.beam.sdk.io.kafka.KafkaIO -
>>>>>> Reader-0: exception while fetching latest offsets. ignored.
>>>>>>
>>>>>
>>>>> This reader is closed before the exception. The exception is due to an
>>>>> action during close and can be ignored. The main question is why this
>>>>> reader was closed...
>>>>>
>>>>
>
