kylin-user mailing list archives

From Andras Nagy <andras.istvan.n...@gmail.com>
Subject Question about handling late arrival in streaming ingestion
Date Wed, 22 May 2019 14:11:52 GMT
Dear All,

I have a question about how streaming ingestion handles events that arrive
significantly later than their logical event timestamp.

In the blog post from 2016 at
http://kylin.apache.org/blog/2016/10/18/new-nrt-streaming/ , I read this:
"To let the late/early message can be queried, Cube segments allow overlap
for the partition time dimension: each segment has a “min” date/time and a
“max” date/time; Kylin will scan all segments which matched with the
queried time scope. Figure 2 illurates this. ..."
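
If I read the quoted passage correctly, segment selection would work roughly
like the sketch below. This is only my own illustration in Python, not Kylin
code: the Segment class, the segments_to_scan function, and the sample times
are hypothetical, and I am assuming closed [min, max] ranges.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a cube segment with a min/max event time."""
    name: str
    min_time: datetime
    max_time: datetime


def segments_to_scan(segments, query_start, query_end):
    """Return every segment whose [min_time, max_time] range intersects
    the queried range [query_start, query_end] (both ends inclusive)."""
    return [
        s for s in segments
        if s.min_time <= query_end and s.max_time >= query_start
    ]


def t(hour, minute):
    # Helper for readable sample timestamps on a single (arbitrary) day.
    return datetime(2019, 5, 22, hour, minute)


# seg_b was built after seg_a but contains a late event stamped 10:03,
# so its time range overlaps seg_a's range.
segments = [
    Segment("seg_a", t(10, 0), t(10, 5)),
    Segment("seg_b", t(10, 3), t(10, 10)),
    Segment("seg_c", t(10, 11), t(10, 15)),
]

# A query for 10:04..10:05 has to scan both overlapping segments.
hits = [s.name for s in segments_to_scan(segments, t(10, 4), t(10, 5))]
```

With these sample values, a query for 10:04 to 10:05 would need both seg_a and
seg_b, because the late event widened seg_b's range into seg_a's.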

On the other hand, I found a ticket:
https://issues.apache.org/jira/browse/KYLIN-1210 titled "Allowing segment
overlap to solve streaming data completeness problem" which seems to be
about the same issue, but its status is Open/unresolved.

There is also another ticket:
https://issues.apache.org/jira/browse/KYLIN-1744 titled "Separate concepts
of source offset and date range on cube segments", which also seems to be
related. This one is Closed/Fixed in 1.5.3.

Could you please help clarify the current status of this capability?
What is the current best practice for handling late-arriving events in
Kylin?

Many thanks,
Andras
