flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Multiple windows with large number of partitions
Date Tue, 03 May 2016 14:06:33 GMT
Yes, please go ahead. That would be helpful.

On Mon, 2 May 2016 at 21:56 Christopher Santiago <chris@ninjametrics.com>
wrote:

> Hi Aljoscha,
>
> Yes, there is still a high partition/window count, since I have to keyBy
> the userid so that I get unique users.  I believe that either the second
> window (the timeWindowAll) is not getting all the results, or the results
> from the previous window are changing while the second window is running.
> I can see the unique user count for a particular date increase and
> decrease while the job is running.
>
> I can share the Eclipse project and the sample data file I am working from
> with you if that would be helpful.
>
> Thanks,
> Chris
>
> On Mon, May 2, 2016 at 12:55 AM, Aljoscha Krettek [via Apache Flink User
> Mailing List archive.] <[hidden email]> wrote:
>
>> Hi,
>> what do you mean by "still experiencing the same issues"? Is the key
>> count still very high, i.e. 500k windows?
>>
>> For the watermark generation, specifying a lag of 2 days is very
>> conservative. With a watermark this conservative, I guess elements will
>> never arrive behind the watermark, so you wouldn't need the late-element
>> handling in your triggers. The late-element handling in Triggers is only
>> required to compensate for the fact that the watermark can be a heuristic
>> and is not always correct.
>>
>> Cheers,
>> Aljoscha
>>
>> On Thu, 28 Apr 2016 at 21:24 Christopher Santiago <[hidden email]> wrote:
>>
>>> Hi Aljoscha,
>>>
>>>
>>> Aljoscha Krettek wrote
>>> >> Is there a reason for keying on both the "date only" field and the
>>> >> "userid"? I think you should be fine by just specifying that you want
>>> >> 1-day windows on your timestamps.
>>>
>>> My mistake, this was from earlier tests that I had performed.  I removed
>>> it and went to keyBy(2), and I am still experiencing the same issues.
>>>
>>>
>>> Aljoscha Krettek wrote
>>> >> Also, do you have a timestamp extractor in place that takes the
>>> >> timestamp from your data and sets it as the internal timestamp field?
>>>
>>> Yes there is, it is from the BoundedOutOfOrdernessGenerator example:
>>>
>>>     public static class BoundedOutOfOrdernessGenerator
>>>             implements AssignerWithPeriodicWatermarks<Tuple3<DateTime, String, String>> {
>>>         private static final long serialVersionUID = 1L;
>>>         private final long maxOutOfOrderness = Time.days(2).toMilliseconds();
>>>         private long currentMaxTimestamp;
>>>
>>>         @Override
>>>         public long extractTimestamp(Tuple3<DateTime, String, String> element,
>>>                 long previousElementTimestamp) {
>>>             long timestamp = element.f0.getMillis();
>>>             currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
>>>             return timestamp;
>>>         }
>>>
>>>         @Override
>>>         public Watermark getCurrentWatermark() {
>>>             return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
>>>         }
>>>     }
>>>
>>> Thanks,
>>> Chris
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Multiple-windows-with-large-number-of-partitions-tp6521p6562.html
>>> Sent from the Apache Flink User Mailing List archive. mailing list
>>> archive at Nabble.com.
>>>

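The remark in the thread that a two-day lag leaves almost nothing behind the watermark can be illustrated without Flink. The following is a minimal plain-Java sketch, not the Flink API: the class name WatermarkLagSketch and the method observe are hypothetical, and it simplifies by recomputing the watermark on every element (Flink emits periodic watermarks at an interval) and by treating an element as late when its timestamp is at or below the current watermark. It only mirrors the arithmetic of the quoted generator, currentMaxTimestamp - maxOutOfOrderness.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; it does not use or depend on Flink.
class WatermarkLagSketch {
    // Same lag as the quoted generator: two days, in milliseconds.
    public static final long MAX_OUT_OF_ORDERNESS = TimeUnit.DAYS.toMillis(2);

    private long currentMaxTimestamp = 0L;

    // Watermark as computed in the quoted getCurrentWatermark().
    public long currentWatermark() {
        return currentMaxTimestamp - MAX_OUT_OF_ORDERNESS;
    }

    // Records an element timestamp and reports whether it arrived at or
    // behind the watermark that was current before this element was seen.
    public boolean observe(long timestamp) {
        boolean late = timestamp <= currentWatermark();
        currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        return late;
    }

    public static void main(String[] args) {
        long day = TimeUnit.DAYS.toMillis(1);
        WatermarkLagSketch sketch = new WatermarkLagSketch();
        // Element at day 10 moves the watermark to day 8.
        System.out.println("day 10 late? " + sketch.observe(10 * day));
        // One day out of order: still ahead of the watermark.
        System.out.println("day 9 late?  " + sketch.observe(9 * day));
        // Three days out of order: behind the watermark.
        System.out.println("day 7 late?  " + sketch.observe(7 * day));
    }
}
```

Under these assumptions, only data that is more than two days out of order ever lands behind the watermark, which is why the late-element handling in the triggers should rarely, if ever, fire with this lag.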