hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
Date Thu, 15 Oct 2015 20:47:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959587#comment-14959587 ]

Sangjin Lee commented on YARN-4179:

Yes, I know that Java's SimpleDateFormat is not thread-safe. However, there are ways to handle
this easily without incurring synchronization overhead. One pattern is:

static final ThreadLocal<DateFormat> DATE_FORMAT = new ThreadLocal<DateFormat>() {
  @Override protected DateFormat initialValue() {
    return new SimpleDateFormat(...);
  }
};
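
For illustration, here is a self-contained sketch of the ThreadLocal pattern. The class name ThreadSafeDates, the "yyyyMMdd" pattern (guessed from values such as "20151001" in this thread), and the UTC time zone are assumptions, not taken from the patch.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

final class ThreadSafeDates {
  // Assumed pattern; the thread only shows values such as "20151001".
  private static final String PATTERN = "yyyyMMdd";

  // Each thread lazily gets its own SimpleDateFormat, so no locking is needed.
  private static final ThreadLocal<DateFormat> DATE_FORMAT =
      new ThreadLocal<DateFormat>() {
        @Override protected DateFormat initialValue() {
          SimpleDateFormat format = new SimpleDateFormat(PATTERN);
          format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: UTC days
          format.setLenient(false); // reject values such as "20151340"
          return format;
        }
      };

  static String format(Date date) {
    return DATE_FORMAT.get().format(date);
  }

  static Date parse(String text) {
    try {
      return DATE_FORMAT.get().parse(text);
    } catch (ParseException e) {
      // Surface bad input as an unchecked exception to keep callers simple.
      throw new IllegalArgumentException("Invalid date: " + text, e);
    }
  }

  public static void main(String[] args) {
    Date day = parse("20151015");
    System.out.println(format(day)); // prints 20151015
  }
}
```

The instance returned by DATE_FORMAT.get() must not be handed to another thread, since that would reintroduce the thread-safety problem.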

When a Date is converted to JSON, it is represented as a long. Hence, when the JSON is parsed
on the client side, getInfo().get(DATE_INFO_KEY) returns a long. That is why the conversion is needed.

Got it. Thanks for the clarification.
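
The Date-to-long round trip described above can be sketched roughly as follows. The class DateInfoReader, the helper readDate, and the key value "DATE" are hypothetical; only getInfo().get(DATE_INFO_KEY) returning a long comes from the discussion.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

final class DateInfoReader {
  // Hypothetical key value; the thread only refers to the constant DATE_INFO_KEY.
  static final String DATE_INFO_KEY = "DATE";

  // JSON has no date type, so a Date written as epoch milliseconds comes back
  // as a number after parsing. Convert it back into a Date on the client side.
  static Date readDate(Map<String, Object> info) {
    Object value = info.get(DATE_INFO_KEY);
    if (value instanceof Date) {
      return (Date) value; // not serialized yet, e.g. on the server side
    }
    if (value instanceof Number) {
      return new Date(((Number) value).longValue());
    }
    throw new IllegalArgumentException("No usable date under " + DATE_INFO_KEY);
  }

  public static void main(String[] args) {
    Map<String, Object> info = new HashMap<>();
    info.put(DATE_INFO_KEY, 1444942025000L); // epoch milliseconds, as parsed from JSON
    System.out.println(readDate(info).getTime());
  }
}
```

Accepting any Number rather than only Long is a defensive choice here, since some JSON parsers return Integer for small values.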

And I have chosen a single query param, daterange (delimited by "-"), i.e. a specific date or
a range.
If we want to specify a startdate and an enddate, we will need 2 query params. If startdate is
not specified, every date from 1970 until enddate can be taken (constrained by limit),
and if enddate isn't specified, every date from startdate until today can be taken. Do you want
this approach?

So "20151001-20151031" would return all records between 10/1 and 10/31 (both inclusive), right?
And "20151001" would return records only for that date. Is either "20151001-" or "-20151001"
legal? If so, what would they do? The same as "20151001" (I suspect)?

I am fine with that approach, but I think there is a little bit of additional value in interpreting
"20151001-" to mean "20151001-(now)", and similarly "-20151001" to mean "(ages ago)-20151001"
(I wasn't suggesting having 2 query params; just a different interpretation of those values).
We can discuss whether it's worth supporting them.
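
To make the proposed interpretation concrete, here is a minimal sketch of a parser for the daterange value. The class DateRange and its methods are hypothetical, and it uses java.time (Java 8+) rather than SimpleDateFormat for brevity. Treating a bare "-" as an error is an assumption, as is using 1970-01-01 for "(ages ago)".

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class DateRange {
  // BASIC_ISO_DATE parses values such as "20151001".
  private static final DateTimeFormatter FORMAT = DateTimeFormatter.BASIC_ISO_DATE;
  // Assumed stand-in for "(ages ago)": 1970-01-01.
  private static final LocalDate AGES_AGO = LocalDate.ofEpochDay(0);

  final LocalDate start; // inclusive
  final LocalDate end;   // inclusive

  private DateRange(LocalDate start, LocalDate end) {
    if (start.isAfter(end)) {
      throw new IllegalArgumentException("start is after end: " + start + " > " + end);
    }
    this.start = start;
    this.end = end;
  }

  // "today" is passed in so that "20151001-" can resolve to "20151001-(now)".
  static DateRange parse(String text, LocalDate today) {
    int dash = text.indexOf('-');
    if (dash < 0) {
      LocalDate day = LocalDate.parse(text, FORMAT); // single date
      return new DateRange(day, day);
    }
    String from = text.substring(0, dash);
    String to = text.substring(dash + 1);
    if (from.isEmpty() && to.isEmpty()) {
      throw new IllegalArgumentException("empty date range"); // assumption: "-" is illegal
    }
    LocalDate start = from.isEmpty() ? AGES_AGO : LocalDate.parse(from, FORMAT);
    LocalDate end = to.isEmpty() ? today : LocalDate.parse(to, FORMAT);
    return new DateRange(start, end);
  }

  boolean contains(LocalDate day) {
    return !day.isBefore(start) && !day.isAfter(end);
  }

  @Override public String toString() {
    return start + ".." + end;
  }

  public static void main(String[] args) {
    LocalDate today = LocalDate.of(2015, 10, 15);
    for (String q : new String[] {"20151001-20151031", "20151001", "20151001-", "-20151001"}) {
      System.out.println(q + " -> " + parse(q, today));
    }
  }
}
```

In this sketch an open-ended range is still a bounded pair of dates after parsing, so the reader code that applies the limit would not need a separate code path for it.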

IMO open-ended queries do not make things worse. Note that the limit is always used (even
if the user did not provide one). Users can even query without any date range which is open-ended
on both sides. The limit is what makes the queries sane. The date range queries would always
be more constraining queries than those without.

> [reader implementation] support flow activity queries based on time
> -------------------------------------------------------------------
>                 Key: YARN-4179
>                 URL: https://issues.apache.org/jira/browse/YARN-4179
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>            Priority: Minor
>         Attachments: YARN-4179-YARN-2928.01.patch
> This came up as part of YARN-4074 and YARN-4075.
> Currently the only query pattern that's supported on the flow activity table is by cluster
> only. But it might be useful to support queries by cluster and certain date or dates.

This message was sent by Atlassian JIRA