drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-7223) Make the timeout in TimedCallable a configurable boot time parameter
Date Thu, 02 May 2019 01:22:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831372#comment-16831372 ]

ASF GitHub Bot commented on DRILL-7223:
---------------------------------------

amansinha100 commented on pull request #1776: DRILL-7223: Create an option to control timeout for REFRESH METADATA
URL: https://github.com/apache/drill/pull/1776#discussion_r280265323
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
 ##########
 @@ -354,6 +354,10 @@ private ExecConstants() {
       "enables statistics usage for varchar and decimal data types. Default is unset, i.e. empty string. " +
       "Allowed values: 'true', 'false', '' (empty string)."), "true", "false", "");
 
+  public static final String PARQUET_REFRESH_TIMEOUT = "store.parquet.refresh_timeout_per_runnable_in_msec";
 
 Review comment:
   We should avoid the word 'refresh' here and in other places for the timeout, since this parameter is intended for any timed runnable task, not just the ones initiated by the REFRESH command. For instance, in normal query planning without the metadata cache, the FooterGatherer also creates multiple `TimedCallable` threads to read Parquet footers directly.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Make the timeout in TimedCallable a configurable boot time parameter
> --------------------------------------------------------------------
>
>                 Key: DRILL-7223
>                 URL: https://issues.apache.org/jira/browse/DRILL-7223
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.16.0
>            Reporter: Aman Sinha
>            Assignee: Boaz Ben-Zvi
>            Priority: Minor
>             Fix For: 1.17.0
>
>
> The [TimedCallable.TIMEOUT_PER_RUNNABLE_IN_MSECS|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java#L52] is currently an internal Drill constant defined as 15 secs. It has been there since day 1 of its introduction. Drill's TimedCallable implements Java's java.util.concurrent.Callable interface to create timed threads. It is used by the REFRESH METADATA command, which creates multiple threads on the Foreman node to gather Parquet metadata and build the metadata cache.
> Depending on the load on the system, or with a very large number of Parquet files (millions), it is possible to exceed this timeout. While the exact root cause of exceeding the timeout is being investigated, it makes sense to make this timeout a configurable parameter to aid with large-scale testing. This JIRA is to make this a configurable bootstrapping option in the drill-override.
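
To illustrate the description above, here is a minimal sketch of running a batch of Callable tasks under a configurable per-task timeout instead of a hard-coded constant. It is not Drill's TimedCallable implementation. The class name ConfigurableTimeoutSketch, the method runAll, the system property "sketch.timeout_per_runnable_in_msec", and the rule for scaling the overall deadline are all assumptions made for illustration; only the 15 secs default comes from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Drill code.
final class ConfigurableTimeoutSketch {

  // Default of 15 secs, as stated in the description.
  static final long DEFAULT_TIMEOUT_PER_RUNNABLE_IN_MSECS = 15_000L;

  // Hypothetical property name; Drill would read a boot option instead.
  static final String TIMEOUT_PROPERTY = "sketch.timeout_per_runnable_in_msec";

  // Runs all tasks on a fixed-size pool and returns their results in order.
  // Throws java.util.concurrent.TimeoutException if the overall deadline passes.
  static <V> List<V> runAll(List<Callable<V>> tasks, int parallelism, long timeoutPerTaskMs)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      List<Future<V>> futures = new ArrayList<>();
      for (Callable<V> task : tasks) {
        futures.add(pool.submit(task));
      }
      // Assumption of this sketch: the overall budget is the per-task timeout
      // multiplied by the number of "waves" needed to run every task.
      long waves = (tasks.size() + parallelism - 1) / parallelism;
      long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutPerTaskMs * waves);
      List<V> results = new ArrayList<>();
      for (Future<V> future : futures) {
        long remainingNanos = Math.max(deadline - System.nanoTime(), 0L);
        results.add(future.get(remainingNanos, TimeUnit.NANOSECONDS));
      }
      return results;
    } finally {
      // Interrupt any task that is still running once we are done or have failed.
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    // Read the timeout from configuration, falling back to the default.
    long timeoutMs = Long.getLong(TIMEOUT_PROPERTY, DEFAULT_TIMEOUT_PER_RUNNABLE_IN_MSECS);
    List<Callable<String>> tasks = new ArrayList<>();
    tasks.add(() -> "footer-a");
    tasks.add(() -> "footer-b");
    System.out.println(runAll(tasks, 2, timeoutMs));
  }
}
```

With this shape, a large-scale test could raise the limit at startup (for example with -Dsketch.timeout_per_runnable_in_msec=60000 in this sketch) without rebuilding, which is the kind of knob this JIRA asks for.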



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
