drill-dev mailing list archives

From "Venki Korukanti (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-2080) Add IO wait time stats for Parquet and Json input files
Date Tue, 27 Jan 2015 04:00:39 GMT
Venki Korukanti created DRILL-2080:

             Summary: Add IO wait time stats for Parquet and Json input files
                 Key: DRILL-2080
                 URL: https://issues.apache.org/jira/browse/DRILL-2080
             Project: Apache Drill
          Issue Type: Improvement
          Components: Execution - Operators
            Reporter: Venki Korukanti
            Assignee: Venki Korukanti

Currently, the time spent waiting on IO reads is included in the total processing time. This
JIRA is to measure the IO read time separately and add it to OperatorStats.

Implementation details:
Add a FileSystem implementation called DrillFileSystem, which takes an existing FileSystem
instance and an OperatorStats instance. Whenever a file is opened through DrillFileSystem, it
returns an instance of DrillFSDataInputStream, which is a facade over the actual FSDataInputStream.
DrillFSDataInputStream records the IO read time in the stats whenever a read request is issued.
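
The facade idea above can be sketched roughly as follows. This is only an illustration
using plain java.io classes, not the code in the patch: IoWaitStats and TimedInputStream
are hypothetical names standing in for OperatorStats and DrillFSDataInputStream, and the
real classes wrap Hadoop's FileSystem and FSDataInputStream rather than a generic InputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the IO wait counters that OperatorStats would hold.
class IoWaitStats {
    private long waitNanos;
    private long readCalls;

    void addWait(long nanos) {
        waitNanos += nanos;
        readCalls++;
    }

    long waitNanos() { return waitNanos; }
    long readCalls() { return readCalls; }
}

// Facade over an existing stream: every read is delegated to the wrapped
// stream, and the elapsed time of the call is added to the stats.
class TimedInputStream extends FilterInputStream {
    private final IoWaitStats stats;

    TimedInputStream(InputStream in, IoWaitStats stats) {
        super(in);
        this.stats = stats;
    }

    @Override
    public int read() throws IOException {
        long start = System.nanoTime();
        try {
            return in.read();
        } finally {
            stats.addWait(System.nanoTime() - start);
        }
    }

    // FilterInputStream.read(byte[]) forwards here, so it is timed as well.
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        long start = System.nanoTime();
        try {
            return in.read(b, off, len);
        } finally {
            stats.addWait(System.nanoTime() - start);
        }
    }
}

// Small demo: an in-memory stream stands in for a file opened by a reader.
class IoWaitDemo {
    public static void main(String[] args) throws IOException {
        IoWaitStats stats = new IoWaitStats();
        try (InputStream in =
                 new TimedInputStream(new ByteArrayInputStream(new byte[1024]), stats)) {
            byte[] buf = new byte[256];
            while (in.read(buf) != -1) {
                // Discard the data; only the timing matters here.
            }
        }
        System.out.println("read calls: " + stats.readCalls()
            + ", wait ns: " + stats.waitNanos());
    }
}
```

Because the timing lives in the wrapper, a reader only needs to obtain its stream through
the wrapping file system; the reader code itself does not have to change how it reads.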

IO stats are collected only when DrillFileSystem is used. This patch modifies the JSON and
Parquet readers to use DrillFileSystem. The text reader is not included.

This message was sent by Atlassian JIRA
