drill-issues mailing list archives

From "Victoria Markman (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3827) Empty metadata file causes queries on the table to fail
Date Wed, 23 Sep 2015 14:55:05 GMT
Victoria Markman created DRILL-3827:
---------------------------------------

             Summary: Empty metadata file causes queries on the table to fail
                 Key: DRILL-3827
                 URL: https://issues.apache.org/jira/browse/DRILL-3827
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.2.0
            Reporter: Victoria Markman
            Assignee: Jinfeng Ni
            Priority: Critical


I ran into a situation where Drill created an empty metadata file (which is a separate issue
that I will try to narrow down; the suspicion is that this happens when "refresh table metadata
x" fails with a "permission denied" error).

However, we need to guard against the situation where the metadata file is empty or corrupted.
We should probably skip reading it if we encounter an unexpected result and continue with query
planning without that information, in the same fashion as a partition pruning failure. It's
also important to log this information somewhere, with drillbit.log as a start. It would be really
nice to have a flag in the query profile that tells the user whether the metadata file was used
for planning. That would help in debugging performance issues.
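
To illustrate the kind of guard I have in mind, here is a minimal sketch. It is not Drill's
actual code: the class MetadataGuard, the method readIfUsable and the parser argument are all
hypothetical names, and it uses only the JDK (java.nio.file and java.util.logging) rather than
Drill's file system and logging layers. The idea is that an empty, missing, or unparseable
metadata file yields "no metadata" plus a log line, instead of an exception that fails the query.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.function.Function;
import java.util.logging.Logger;

// Hypothetical helper for illustration only; not part of Apache Drill.
class MetadataGuard {
    private static final Logger LOG = Logger.getLogger(MetadataGuard.class.getName());

    /**
     * Returns the parsed metadata, or Optional.empty() when the file is
     * missing, zero-length, unreadable, or rejected by the parser.
     * The caller is expected to plan without metadata in the empty case.
     */
    static <T> Optional<T> readIfUsable(Path file, Function<String, T> parser) {
        try {
            // A zero-length file is the case reported in this issue.
            if (!Files.isRegularFile(file) || Files.size(file) == 0) {
                LOG.warning("Metadata file missing or empty, planning without it: " + file);
                return Optional.empty();
            }
            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            // The parser stands in for whatever deserializer is really used;
            // any runtime failure it raises is treated as a corrupted file.
            return Optional.ofNullable(parser.apply(content));
        } catch (IOException | RuntimeException e) {
            LOG.warning("Metadata file unusable, planning without it: " + file + " (" + e + ")");
            return Optional.empty();
        }
    }
}
```

A caller would check isPresent() on the result and fall back to the regular planning path when
it is false. That same boolean is what a "metadata file used" flag in the query profile could
report.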

A very confusing exception is thrown if there is a zero-length metadata file in the directory:
{code}
[Wed Sep 23 07:45:28] # ls -la
total 2
drwxr-xr-x  2 root root   2 Sep 10 14:55 .
drwxr-xr-x 16 root root  35 Sep 15 12:54 ..
-rwxr-xr-x  1 root root 483 Jul  1 11:29 0_0_0.parquet
-rwxr-xr-x  1 root root   0 Sep 10 14:55 .drill.parquet_metadata

0: jdbc:drill:schema=dfs> select * from t1;
Error: SYSTEM ERROR: JsonMappingException: No content to map due to end-of-input
 at [Source: com.mapr.fs.MapRFsDataInputStream@342bd88d; line: 1, column: 1]
[Error Id: c97574f6-b3e8-4183-8557-c30df6ca675f on atsqa4-133.qa.lab:31010] (state=,code=0)
{code}

The workaround is trivial: remove the file. I am marking this as critical, since we don't have any
concurrency control in place and this file can get corrupted as well.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
