drill-dev mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3572) Provide a simple interface to append metadata to files and directories
Date Tue, 28 Jul 2015 23:00:06 GMT
Jacques Nadeau created DRILL-3572:

             Summary: Provide a simple interface to append metadata to files and directories
                 Key: DRILL-3572
                 URL: https://issues.apache.org/jira/browse/DRILL-3572
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Other
            Reporter: Jacques Nadeau
            Assignee: Jacques Nadeau
             Fix For: 1.3.0

We need a way to store small amounts of metadata about a file or a collection of files.  The
current thinking is to have a "dot drill file" that ascribes metadata to a particular

Initial example file might be something that includes the following:

  // Drill version identifier
  version: "dd1"
  // Format Plugin Configuration
  format: {
    type: "httpd",
    format: "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\""
  }
  // Traits of underlying data (a.k.a. physical properties)
  traits: [
    {type: "sort_nulls_first", columns: ["request.uri", "client.host"]},
    {type: "unique", columns: ["abc"]},
    {type: "unique", columns: ["xy", "zz"]}
  ]
  // Mappings between directory names and exposed columns
  dirs: [
    {skip: true}, // don't include this directory name in the directory path.
    {name: "year", type: "integer"},
    {name: "month", type: "integer"},
    {name: "day", type: "integer"}
  ]
  // whether or not a user can add new columns to the table through insert
  rigid_table: true
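
To illustrate how the `dirs` mapping in the example might be interpreted, here is a minimal Python sketch (not Drill code). It assumes each `dirs` entry lines up positionally with one directory level under the table root, that `skip: true` drops that level, and that `integer` means the directory name is parsed as an integer. The function name `directory_columns`, the `CONVERTERS` table, and the sample path are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical in-memory form of the `dirs` entries from the example file.
DIRS: List[Dict[str, Any]] = [
    {"skip": True},
    {"name": "year", "type": "integer"},
    {"name": "month", "type": "integer"},
    {"name": "day", "type": "integer"},
]

# Assumed type converters; only "integer" appears in the example file.
CONVERTERS = {"integer": int, "string": str}


def directory_columns(relative_path: str, dirs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Map the directory components of relative_path to named columns."""
    parts = [p for p in relative_path.split("/") if p]
    directories = parts[:-1]  # the last component is the file name
    columns: Dict[str, Any] = {}
    # Pair each directory level with its positional entry in `dirs`.
    for spec, value in zip(dirs, directories):
        if spec.get("skip"):
            continue  # this level is not exposed as a column
        convert = CONVERTERS[spec.get("type", "string")]
        columns[spec["name"]] = convert(value)
    return columns
```

Under these assumptions, `directory_columns("logs/2015/07/28/access.log", DIRS)` would expose `year`, `month`, and `day` as integer columns and ignore the leading `logs` directory.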


We also need to support adding machine-generated and machine-managed data such as statistics.
That should be stored in a file separate from the human-authored description.
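
One way to picture the separation is the Python sketch below, which keeps the human-authored description and the machine-managed statistics in two files so that refreshing statistics never rewrites what a person typed. Everything concrete here is an assumption for illustration: the file names `.drill` and `.drill.stats`, the use of JSON as a stand-in for the actual file syntax, and the helper names.

```python
import json
from pathlib import Path

# Hypothetical file names; the issue only refers to a "dot drill file".
HUMAN_FILE = ".drill"
STATS_FILE = ".drill.stats"


def write_stats(directory: Path, stats: dict) -> None:
    """Overwrite only the machine-managed file; the human file is untouched."""
    (directory / STATS_FILE).write_text(json.dumps(stats, indent=2))


def load_metadata(directory: Path) -> dict:
    """Read both files and return them side by side, never merged."""
    result = {"description": {}, "stats": {}}
    human = directory / HUMAN_FILE
    stats = directory / STATS_FILE
    if human.exists():
        result["description"] = json.loads(human.read_text())
    if stats.exists():
        result["stats"] = json.loads(stats.read_text())
    return result
```

Keeping the two documents under separate keys means a reader can always tell which values a person wrote and which a tool generated.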

A user should be able to ascribe this metadata directly through the file system as well as
through SQL commands such as

This message was sent by Atlassian JIRA
