drill-issues mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3423) Add New HTTPD format plugin
Date Fri, 31 Jul 2015 16:01:05 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649404#comment-14649404 ]

Jacques Nadeau commented on DRILL-3423:

Q1:  I should provide better comments in the code.  Vector memory allocations are rounded up
to powers of 2.  VarChar uses n+1 slots when allocating data.  As such, if we make batches 4095
in size, then varchar allocations will be 4096 in size and we will have minimal wastage due to
power-of-2 rounding.  If we chose 4096, then varchar allocations would be 4097 and thus the
underlying memory allocation would be 8192, with virtually half of that wasted.
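
To make the arithmetic concrete, here is a minimal sketch in Java.  It is not Drill's
allocator code: the class and method names (BatchSizing, roundUpToPowerOfTwo, varCharSlots)
are hypothetical, and it counts slots rather than bytes on the assumption that the
power-of-2 rounding behaves the same way at either granularity.

```java
// Illustrative only: models "n+1 slots, rounded up to a power of 2".
// All names here are hypothetical and are not part of Drill's API.
final class BatchSizing {

    // Smallest power of two >= n, for 1 <= n <= 2^30.
    static int roundUpToPowerOfTwo(int n) {
        if (n <= 1) {
            return 1;
        }
        // highestOneBit(n - 1) is the largest power of two strictly below n,
        // unless n is itself a power of two, in which case doubling it yields n.
        return Integer.highestOneBit(n - 1) << 1;
    }

    // VarChar needs one more slot than the number of records in the batch.
    static int varCharSlots(int batchSize) {
        return batchSize + 1;
    }

    // Slots allocated but never used for a given batch size.
    static int wastedSlots(int batchSize) {
        int needed = varCharSlots(batchSize);
        return roundUpToPowerOfTwo(needed) - needed;
    }

    public static void main(String[] args) {
        for (int batchSize : new int[] {4095, 4096}) {
            int needed = varCharSlots(batchSize);
            System.out.printf("batch=%d needed=%d allocated=%d wasted=%d%n",
                    batchSize, needed, roundUpToPowerOfTwo(needed), wastedSlots(batchSize));
        }
    }
}
```

With a batch of 4095 the request lands exactly on a power of 2; with 4096 it spills one slot
over and the allocation doubles.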

Q2: My plan was actually to write a blog post around this plugin so people could use it as
a model.  (That is one of the reasons I kept it in a single file.)  I wanted to get something
up for feedback, but I will be working on adding javadocs to clarify things.

Q3: Good point.  We should implement a new FormatMatcher for access logs that recognizes this
pattern.  Can you provide a couple of examples and maybe propose a format matching algorithm?

> Add New HTTPD format plugin
> ---------------------------
>                 Key: DRILL-3423
>                 URL: https://issues.apache.org/jira/browse/DRILL-3423
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Storage - Other
>            Reporter: Jacques Nadeau
>            Assignee: Jacques Nadeau
>             Fix For: 1.2.0
> Add an HTTPD logparser based format plugin.  The author has been kind enough to move
> the logparser project to be released under the Apache License.  You can find it here:
> <dependency>
>     <groupId>nl.basjes.parse.httpdlog</groupId>
>     <artifactId>httpdlog-parser</artifactId>
>     <version>2.0</version>
> </dependency>
