drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5432) Want a memory format for PCAP files
Date Sat, 01 Jul 2017 02:19:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070954#comment-16070954 ]

ASF GitHub Bot commented on DRILL-5432:

GitHub user tdunning commented on the issue:

    On Fri, Jun 30, 2017 at 5:34 PM, Paul Rogers <notifications@github.com> wrote:
    > That only works if Drill has an autoloading capability that allows storage
    > formats to be loaded and authenticated easily.
    > Why? To get pcap in 1.11, one must install a new Drill version, which
    > requires a restart. Assuming that pcap were a separate project, you'd
    > install a new jar, and restart. Neither provides auto loading.
    No. But if I have 1.11 already and decide that I want pcap (or any similar
    data format), it would be nice if I could do the equivalent of pip (for
    python) or install.packages("...") (for R) or mvn test (for Java) and get
    whatever cool capability I like.
    This may be described as "just install a new jar" but the convenience level
    is a proven Big Deal (tm). It would even be possible to leverage Maven
    central to make it happen, but there needs to be convenience sugar around
    the process to make it consumable.
    > Not clear on the authentication issue. If a plugin were a separate
    > project, wouldn't that project have a way of certifying its jars?
    Dunno. Not some random project.
    If Drill requires it, then yes.  And I am saying that Drill should require
    resolution back to some level of trust.

> Want a memory format for PCAP files
> -----------------------------------
>                 Key: DRILL-5432
>                 URL: https://issues.apache.org/jira/browse/DRILL-5432
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Ted Dunning
> PCAP files [1] are the de facto standard for storing network capture data. In
> security and protocol applications, it is very common to want to extract
> particular packets from a capture for further analysis.
> At a first level, it is desirable to query and filter by source and destination
> IP and port or by protocol. Beyond that, however, it would be very useful to be
> able to group packets by TCP session and eventually to look at packet contents.
> For now, however, the most critical requirement is that we should be able to
> scan captures at very high speed.
> I previously wrote a (kind of working) proof of concept for a PCAP decoder that
> did lazy deserialization and could traverse hundreds of MB of PCAP data per
> second per core. This compares to roughly 2-3 MB/s for widely available
> Apache-compatible open source PCAP decoders.
> This JIRA covers the integration and extension of that proof of concept as a Drill file
> format.
> Initial work is available at https://github.com/mapr-demos/drill-pcap-format
> [1] https://en.wikipedia.org/wiki/Pcap
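
The "lazy deserialization" idea mentioned in the description can be sketched as follows. This is only an illustration and not the code from the linked proof of concept: the names `LazyPacket` and `scan` are hypothetical, and the sketch assumes the classic pcap layout (24-byte global header, 16-byte per-record header, microsecond timestamps) rather than pcapng. The scanner reads only the captured-length field needed to hop to the next record; timestamps and payloads are decoded only when a caller asks for them.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4   # classic pcap with microsecond timestamps
GLOBAL_HEADER_LEN = 24    # magic, version, thiszone, sigfigs, snaplen, network
RECORD_HEADER_LEN = 16    # ts_sec, ts_usec, incl_len, orig_len


class LazyPacket:
    """Hypothetical view onto one record; fields are decoded on access."""

    def __init__(self, buf, offset, endian):
        self._buf = buf
        self._offset = offset
        self._endian = endian

    @property
    def timestamp(self):
        # Decode seconds and microseconds only when requested.
        sec, usec = struct.unpack_from(self._endian + "II", self._buf, self._offset)
        return sec + usec / 1_000_000

    @property
    def captured_length(self):
        return struct.unpack_from(self._endian + "I", self._buf, self._offset + 8)[0]

    @property
    def payload(self):
        start = self._offset + RECORD_HEADER_LEN
        return self._buf[start:start + self.captured_length]


def scan(buf):
    """Yield a LazyPacket per record, touching only the length field."""
    if struct.unpack_from("<I", buf, 0)[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack_from(">I", buf, 0)[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap capture")

    offset = GLOBAL_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(buf):
        incl_len = struct.unpack_from(endian + "I", buf, offset + 8)[0]
        yield LazyPacket(buf, offset, endian)
        offset += RECORD_HEADER_LEN + incl_len
```

A filter that needs only timestamps, for example, would never pay for slicing payloads, which is the kind of saving that makes a fast scan plausible. The throughput figures quoted above come from the original proof of concept, not from this sketch.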
