drill-dev mailing list archives

From Jacques Nadeau <jacq...@dremio.com>
Subject Re: Drill Hangout (2015-08-04) minutes
Date Wed, 05 Aug 2015 02:18:22 GMT
Quick thought on insert isolation:

Let's just do a hidden directory and then rename.  We can make Drill avoid
reading hidden directories.  No fancy work required.
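
To make the idea concrete, here is a minimal sketch of the stage-then-rename pattern, written against a local filesystem. It is not Drill code: the function names, the `.staging_` and `batch_` prefixes, and the rule that a leading "." or "_" means hidden are all assumptions made for illustration. It also assumes a directory rename is atomic, which holds on a POSIX local filesystem but is not guaranteed on every storage system.

```python
import os
import uuid


def is_hidden(name):
    # Assumption: names starting with "." or "_" are treated as hidden.
    return name.startswith((".", "_"))


def insert_files(table_dir, files):
    """Stage files in a hidden directory, then publish them with one rename.

    `files` maps a file name to its bytes content.
    """
    batch = uuid.uuid4().hex
    staging = os.path.join(table_dir, ".staging_" + batch)  # hypothetical naming
    os.mkdir(staging)
    for name, data in files.items():
        with open(os.path.join(staging, name), "wb") as f:
            f.write(data)
    # Readers skip the staging directory until this rename makes it visible.
    final = os.path.join(table_dir, "batch_" + batch)
    os.rename(staging, final)
    return final


def visible_files(table_dir):
    """List data files, never descending into hidden directories."""
    result = []
    for root, dirs, names in os.walk(table_dir):
        # Prune hidden directories in place so os.walk does not enter them.
        dirs[:] = [d for d in dirs if not is_hidden(d)]
        result.extend(os.path.join(root, n) for n in names if not is_hidden(n))
    return sorted(result)
```

A reader that calls `visible_files` sees either none of a batch or all of it, which is the isolation property being asked for, without a lock manager.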

With regards to Dot Drill, let's not turn this into a mini database.  The
complexities would be overwhelming.  My recommendation is we constrain to
additional metadata that cannot otherwise be divined.  Beyond that, we
should use ephemeral files (similar to the Parquet metadata cache, where
deleting the file doesn't change the logical outcome but may affect planning
or performance).  I would avoid mixing ephemeral and persistent data around
the dot drill concept.
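
As an illustration of what "ephemeral" means here, the following sketch shows a cache file whose deletion changes only the cost of a lookup, never its result. It is not how Drill's Parquet metadata cache is implemented: the file name `.metadata_cache.json`, the function names, and the use of file sizes as the "metadata" are all hypothetical, and cache invalidation when data files change is deliberately left out.

```python
import json
import os

CACHE_NAME = ".metadata_cache.json"  # hypothetical name, hidden from readers


def scan_metadata(table_dir):
    """Slow path: derive metadata directly from the data files."""
    meta = {}
    for name in sorted(os.listdir(table_dir)):
        path = os.path.join(table_dir, name)
        if name.startswith(".") or not os.path.isfile(path):
            continue
        meta[name] = os.path.getsize(path)
    return meta


def load_metadata(table_dir):
    """Return (metadata, cache_hit). The cache is purely an optimization."""
    cache_path = os.path.join(table_dir, CACHE_NAME)
    try:
        with open(cache_path) as f:
            return json.load(f), True
    except (OSError, ValueError):
        pass  # missing or corrupt cache: fall back to scanning
    meta = scan_metadata(table_dir)
    try:
        with open(cache_path, "w") as f:
            json.dump(meta, f)
    except OSError:
        pass  # writing the cache is best effort
    return meta, False
```

Anything that would not survive this delete-and-rebuild test is persistent metadata and belongs in a different place from the cache.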

In general, if we want to store Drill's internal ephemeral metadata, let's
have a discussion around the options.  Also remember that not all Drill
installations will use a distributed filesystem.  As such, we need to think
about these abstractions to support multiple types of storage systems.

Jacques Nadeau
CTO and Co-Founder, Dremio

On Tue, Aug 4, 2015 at 2:24 PM, Khurram Faraaz <kfaraaz@maprtech.com> wrote:

> Drill Hangout 2015-08-04
> Attendees: Daniel, Khurram, Neeraja, Vicky, Kris, Aman, Parth, Andries,
> Jinfeng, Anas
> - Insert and drop
> - read isolation during insert into, Aman suggested snapshot level
> - have to have some kind of lock manager
> - locking on the dot drill file
> - should this locking extend to other external programs working with the
> data used by Drill?
>     - Jason - why is the lock necessary?
>     - we want to merge schemas in a dot drill file, to avoid gathering
> schemas from a lot of separate files
> - insert feature will be broken into phases
>     - this needs to handle schema changes to be consistent with the rest of
> Drill
> - partition pruning is not working for some expressions
> - we will only fix for cast
> - Jinfeng thinks this should be easy enough
> - handling unknown types in parquet or other external systems
> - should we fail actively, or should we give data back in varbinary
> - should people have to wait for a release to handle new data types
> - storage plugin writers should have a clear idea about how to handle these
> cases
> - Jason will send a message to the list about this
> - test framework
> - Rahul is working on publishing it to a public repository
> - this will include instructions on how to set up the tests on your own
> hardware
