accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Josh Elser <josh.el...@gmail.com>
Subject Re: Apache Accumulo integrated with Presto
Date Mon, 13 Jun 2016 17:03:54 GMT
Interesting! Yeah, I'm not sure anymore (assuming I even knew at one 
time). I think they've moved away from it entirely and the code wasn't 
open-sourced, so it may just be lost to the ages :)

Either way, thanks for the extra details!

Adam J. Shook wrote:
> Maybe.  I'd be interested in what they'd done to get the big gains.  My
> approach was, for a full table scan (vs. using the index), the connector
> creates one Presto split for each tablet.  The Accumulo metadata table is
> scanned to get the tablet location, and the host hint for the Presto split
> is set to where the tablet is located.  Using this approach, the query
> times were nearly identical whether the workers were co-located with tablet
> servers or not.
>
> On Mon, Jun 13, 2016 at 11:55 AM, Josh Elser<josh.elser@gmail.com>  wrote:
>
>>
>> Adam J. Shook wrote:
>>
>>> A few clarifications:
>>>
>>> - Presto supports hash-based distributed joins as well as broadcast joins
>>>
>>> - Presto metadata is stored in ZooKeeper, but metadata storage is
>>> pluggable
>>> and could be stored in Accumulo instead
>>>
>>> - The connector does use tablet locality when scanning Accumulo, but our
>>> testing has shown you get better performance by giving Accumulo and Presto
>>> their own dedicated machines, making locality a moot point.  This will
>>> certainly change based on types of queries, data sizes, network quality,
>>> etc.
>>>
>> I'm a little confused by this. Co-locating presto workers and tservers
>> doesn't necessarily mean that you're going to get local reads/writes at the
>> Accumulo layer. I remember this is something that the Argyle Data folks had
>> found was a big gain for them when they were doing Presto+Accumulo. Maybe
>> your findings were more based on your specific circumstances?
>>
>>
>> - You can insert the results of a query into a Presto table using INSERT
>>> INTO foo SELECT ..., as well as create a table from the results of a query
>>> (CTAS).  Though, for large inserts, it is typically best to bypass the
>>> Presto layer and insert directly into the Accumulo tables using the
>>> PrestoBatchWriter API
>>>
>>> Cheers,
>>> --Adam
>>>
>>> On Mon, Jun 13, 2016 at 7:20 AM, Christopher<ctubbsii@apache.org>   wrote:
>>>
>>> Thanks for that summary, Dylan! Very helpful.
>>>> On Mon, Jun 13, 2016, 01:36 Dylan Hutchison<dhutchis@cs.washington.edu>
>>>> wrote:
>>>>
>>>> Thanks for sharing Sean.  Here are some notes I wrote after reading the
>>>>> article on Presto-Accumulo design.  I have a research interest in the
>>>>> relationship between relational (SQL) and non-relational (Accumulo)
>>>>> systems, so I couldn't resist reading the post in detail.
>>>>>
>>>>>      - Places the primary key in the Accumulo row.
>>>>>      - Performs row-at-a-time processing (each tuple is one row in
>>>>>      Accumulo) using WholeRowIterator behavior.
>>>>>      - Relational table metadata is stored in the Presto infrastructure
>>>>> (as
>>>>>      opposed to an Accumulo table).
>>>>>      - Supports the creation of index tables for any attributes. These
>>>>>      index tables speed up queries that filter on indexed attributes.
 It
>>>>>
>>>> is
>>>>
>>>>>      standard secondary indexing, which provides speedups when the
>>>>>
>>>> selectivity
>>>>
>>>>>      of the query is roughly<10% of the original table.
>>>>>      - Only database->client querying is supported.  You cannot run
>>>>> "select
>>>>>      ... into result_table".
>>>>>      - As far as I can see, Presto only has one join strategy: *broadcast
>>>>>      join*.  The right table of every join is scanned into one of the
>>>>>      Presto worker's memory.  Subsequently the size of the right table
is
>>>>>      limited by worker memory.
>>>>>      - There is one Presto worker for each Accumulo tablet, which enables
>>>>>      good scaling.
>>>>>      - The Presto bridge classes track internal Accumulo information
such
>>>>>      as the assignment of tablets to tablet servers by reading Accumulo's
>>>>>      Metadata table. Presto uses tablet locations to provide better
>>>>>
>>>> locality.
>>>>
>>>>>      - The Presto bridge comes with several Accumulo server-side
>>>>> iterators
>>>>>      for filtering and aggregating.
>>>>>      - The code is quite nice and clean.
>>>>>
>>>>> This image below gives Presto's architecture.  Accumulo takes the role
>>>>> of
>>>>> the DB icon in the bottom-right corner.
>>>>>
>>>>> [image: Inline image 2]
>>>>>
>>>>> Bloomberg ran 13 out of the 22 TPC-H queries.  There is no fundamental
>>>>> reason why they cannot run all the queries; they just have not
>>>>>
>>>> implemented
>>>>
>>>>> everything required ('exists' clauses, non-equi join, etc.).
>>>>>
>>>>> The interface looks like this, though they use a compiled java jar to
>>>>> insert entries from a csv file (it wraps around a BatchWriter).
>>>>>
>>>>> [image: Inline image 3]
>>>>>
>>>>> Here are performance results.  They don't say what hardware or data
>>>>> sizes
>>>>> they use.  Whatever it is, they must have the ability to fit the smaller
>>>>> table of any join into memory as a result of Presto's broadcast join
>>>>> strategy.  The strong scaling looks very nice.
>>>>>
>>>>> [image: Inline image 4]
>>>>>
>>>>> They have one other plot that shows how secondary indexing speeds up
>>>>> some
>>>>> queries with low selectivity.
>>>>>
>>>>> Cheers, Dylan
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Jun 12, 2016 at 7:06 PM, Sean Busbey<busbey@cloudera.com>
>>>>>
>>>> wrote:
>>>>
>>>>> Bloomberg have a post about a connector they made to query Accumulo from
>>>>>> Presto:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>> http://www.bloomberg.com/company/announcements/open-source-at-bloomberg-reducing-application-development-time-via-presto-accumulo/
>>>>
>>>>> --
>>>>>> Sean Busbey
>>>>>>
>>>>>>
>

Mime
View raw message