accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Re: New research using Accumulo: Unified Secure On-/Off-line Analytics
Date Tue, 21 Oct 2014 14:54:09 GMT
The picture of the DSL Eclipse Integration looks nice. Looking forward
to seeing the Iterator stack and cell-level security handling.

On Tue, Oct 21, 2014 at 10:37 AM, Peter Coetzee <peter@coetzee.org> wrote:
> There's the skeleton of a website at http://go.warwick.ac.uk/crucible,
> although at present it's a thin pointer to the journal paper.
>
> I'm currently working on getting the code up to release standard and pushed
> through my employer's release process before I can put it up, and hopefully
> I'll get some more complete documentation and examples together to go with
> that. I'm sure you're familiar with the juggling act of splitting time
> between getting research to a technical level that's increasingly useful vs
> documenting and thus making it usable for others!
>
>
> Peter.
>
> On 21 October 2014 15:27, David Medinets <david.medinets@gmail.com> wrote:
>>
>> Thanks for letting us know about this research. Is there a website
>> exploring the DSL?
>>
>> On Mon, Oct 20, 2014 at 4:00 AM, Peter Coetzee <peter@coetzee.org> wrote:
>> > New open-access research published in the journal of Parallel Computing
>> > demonstrates a novel approach to engineering analytics for deployment in
>> > streaming and batch contexts.
>> >
>> > Increasing numbers of users are extracting real value from their data
>> > using tools like IBM InfoSphere Streams for near-real-time analysis and
>> > Apache Spark across their historical data in Accumulo.
>> >
>> > Until now, there hasn't been an approach which permits the use of these
>> > tools from a single shared codebase, with deployment considerations
>> > reserved until deployment time. Furthermore, it has been even harder to
>> > permit this unified analysis while maintaining cell-level traces of the
>> > security heritage for each datum an analytic produces.
>> >
>> > Some highlights of the paper include:
>> >   - A domain specific language (CRUCIBLE) and runtime models for on- and
>> >     off-line data analytics.
>> >   - Detailed analysis of CRUCIBLE’s runtime performance in
>> >     state-of-the-art environments.
>> >   - Development and detailed analysis of a set of runtime models for new
>> >     environments.
>> >   - Performance comparison with native implementations and discussion of
>> >     optimisation steps.
>> >   - Formulation of a primitive in the DSL that permits an analytic to be
>> >     run over multiple data sources.
>> >
>> > The paper, Towards Unified Secure On- and Off-line Analytics at Scale,
>> > is available free of charge from Elsevier:
>> >
>> > http://www.sciencedirect.com/science/article/pii/S0167819114000842
>> >
>> >
>> > I am one of the lead authors of the work, and would be more than happy
>> > to discuss any aspects which catch your attention!
>> >
>> > Peter
>> >
>> > --
>> > Peter Coetzee
>> > Performance Computing and Visualisation PhD Candidate
>> > Department of Computer Science
>> > University of Warwick
>
>
