accumulo-user mailing list archives

From Peter Coetzee <pe...@coetzee.org>
Subject Re: New research using Accumulo: Unified Secure On-/Off-line Analytics
Date Tue, 21 Oct 2014 14:37:38 GMT
There's the skeleton of a website at http://go.warwick.ac.uk/crucible,
although at present it's a thin pointer to the journal paper.

I'm currently working on getting the code up to release standard and pushed
through my employer's release process before I can put it up, and hopefully
I'll get some more complete documentation and examples together to go with
that. I'm sure you're familiar with the juggling act of splitting time
between getting research to a technical level that's increasingly useful vs
documenting and thus making it usable for others!


Peter.

On 21 October 2014 15:27, David Medinets <david.medinets@gmail.com> wrote:

> Thanks for letting us know about this research. Is there a website
> exploring the DSL?
>
> On Mon, Oct 20, 2014 at 4:00 AM, Peter Coetzee <peter@coetzee.org> wrote:
> > New open-access research published in the journal Parallel Computing
> > demonstrates a novel approach to engineering analytics for deployment in
> > streaming and batch contexts.
> >
> > Increasing numbers of users are extracting real value from their data
> > using tools like IBM InfoSphere Streams for near-real-time analysis and
> > Apache Spark across their historical data in Accumulo.
> >
> > Until now, there hasn't been an approach which permits the use of these
> > tools from a single shared codebase, with deployment considerations
> > reserved until deployment time. Furthermore, it has been even harder to
> > permit this unified analysis while maintaining cell-level traces of the
> > security heritage for each datum an analytic produces.
> >
> > Some highlights of the paper include:
> >   - A domain-specific language (CRUCIBLE) and runtime models for on- and
> > off-line data analytics.
> >   - Detailed analysis of CRUCIBLE’s runtime performance in
> > state-of-the-art environments.
> >   - Development and detailed analysis of a set of runtime models for new
> > environments.
> >   - Performance comparison with native implementations and discussion of
> > optimisation steps.
> >   - Formulation of a primitive in the DSL that permits an analytic to be
> > run over multiple data sources.
> >
> > The paper, Towards Unified Secure On- and Off-line Analytics at Scale, is
> > available free of charge from Elsevier:
> >
> > http://www.sciencedirect.com/science/article/pii/S0167819114000842
> >
> >
> > I am one of the lead authors of the work, and would be more than happy to
> > discuss any aspects which catch your attention!
> >
> > Peter
> >
> > --
> > Peter Coetzee
> > Performance Computing and Visualisation PhD Candidate
> > Department of Computer Science
> > University of Warwick
>
