accumulo-user mailing list archives

From Peter Coetzee <pe...@coetzee.org>
Subject Re: New research using Accumulo: Unified Secure On-/Off-line Analytics
Date Thu, 30 Oct 2014 16:47:29 GMT
David, All,

I've managed to get together a version of the code I can publish. There's a
basic quickstart guide to getting it installed and running in Eclipse at
http://www2.warwick.ac.uk/fac/sci/dcs/people/research/csrmab/crucible/quickstart,
as well as links to the source and binaries. I've not had much opportunity
to test this outside of my own environment, so all the usual "untested
research code" caveats apply. Hopefully it'll be a useful illustration of
the approach and concepts put forward in the paper, at least!

All the best,
Peter


On 21 October 2014 15:54, David Medinets <david.medinets@gmail.com> wrote:

> The picture of the DSL Eclipse Integration looks nice. Looking forward
> to seeing the Iterator stack and cell-level security handling.
>
> On Tue, Oct 21, 2014 at 10:37 AM, Peter Coetzee <peter@coetzee.org> wrote:
> > There's the skeleton of a website at http://go.warwick.ac.uk/crucible,
> > although at present it's a thin pointer to the journal paper.
> >
> > I'm currently working on getting the code up to release standard and pushed
> > through my employer's release process before I can put it up, and hopefully
> > I'll get some more complete documentation and examples together to go with
> > that. I'm sure you're familiar with the juggling act of splitting time
> > between getting research to a technical level that's increasingly useful vs
> > documenting and thus making it usable for others!
> >
> >
> > Peter.
> >
> > On 21 October 2014 15:27, David Medinets <david.medinets@gmail.com> wrote:
> >>
> >> Thanks for letting us know about this research. Is there a website
> >> exploring the DSL?
> >>
> >> On Mon, Oct 20, 2014 at 4:00 AM, Peter Coetzee <peter@coetzee.org> wrote:
> >> > New open-access research published in the journal of Parallel Computing
> >> > demonstrates a novel approach to engineering analytics for deployment in
> >> > streaming and batch contexts.
> >> >
> >> > Increasing numbers of users are extracting real value from their data using
> >> > tools like IBM InfoSphere Streams for near-real-time analysis and Apache
> >> > Spark across their historical data in Accumulo.
> >> >
> >> > Until now, there hasn't been an approach which permits the use of these
> >> > tools from a single shared codebase, with deployment considerations reserved
> >> > until deployment time. Furthermore, it has been even harder to permit this
> >> > unified analysis while maintaining cell-level traces of the security
> >> > heritage for each datum an analytic produces.
> >> >
> >> > Some highlights of the paper include:
> >> >   - A domain specific language (CRUCIBLE) and runtime models for on- and
> >> > off-line data analytics.
> >> >   - Detailed analysis of CRUCIBLE’s runtime performance in state-of-the-art
> >> > environments.
> >> >   - Development and detailed analysis of a set of runtime models for new
> >> > environments.
> >> >   - Performance comparison with native implementations and discussion of
> >> > optimisation steps.
> >> >   - Formulation of a primitive in the DSL that permits an analytic to be run
> >> > over multiple data sources.
> >> >
> >> > The paper, Towards Unified Secure On- and Off-line Analytics at Scale, is
> >> > available free of charge from Elsevier:
> >> >
> >> > http://www.sciencedirect.com/science/article/pii/S0167819114000842
> >> >
> >> >
> >> > I am one of the lead authors of the work, and would be more than happy to
> >> > discuss any aspects which catch your attention!
> >> >
> >> > Peter
> >> >
> >> > --
> >> > Peter Coetzee
> >> > Performance Computing and Visualisation PhD Candidate
> >> > Department of Computer Science
> >> > University of Warwick
> >
> >
>
