accumulo-user mailing list archives

From "Kepner, Jeremy - 0553 - MITLL" <kep...@ll.mit.edu>
Subject Re: Common Big Data Architecture Writeup
Date Tue, 29 Apr 2014 01:12:14 GMT
D4M is two things:

(1) A set of software for doing analytics.
(2) A schema for ingesting and indexing diverse data into a NoSQL database like Accumulo

It hits two parts of the Common Big Data Architecture.
The CBDA is merely a restatement of the obvious components a system needs to be
effective at processing Big Data.
It can be implemented with a variety of technologies.

Regards.  -Jeremy


On Apr 28, 2014, at 9:08 PM, Chris Bennight <chris@slowcar.net> wrote:

> I'm not getting what exactly the "Common Big Data Architecture" is?   
> 
> Is it just a term that describes any system that has the 7 components Jeremy mentioned (fs, ingest, DB, analytics, web services, resource scheduler, elastic compute)? If so, what's the significance of naming this collection?
> 
> And how exactly is D4M related to this? (I understand it (D4M) hits a subset of those features, but don't think it encompasses all of those)
> 
> Apologies if these are obtuse questions; I just feel like I'm not comprehending what information is being conveyed.
> 
> 
> 
> 
> On Mon, Apr 28, 2014 at 8:51 PM, Jeremy Kepner <kepner@ll.mit.edu> wrote:
> No problem.  I am glad to start getting the definitions out there.
> Great work on the page.  I think it helps clarify things a lot.
> 
> On Mon, Apr 28, 2014 at 08:45:47PM -0400, David Medinets wrote:
> >    Sorry for my misunderstanding. I've updated the github project and moved
> >    it to [1]https://github.com/medined/D4M_Schema.
> >
> >    On Mon, Apr 28, 2014 at 5:36 PM, Jeremy Kepner <[2]kepner@ll.mit.edu>
> >    wrote:
> >
> >      David's well written example is illustrating the D4M Schema
> >      ([3]http://ieee-hpec.org/2013/index_htm_files/11-Kepner-D4Mschema-IEEE-HPEC.pdf).
> >
> >      The Common Big Data Architecture is a broad description that encompasses
> >      many big data systems and consists of 7 components: filesystem, ingest
> >      processes, databases, analytic processes, web services, resource
> >      scheduler, and elastic computing.  A reference will most likely appear
> >      in IEEE HPEC 2014.
> >
> >      Accumulo is the database of choice in many CBDA systems.
> >
> >      The D4M schema is used in many Accumulo systems.
> >
> >      On Mon, Apr 28, 2014 at 05:23:00PM -0400, David Medinets wrote:
> >      >    [1][4]https://github.com/medined/Common-Big-Data-Architecture - This project
> >      >    provides simple examples of the CBDA which is used by the D4M 2.0
> >      >    software.
> >      >
> >      > References
> >      >
> >      >    Visible links
> >      >    1. [5]https://github.com/medined/Common-Big-Data-Architecture
> >
> > References
> >
> >    Visible links
> >    1. https://github.com/medined/D4M_Schema
> >    2. mailto:kepner@ll.mit.edu
> >    3. http://ieee-hpec.org/2013/index_htm_files/11-Kepner-D4Mschema-IEEE-HPEC.pdf
> >    4. https://github.com/medined/Common-Big-Data-Architecture
> >    5. https://github.com/medined/Common-Big-Data-Architecture
> 

