polygene-dev mailing list archives

From Jiri Jetmar <juergen.jet...@gmail.com>
Subject Re: Apache Zest / Apache Cassandra
Date Sun, 21 Feb 2016 10:48:34 GMT
Yes, the OLAP world with things like "star schema", ETL jobs, etc. is far
too heavyweight. That is why I see Apache Spark as heading in the right
direction: it provides easy access to data analysis tools.

2016-02-21 1:54 GMT+01:00 Niclas Hedhman <niclas@hedhman.org>:

> On analytics; I have never enjoyed the OLAP world, and I take your word for
> it.
>
> Cheers
> Niclas
>
> On Sun, Feb 21, 2016 at 3:38 AM, Jiri Jetmar <juergen.jetmar@gmail.com>
> wrote:
>
> > Hi guys,
> >
> > what is the status of the Apache Cassandra Entity Store? I seem to
> > remember that Cassandra was supported, but I cannot
> > find it in the current development branch.
> >
> > The reason I'm asking is that Cassandra works well with the analytical
> > Apache Spark stack.
> >
> > Assume a scenario where you have, e.g., the following Domain Models:
> >
> > - Products
> > - Orders
> > - Users
> >
> > Each Domain has its own API, use cases, and state that is stored in the
> > DM. Now you have, e.g., a Webshop UI on top of the
> > above Domains.
> >
> > Now you want to answer questions like: what kind of Users are buying
> > Product X? Or: find those Users who are most likely to buy
> > Product X in the next Y days.
> >
> > Answering those questions is typically a "Data Analytics" challenge,
> > using algorithms like PCA, Random Forest, regression, XGBoost, etc.
> > All of this can surely be done in Java, but my impression is that the
> > Python community has built an amazing set of tools and environments
> > over the last few years.
> >
> > Also, a "Data Scientist" has to try out different things until a good
> > and robust prediction is found. So the workflow is interactive, and this
> > is where Apache Spark offers great tools, including the use of
> > IPython/Jupyter Notebooks. Another benefit is that one does not need to
> > kick off any ETL jobs to transfer the transactional data from the Domain
> > Models to the analytical world - Cassandra does this already. So one can
> > do all the analysis on a realtime snapshot without influencing the
> > transactional processing.
> >
> > Thank you.
> >
> > Cheers,
> > Jiri
> >
>
>
>
> --
> Niclas Hedhman, Software Developer
> http://zest.apache.org - New Energy for Java
>
