polygene-dev mailing list archives

From Niclas Hedhman <nic...@hedhman.org>
Subject Re: Apache Zest / Apache Cassandra
Date Sun, 21 Feb 2016 00:52:07 GMT
About Cassandra...

I think the only reason was that with CQL no one took the time to refactor
the code, perhaps because some conceptual changes were introduced.
But it could also have been that there was no true test suite, so it failed
the Release Criteria, and that fixing "run embedded during test" with the same
client as in production code may have been non-trivial (not sure). You have
old code in the sandbox.

Useful link:
http://prettyprint.me/prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/index.html
But I am uncertain if it is still relevant.

Sandbox:
https://github.com/apache/zest-sandbox/tree/master/extensions/entitystore-cassandra

Niclas

On Sun, Feb 21, 2016 at 3:38 AM, Jiri Jetmar <juergen.jetmar@gmail.com>
wrote:

> Hi guys,
>
> what is the status of the Apache Cassandra Entity Store? I seem to
> remember that Cassandra was supported, but I cannot
> find it in the current development branch.
>
> The reason I'm asking is that Cassandra works well with the analytical
> Apache Spark stack.
>
> Assume a scenario where you have e.g. the following Domain Models:
>
> - Products
> - Orders
> - Users
>
> Each Domain has its own API, Use Cases and State that is stored in the
> DM. Now you have e.g. a Webshop UI on top of the
> above Domains.
>
> Now you want to answer questions like: What kind of Users are buying
> Product X? Or, find those Users that are most likely to buy
> Product X in the next Y days.
>
> Answering those questions is typically a "Data Analytics" challenge,
> using algorithms like PCA, Random Forest, Regression, XGBoost, etc.
> All of this can surely be done in Java, but my impression is that the Python
> community has built an amazing set of tools and environments over the last
> few years.
>
> Also, a "Data Scientist" has to try out different things until a good and
> robust prediction is reached. So the workflow is interactive, and this is
> where Apache Spark offers
> great tools, including the use of IPython/Jupyter Notebooks. Another
> benefit is that one does not need to kick off any ETL jobs to transfer the
> transactional data from the Domain Models to the analytical world -
> Cassandra does this already. So one can do all the analysis on a realtime
> snapshot
> without influencing the transactional processing.
>
> Thank you.
>
> Cheers,
> Jiri
>



-- 
Niclas Hedhman, Software Developer
http://zest.apache.org - New Energy for Java
