cassandra-user mailing list archives

From Jan <>
Subject Re: Fixtures / CI docker
Date Mon, 26 Jan 2015 19:34:04 GMT
Hi Alain,
The requirements as stated cannot both be met by a single data set: you are expected to have
predictable and deterministic tests, while you also need "recent data" (max 1 week old data).
Reason: you cannot have a reproducible result set when the data varies on a weekly basis.
To obtain a reproducible test result, I recommend the following:
a) Pin the 'data' expectation to a point in time at which the data is a known quantity.
b) Load some data into your cluster & take a snapshot. Reload this snapshot before every
test for consistent results.
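The snapshot-and-reload idea in (b) could be scripted roughly as below. This is a minimal sketch under assumptions, not a tested procedure: the keyspace, table, and tag names are hypothetical, and the helpers only build the `nodetool` command lines (`snapshot -t <tag> <keyspace>` to capture, `refresh <keyspace> <table>` to load SSTables that have been copied back into the table's data directory). Emptying the table and copying the snapshot files back is left to the CI harness.

```python
def snapshot_command(keyspace, tag):
    """Build the nodetool call that captures a named snapshot of a keyspace."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def refresh_command(keyspace, table):
    """Build the nodetool call that loads SSTables placed in the table's data directory."""
    return ["nodetool", "refresh", keyspace, table]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(" ".join(snapshot_command("ci_keyspace", "fixture_v1")))
    print(" ".join(refresh_command("ci_keyspace", "events")))
```

In a CI run, these argument lists could be passed to `subprocess.run` inside the container before each test suite.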
hope this helps. 
Jan/C* Architect 

     On Monday, January 26, 2015 10:43 AM, Eric Stevens <> wrote:

 I don't have directly relevant advice, especially WRT getting a meaningful and coherent subset
of your production data - that's probably too closely coupled with your business logic. 
Perhaps you can run a testing cluster with a default TTL on all your tables of ~2 weeks, feeding
it with real production data so that you have a rolling current snapshot of production.
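As an illustration of the default-TTL idea, the sketch below builds the CQL statement that sets a table-level default TTL. It assumes the `default_time_to_live` table option (available, as far as I know, from Cassandra 2.0 onward); the keyspace and table names are hypothetical, and the statement would still have to be executed through cqlsh or a driver.

```python
from datetime import timedelta


def default_ttl_statement(keyspace, table, ttl):
    """Return a CQL statement that sets a table's default TTL, in seconds."""
    seconds = int(ttl.total_seconds())
    return "ALTER TABLE {}.{} WITH default_time_to_live = {};".format(
        keyspace, table, seconds
    )


if __name__ == "__main__":
    # Hypothetical keyspace and table; roughly two weeks of retention.
    print(default_ttl_statement("test_ks", "events", timedelta(weeks=2)))
```

With a two-week TTL the generated value is 1209600 seconds, so rows written by the production feed expire on their own and the test cluster keeps a rolling window.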
We use this basic strategy to support integration tests with the rest of our platform.  We
have a data access service with other internal teams acting as customers of that data.  But
it's hard to write strong tests against this, because it becomes challenging to predict the
values which you should expect to get back without rewriting the business logic directly into
your tests (and then what exactly are you testing, are you testing your tests?)
But our data interaction layer tests all focus on inserting the data under test immediately
before the assertions portion of the given test.  We use Specs2 as a testing framework, and
that gives us access to a very nice "eventually { ... }" syntax which will retry the assertions
portion several times with a backoff (so that we can account for the eventually consistent
nature of Cassandra, and reduce the number of false failures without resorting to operations
that slow down test execution, such as sleeping before asserting).
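For readers not using Specs2, the retry-with-backoff idea can be sketched in a few lines. This is not Specs2's implementation, only a hypothetical Python helper with made-up parameter names and defaults: it re-runs an assertion block, sleeping a little longer after each failure, and re-raises the last failure once the attempts are used up.

```python
import time


def eventually(assertion, retries=5, initial_delay=0.05, backoff=2.0):
    """Run `assertion` until it stops raising AssertionError, or retries run out."""
    delay = initial_delay
    for attempt in range(retries):
        try:
            return assertion()
        except AssertionError:
            if attempt == retries - 1:
                raise  # out of attempts: surface the real failure
            time.sleep(delay)
            delay *= backoff  # wait longer before the next attempt


if __name__ == "__main__":
    reads = []

    def row_is_visible():
        # Stand-in for a read that only succeeds once replicas have converged.
        reads.append(1)
        assert len(reads) >= 3, "row not visible yet"
        return "visible"

    print(eventually(row_is_visible))
```

A test that passes on the first attempt pays no sleep cost at all, which is the advantage over an unconditional sleep before the assertion.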
Basically our data access layer unit tests are strong and rely only on synthetic data (assert
that the response is exact for every value), while integration tests from other systems use
much softer tests against real data (more like: is there data, and does that data seem to be
in the right format and for the right time range).
On Mon, Jan 26, 2015 at 3:26 AM, Alain RODRIGUEZ <> wrote:

Hi guys,
We currently use a CI with tests based on docker containers.
We have a "dockerized" C* service. Yet we have an issue, since we would like 2 things that
are hard to achieve together:
- A fixed data set to have predictable and deterministic tests (that we can repeat at any
time with the same result)
- A recent data set to perform smoke testing on services that need "recent data" (max 1 week
old data)
As our dataset is very big and data is not sorted by date in SSTables, it is hard to get
a coherent extract of the production data. Has any of you managed to achieve something like
this?
For "static" data, we could write queries by hand, but I find it more relevant to have a real
production extract. Regarding dynamic data, we need a process that we can repeat every day
or week to update the data, and that yields something light enough to keep container startup fast.
How do you guys handle this kind of thing?
FWIW we are migrating to 2.0.11 very soon so solutions might use 2.0 features.
Any idea is welcome and if you need more info, please ask.
