incubator-cassandra-user mailing list archives

From Denis Gabaydulin <gaba...@gmail.com>
Subject Physical data layout of columns in super column family
Date Wed, 09 Nov 2011 09:47:36 GMT
Hi, first of all, let me say thank you for the amazing product :-)
So, I have a couple of questions about internal physical data layout.

Suppose I have the following data schema:

Reports:{
    1:{
        1:{"value1":"some val", "value2":"some val"},
        2:{"value1":"some val", "value2":"some val"}
        ...
    },
    2:{
        1:{"value1":"some val", "value2":"some val"},
        2:{"value1":"some val", "value2":"some val"}
        ...
    }
    ...
}

Each report is represented by a set of report records.

Most of the data queries select a report by ID together with all of its
report lines. I'm going to use the multiget super slice query with
ranges (in terms of the Hector client) for this. Will it be efficient?
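
To make the access pattern concrete, here is a small in-memory Python
model of what I expect such a query to return. It is not the Hector API:
`multiget_super_slice`, the sample values, and the convention that a
`None` bound means "unbounded" are all assumptions made for illustration.

```python
# In-memory model of the Reports super column family (illustration only,
# not the Hector API): row key -> super column name -> {column: value}.
reports = {
    1: {1: {"value1": "a", "value2": "b"},
        2: {"value1": "c", "value2": "d"}},
    2: {1: {"value1": "e", "value2": "f"},
        2: {"value1": "g", "value2": "h"}},
}


def multiget_super_slice(cf, keys, start=None, finish=None):
    """Return, for each requested row key, the super columns whose names
    fall in [start, finish]. A bound of None is treated as open
    (an assumption of this sketch)."""
    result = {}
    for key in keys:
        row = cf.get(key, {})
        selected = {}
        # Walk the super columns of the row in name order.
        for name in sorted(row):
            if start is not None and name < start:
                continue
            if finish is not None and name > finish:
                continue
            selected[name] = dict(row[name])
        result[key] = selected
    return result


# Fetch every report line of reports 1 and 2 in a single call.
rows = multiget_super_slice(reports, [1, 2])
for report_id, lines in rows.items():
    print(report_id, lines)
```

So one call with an open range per report ID would bring back all of the
report's lines, which is the pattern I am asking about.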

Another question is related to the physical layout of the data. I'm
going to use SimpleStrategy with the random partitioner.
The replication factor is 1 or 2 (it depends on the number of nodes in
the production environment).
Is it guaranteed that all report lines of one report will be located
on the same node in such a configuration?
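
My current understanding, sketched below in Python, is that the random
partitioner derives a token from the MD5 hash of the row key only, so
every super column of a row would follow that row to the same replicas.
This is a simplified model and not Cassandra's actual code: the four-node
ring, the node names, and the helpers `token_for` and `replicas_for` are
hypothetical.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # upper end of the token range in this model


def token_for(row_key: bytes) -> int:
    """Token taken from the MD5 hash of the row key (simplified model)."""
    digest = hashlib.md5(row_key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def replicas_for(row_key: bytes, ring, replication_factor: int):
    """ring is a list of (token, node) pairs sorted by token.

    The first replica is the first node whose token is >= the key's
    token, wrapping around the ring. The remaining replicas are the next
    nodes clockwise, which is how this sketch models SimpleStrategy.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token_for(row_key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical ring: four nodes with evenly spaced tokens.
ring = [(i * RING_SIZE // 4, f"node{i}") for i in range(4)]

# Placement depends only on the row key (the report ID), never on the
# super column or column names stored inside the row.
for report_id in (b"1", b"2"):
    print(report_id, replicas_for(report_id, ring, replication_factor=2))
```

If this model is right, different reports may land on different nodes,
but the lines of a single report always stay together.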
