cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matija Gobec <matija0...@gmail.com>
Subject Re: Help with data modelling (from MySQL to Cassandra)
Date Sun, 26 Mar 2017 20:35:36 GMT
Have one table hold document metadata (doc_id, title, description, ...) and
have another table elements where partition key is doc_id and clustering
key is element_id.
Only problem here is if you need to query and/or update element just by
element_id but I don't know your queries up front.

On Sun, Mar 26, 2017 at 10:16 PM, Zoltan Lorincz <zollor@gmail.com> wrote:

> Dear cassandra users,
>
> We have the following structure in MySql:
>
> documents->[doc_id(primary key), title, description]
> elements->[element_id(primary key), doc_id(index), title, description]
>
> Notation: table name->[column1(key or index), column2, …]
>
> We want to transfer the data to Cassandra.
>
> Each document can contain a large number of elements (between 1 and 100k+)
>
> We have two requirements:
> a) Load all elements for a given doc_id quickly
> b) Update the value of one individual element quickly
>
>
> We were thinking on the following cassandra configurations:
>
> Option A
>
> documents->[doc_id(primary key), title, description, elements] (elements
> could be a SET or a TEXT, each time new elements are added (they are never
> removed) we would append it to this column)
> elements->[element_id(primary key), title, description]
>
> Loading a document:
>
>  a) Load document with given <doc_id> and get all element ids
>     SELECT * from documents where doc_id=‘id’
>
>  b) Load all elements with the given ids
>     SELECT * FROM elements where element_id IN (ids loaded from query a)
>
>
> Option B
>
> documents->[doc_id(primary key), title, description]
> elements->[element_id(primary key), doc_id(secondary index), title,
> description]
>
> Loading a document:
>  a) SELECT * from elements where doc_id=‘id’
>
>
> Neither solutions doesn’t seem to be good, in Option A, even if we are
> querying by Primary keys, the second query will have 100k+ primary key id’s
> in the WHERE clause, and the second solution looks like an anti pattern in
> cassandra.
>
> Could anyone give any advice how would we create a model for our use case?
>
> Thank you in advance,
> Zoltan.
>
>

Mime
View raw message