incubator-cassandra-user mailing list archives

From Sasha Dolgy
Subject Re: Designing a decent data model for an online music shop...confused/stuck on decisions
Date Mon, 07 Mar 2011 15:34:39 GMT
phpcassa is maintained and works well.


On Mon, Mar 7, 2011 at 4:22 PM, Courtney wrote:

> Thanks for the response. I haven't checked on the status of phpcassa in a
> while, but does it now work with 0.7?
> That was one of the main reasons I switched to pandra; it seemed more up to
> date.
> *From:* Tyler Hobbs
> *Sent:* Monday, March 07, 2011 2:40 AM
> *Subject:* Re: Designing a decent data model for an online music
> shop...confused/stuck on decisions
> Regarding PHP performance with Cassandra, THRIFT-638 was
> recently resolved and it shows some big performance improvements.  I'll
> be upgrading the Thrift package that ships with phpcassa soon to include
> this fix, so you may want to compare performance numbers before and after.
> On Sun, Mar 6, 2011 at 8:03 PM, Courtney wrote:
>>  We're in a bit of a predicament, we have an e-music store currently
>> built in PHP using codeigniter/mysql...
>> The current system has 100+K users and a decent song collection. Over the
>> last few months I've been playing with
>> Cassandra... needless to say I'm impressed but I have a few questions.
>> Firstly, I want to avoid rewriting the entire site if possible, so my
>> instinct is to replace the database layer
>> in CodeIgniter... Is this something anyone would recommend, and are there
>> any gotchas in doing that?
>> I can't say I've been terribly happy with PHP accessing Cassandra. When
>> sample data of the same size and type was put into MySQL and Cassandra
>> (30K records in the table),
>> the pages with PHP connecting to Cassandra took longer to load.
>> I thought maybe it was my setup that needed tweaking, and I've played
>> with as many options as I could, but the best I've gotten is a matching query
>> time.
>> The query speed test simply took timestamps right before the query call
>> and right after it returned...
>> Is this something anyone else has seen? Any comments or suggestions? I've
>> tried using thrift, phpcassa and pandra with pretty similar numbers.
>> My other thought turned to maybe it was the way I designed my CFs, at
>> first I used super columns to model user account CF based on a post I read
>> by Arin (WTF is a super column) but I later changed to using normal CFs.
>> I'm trying to make this work but I get the feeling my approach is
>> somewhat... I don't know, misguided.
>> Here's a break down of the current model.
>> CF:Users{
>>     uid
>>     fname
>>     lname
>>     username
>>     password
>>     street
>>     ....
>> }
>> Some additional columns in place for a user but keeping it simple...
>> CF:Library{
>>     uid
>>     songid
>>     ...
>>     other info about user library
>> }
>> CF:Songs{
>>     songid
>>     title
>>     artistid
>> }
>> This is all still very relational (considering I go on to have CFs
>> for playlists and artists), and I'm not sure if this is a good design for the
>> data, but when I looked into
>> combining some of the info and removing some CFs, I ran into the issue of
>> replicating data all over the place. If, for example, I stored the artist name
>> in the library for each record,
>> then the artist would be replicated for every song they have, for
>> every user who has that song in their library...
>> Where do you draw the line on deciding how much is okay to be
>> replicated?
>> As much as I dislike the idea of building the application from
>> scratch, I'm considering the possibility of rebuilding it in
>> Java/JSP just to get the benefit of using
>> the Hector client. (The efforts of the guys doing the PHP libs are much
>> appreciated, but PHP doesn't seem to go too well with Cassandra.)
>> I'm in the process of making decisions because the upgrade/rebuild needs to
>> have a fairly steady working version for October, and I don't want to go
>> wrong before even starting.
>> Recommendations, suggestions, and advice are all welcome. (Any experience with
>> PHP and Cassandra is also welcome; since all my favorite libs are in PHP, I'm
>> reluctant to turn away.)
> --
> Tyler Hobbs
> Software Engineer, DataStax
> Maintainer of the pycassa Cassandra
> Python client library

Sasha Dolgy
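
One way to picture the duplication trade-off raised in the quoted message (storing the artist name with every library entry) is the rough sketch below. It uses plain Python dicts as stand-ins for column families rather than any real Cassandra client, and every name in it (songs, artists, add_to_library, library_page, rename_artist) is hypothetical, not something taken from the thread. It only illustrates the general pattern: copy data at write time so a page needs one row lookup, and pay for that when the copied value changes.

```python
# Hypothetical in-memory stand-ins for column families (plain dicts, no client library).
songs = {
    "s1": {"title": "Song One", "artist_id": "a1"},
    "s2": {"title": "Song Two", "artist_id": "a2"},
}
artists = {
    "a1": {"name": "Artist A"},
    "a2": {"name": "Artist B"},
}


def add_to_library(library, uid, song_id):
    """Write path: copy the title and artist name into the user's library row."""
    song = songs[song_id]
    row = library.setdefault(uid, {})
    row[song_id] = {
        "title": song["title"],
        "artist_name": artists[song["artist_id"]]["name"],
    }


def library_page(library, uid):
    """Read path: one row lookup, no follow-up lookups in songs or artists."""
    row = library.get(uid, {})
    return sorted((entry["artist_name"], entry["title"]) for entry in row.values())


def rename_artist(library, artist_id, new_name):
    """Cost of the duplication: every copied name has to be rewritten."""
    artists[artist_id]["name"] = new_name
    touched = 0
    for row in library.values():
        for song_id, entry in row.items():
            if songs[song_id]["artist_id"] == artist_id:
                entry["artist_name"] = new_name
                touched += 1
    return touched
```

In this sketch the read path never touches songs or artists, which is the point of duplicating. The rename function shows the other side: the number of entries it rewrites grows with songs per artist times users who own them, so the approach tends to suit values that are read often and rarely changed.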
