incubator-cassandra-user mailing list archives

From <moshe.kr...@barclays.com>
Subject RE: data model to store large volume syslog
Date Thu, 07 Mar 2013 12:16:50 GMT
A row key based on the hour will create write hot spots: for an entire hour, all writes
go to the same node, i.e., the node where that row resides. You need a row key that
distributes writes evenly across all your C* nodes, e.g., the time concatenated with a
sequence counter.
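
One way to read "time concatenated with a sequence counter" is an hour prefix plus a small
bucket suffix derived from a running counter. The sketch below is only an illustration of
that idea: the names row_key, keys_for_hour and NUM_BUCKETS, the bucket count of 16, the
':' separator and the hour format are all assumptions, not anything prescribed by Cassandra.

```python
from datetime import datetime

# Hypothetical number of buckets per hour; tune to the cluster size.
NUM_BUCKETS = 16

# Assumed hour format for the key prefix (illustrative only).
HOUR_FORMAT = "%Y%m%d%H"


def row_key(ts: datetime, seq: int, num_buckets: int = NUM_BUCKETS) -> str:
    """Build a row key from the hour of `ts` plus a bucket taken from `seq`.

    Consecutive sequence numbers land in different buckets, so writes for
    the same hour are spread over `num_buckets` rows instead of one.
    """
    bucket = seq % num_buckets
    return f"{ts.strftime(HOUR_FORMAT)}:{bucket:02d}"


def keys_for_hour(ts: datetime, num_buckets: int = NUM_BUCKETS) -> list:
    """List every row key that may hold data for the hour of `ts`.

    A reader has to query all of these rows and merge the results.
    """
    hour = ts.strftime(HOUR_FORMAT)
    return [f"{hour}:{b:02d}" for b in range(num_buckets)]
```

The trade-off is on the read side: fetching one hour of data now means reading every
bucket row for that hour and merging the results.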

From: Mohan L [mailto:l.mohanphy@gmail.com]
Sent: Thursday, March 07, 2013 2:10 PM
To: user@cassandra.apache.org
Subject: data model to store large volume syslog


Dear All,

I am looking at Cassandra to store time series data (mostly syslog). The volume of data is
very large, and many entries share the same timestamp. Each record contains the following
fields:

timestamps:host-name:facility:message

The following queries need to be supported:


1). Need to get data between time X and Y
2). Need to get data between time X and Y for a host-name.
3). Need to search for a 'pattern' in the message

The data model I am considering is:

1). Create a column family 'cfrawlog' that stores the raw log as received. The row key could
be 'yyyyddmmhh' (a new row is added every hour or less); each column name is a UUID and its
value is the raw log data. Since we are also going to use this log for forensic purposes,
it will help to keep every raw log line in this column family, with nothing missing.

2). I want to create one more column family that holds the parsed log, so that we can use
it for queries. My question is: how should I model this CF so that it can answer the
queries above? What would the row key be for this CF?

3). Does the above data model make sense?
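
With hourly row keys as in 1), answering "data between time X and Y" means working out
which hourly rows overlap the range and then reading each of them. A minimal sketch of that
key enumeration follows; the function name hour_row_keys and the default key format are
assumptions for illustration, and the format argument should be set to whatever layout the
row keys really use.

```python
from datetime import datetime, timedelta


def hour_row_keys(start: datetime, end: datetime, fmt: str = "%Y%m%d%H") -> list:
    """Return the row key of every hourly row overlapping [start, end].

    `fmt` is an assumed key layout; pass the format the column family uses.
    """
    # Truncate the start time to the beginning of its hour.
    cur = start.replace(minute=0, second=0, microsecond=0)
    keys = []
    while cur <= end:
        keys.append(cur.strftime(fmt))
        cur += timedelta(hours=1)
    return keys
```

Each returned key is then read in turn; the rows for the first and last hour still need
filtering so that only entries between X and Y are kept.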

Any help or suggestions would be greatly appreciated.


Thanks
Mohan L


