On Sun, Nov 8, 2009 at 3:56 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> - You'll easily double performance by setting the log level from DEBUG
> to INFO (unclear if you actually did this, so mentioning it for
No problem, I've checked; everything is set to INFO.
> - 0.4.1 has bad default GC options. the defaults will be fixed for
> 0.4.2 and 0.5, but it's easy to tweak for 0.4.1:
Sorry, I can't find the post talking about that, and I can't open this link on Mac OS.

> - it doesn't look like you're doing parallel inserts.  you should have
> at least a few dozen to a few hundred threads if you want to measure
> throughput rather than just latency.  run the client on a machine that
> is not running cassandra, since it can also use a decent amount of
By parallel, do you mean writing the code so the inserts run in multiple threads instead of one by one? If so, is the Thrift API thread-safe? And how do you manage opening and closing the connection, e.g. does each thread open one and close it at the end?
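To make the question concrete, here is a sketch of the pattern I have in mind: a fixed pool of worker threads, each opening its own connection and closing it when done (since, as I understand it, a Thrift client must not be shared between threads). The `Connection` class here is just a stand-in for a real TSocket + Cassandra.Client pair, not actual Thrift code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelInsertSketch {
    // Stand-in for a per-thread Thrift connection; in real code this
    // would wrap a TSocket + Cassandra.Client opened against one node.
    static class Connection {
        void insert(String key) { /* batch_insert(...) would go here */ }
        void close() { /* transport.close() would go here */ }
    }

    static final AtomicInteger inserted = new AtomicInteger();

    public static void main(String[] args) throws Exception {
        int threads = 50;         // "a few dozen" threads, as suggested
        int rowsPerThread = 100;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                // Each worker opens its OWN connection: never share a
                // Thrift client across threads.
                Connection conn = new Connection();
                try {
                    for (int i = 0; i < rowsPerThread; i++) {
                        conn.insert(id + ":" + i);
                        inserted.incrementAndGet();
                    }
                } finally {
                    conn.close(); // closed once, when this worker is done
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("inserted " + inserted.get() + " rows");
    }
}
```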
> - using batch_insert will be much faster than multiple single-column
> inserts to the same row

I've made a modification like this:
    public void insertChannelShow(String showId, String channelId, String airDate, String duration, String title, String parentShowId, String genre, String price, String subtitle) throws Exception {
        Calendar calendar = Calendar.getInstance();
        Date air = dateFormat.parse(airDate);
        calendar.setTime(air); // key the row on the air date, not the current time

        String key = String.valueOf(calendar.getTimeInMillis()) + ":" + showId + ":" + channelId;

        long timestamp = System.currentTimeMillis();
        Map<String, List<ColumnOrSuperColumn>> insertDataMap = new HashMap<String, List<ColumnOrSuperColumn>>();
        List<ColumnOrSuperColumn> rowData = new ArrayList<ColumnOrSuperColumn>();
        rowData.add(new ColumnOrSuperColumn(new Column("duration".getBytes("UTF-8"), duration.getBytes("UTF-8"), timestamp), null));
        rowData.add(new ColumnOrSuperColumn(new Column("title".getBytes("UTF-8"), title.getBytes("UTF-8"), timestamp), null));
        rowData.add(new ColumnOrSuperColumn(new Column("parentShowId".getBytes("UTF-8"), parentShowId.getBytes("UTF-8"), timestamp), null));
        rowData.add(new ColumnOrSuperColumn(new Column("genre".getBytes("UTF-8"), genre.getBytes("UTF-8"), timestamp), null));
        rowData.add(new ColumnOrSuperColumn(new Column("price".getBytes("UTF-8"), price.getBytes("UTF-8"), timestamp), null));
        rowData.add(new ColumnOrSuperColumn(new Column("subtitle".getBytes("UTF-8"), subtitle.getBytes("UTF-8"), timestamp), null));
        insertDataMap.put("channelShow", rowData);
        // one batch_insert for the whole row instead of one insert per column
        cassandraClient.batch_insert("Keyspace1", key, insertDataMap, ConsistencyLevel.ONE);
    }

Is this what you had in mind?

Anyway, I've started a new small instance on Amazon, one not running Cassandra, to run the inserts, and pointed it at one of the Cassandra servers' IPs. It hasn't improved anything: the client machine is at 1% CPU and the server machines are at 1% CPU.

The problem comes when the data is distributed between the 2 Cassandra servers. As long as the data goes to the commit log of the first server, everything is fine, ~2000 rows/second. But when the data goes to the second server, throughput falls very sharply to ~200 rows/second.

I've read that I can check latency with JMX. That's fine, but I can't manage to connect a JMX agent on Amazon; the parameters look OK, but jconsole on my side refuses to connect. Is there something else I can check?
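For reference, these are the JVM options I'm adding on the Cassandra side (the port is an example; the placeholder hostname would be the instance's public DNS name, and this disables auth/SSL, so it's for testing only):

```shell
# Example remote-JMX options (unauthenticated; test setups only).
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=8081"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
# The RMI stub advertises this hostname to the client; on EC2 it has to be
# an address jconsole can actually reach, not the instance's private IP.
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=<public-dns-name>"
```

One thing I'm not sure about: I've read that JMX also opens a second, dynamically chosen RMI port, so the security group may need more than just the configured port open.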