incubator-cassandra-user mailing list archives

From Richard grossman <richie...@gmail.com>
Subject Re: Time to insert bulk data is very high comparing to database
Date Sun, 08 Nov 2009 13:41:13 GMT
Hi,

Actually, we run on an Amazon EC2 large instance (7.5 GB memory), and we don't
use EBS, only the local disk mounted at /mnt.
Here is my code to insert the data:

    public void insertChannelShow(String showId, String channelId,
            String airDate, String duration, String title,
            String parentShowId, String genre, String price,
            String subtitle) throws Exception {
        Calendar calendar = Calendar.getInstance();
        dateFormat.setCalendar(calendar);
        Date air = dateFormat.parse(airDate);
        calendar.setTime(air);

        String key = String.valueOf(calendar.getTimeInMillis()) + ":"
                + showId + ":" + channelId;

        long timestamp = System.currentTimeMillis();
        cassandraClient.insert("Keyspace1",
                key,
                new ColumnPath("channelShow", null,
("duration").getBytes("UTF-8")),
                duration.getBytes("UTF-8"),
                timestamp,
                ConsistencyLevel.ONE);
        cassandraClient.insert("Keyspace1",
                key,
                new ColumnPath("channelShow", null,
("title").getBytes("UTF-8")),
                title.getBytes("UTF-8"),
                timestamp,
                ConsistencyLevel.ONE);
        cassandraClient.insert("Keyspace1",
                key,
                new ColumnPath("channelShow", null,
("parentShowId").getBytes("UTF-8")),
                parentShowId.getBytes("UTF-8"),
                timestamp,
                ConsistencyLevel.ONE);
        cassandraClient.insert("Keyspace1",
                key,
                new ColumnPath("channelShow", null,
("genre").getBytes("UTF-8")),
                genre.getBytes("UTF-8"),
                timestamp,
                ConsistencyLevel.ONE);
        cassandraClient.insert("Keyspace1",
                key,
                new ColumnPath("channelShow", null,
("price").getBytes("UTF-8")),
                price.getBytes("UTF-8"),
                timestamp,
                ConsistencyLevel.ONE);
        cassandraClient.insert("Keyspace1",
                key,
                new ColumnPath("channelShow", null,
("subtitle").getBytes("UTF-8")),
                subtitle.getBytes("UTF-8"),
                timestamp,
                ConsistencyLevel.ONE);
    }
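
For readers following the thread, the shape of the method above (one row key, then one client call per column) can be sketched without the Thrift dependency. This is only an illustration under stated assumptions: `ColumnWriter` is a hypothetical stand-in for the `cassandraClient.insert(...)` call, and the date pattern `yyyy-MM-dd HH:mm` in UTC is assumed, since the message does not show how `dateFormat` is configured.

```java
import java.nio.charset.StandardCharsets;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Map;
import java.util.TimeZone;

final class ChannelShowSketch {

    /** Hypothetical stand-in for the Thrift client: one call writes one column. */
    interface ColumnWriter {
        void write(String rowKey, String columnName, byte[] value, long timestamp);
    }

    /**
     * Builds the row key "airDateMillis:showId:channelId" as in the message.
     * The date pattern and time zone are assumptions, not taken from the original code.
     */
    static String rowKey(String airDate, String showId, String channelId) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            long airMillis = format.parse(airDate).getTime();
            return airMillis + ":" + showId + ":" + channelId;
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable air date: " + airDate, e);
        }
    }

    /**
     * Writes every column with the same timestamp, one writer call per column,
     * and returns the number of calls made. With a remote client, each call
     * would be a separate network round trip.
     */
    static int insertColumns(ColumnWriter writer, String rowKey, Map<String, String> columns) {
        long timestamp = System.currentTimeMillis();
        int calls = 0;
        for (Map.Entry<String, String> column : columns.entrySet()) {
            byte[] value = column.getValue().getBytes(StandardCharsets.UTF_8);
            writer.write(rowKey, column.getKey(), value, timestamp);
            calls++;
        }
        return calls;
    }
}
```

Passing a `LinkedHashMap` with the six columns (duration, title, parentShowId, genre, price, subtitle) would give six writer calls per show, which matches the six `insert` calls in the method above. The loop only removes the repetition; it does not change the number of calls.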

Of course, I've initialized my connection like this:
        tr = new TSocket(server, 9160);
        TProtocol proto = new TBinaryProtocol(tr);
        cassandraClient = new Client(proto);
        tr.open();

I actually have 2 machines on Amazon EC2. One is a large instance, from which
I run both the insert process and Cassandra. The second machine just runs
Cassandra, but it's a small instance with only 2 GB of memory.

Thanks


On Sat, Nov 7, 2009 at 12:05 AM, Michael Greene <michael.greene@gmail.com> wrote:

> On Fri, Nov 6, 2009 at 10:54 AM, Richard grossman <richiesgr@gmail.com>
> wrote:
> > I know the test is not very accurate because the cassandra and oracle db
> > doesn't run on the same hardware but there is really a big difference.
> Do they run on comparable hardware?  Hardware specs + configuration
> have a clear impact on Cassandra performance -- what's your
> environment like?  This is slow even for a recent laptop though, so
> there's probably something else wrong.
>
> Michael
>
