We are using the latest phpcassa
(phpcassa-0.7.a.2.tar.gz <https://github.com/downloads/thobbs/phpcassa/phpcassa-0.7.a.2.tar.gz>)
and Cassandra 0.7.3.
We have inserted 12+ million documents into one column family with the
following keyspace/column family settings:
Keyspace: dffl:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
    Replication Factor: 4
  Column Families:
    ColumnFamily: product
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period: 0.0/0
      Key cache size / save period: 200000.0/14400
      Memtable thresholds: 1.1578125/247/60
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Built indexes: []
It took 219 minutes to insert the 12+ million docs, which works out to about
913 docs/second, using batch_insert with 1250 documents per batch.
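As a quick check on that figure, here is a minimal Python sketch of the arithmetic. It assumes exactly 12,000,000 documents (the real count is "12+ million") and a single client sending one batch after another; the function name `throughput` is only illustrative.

```python
# Assumption: exactly 12,000,000 documents; the real count is "12+ million".
TOTAL_DOCS = 12_000_000
ELAPSED_MINUTES = 219
BATCH_SIZE = 1250


def throughput(total_docs, elapsed_minutes, batch_size):
    """Return (docs per second, seconds per batch) for a sequential load."""
    elapsed_seconds = elapsed_minutes * 60
    docs_per_second = total_docs / elapsed_seconds
    # Assumption: batches are sent one at a time by a single client,
    # so total time divided by batch count is the time per batch.
    batches = total_docs / batch_size
    seconds_per_batch = elapsed_seconds / batches
    return docs_per_second, seconds_per_batch


docs_per_second, seconds_per_batch = throughput(
    TOTAL_DOCS, ELAPSED_MINUTES, BATCH_SIZE
)
print(f"{docs_per_second:.0f} docs/second")
print(f"{seconds_per_batch:.2f} seconds per batch")
```

Under those assumptions, each batch of 1250 documents takes well over a second from start to finish, which may be a more useful number to compare against than docs/second.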
We have a cluster of 10 nodes, each with 2 x Xeon 3.6 GHz, 8 GB of memory
and RAID 5 on SCSI U320, and had expected better performance. We do have
the binary accel installed (actually, we went through the PHP client and
removed the $bin_accel checks so it always uses the accel).
Does anybody have any ideas about common gotchas that could cause it to
be this slow?