cassandra-user mailing list archives

From Schubert Zhang <zson...@gmail.com>
Subject Re: Inserting files to Cassandra timeouts
Date Wed, 28 Apr 2010 16:56:14 GMT
I think your file (as a Cassandra column value) is too large.
I also think Cassandra is not good at storing files.
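
If the files still have to go into Cassandra, one common workaround is to split each file into smaller column values instead of writing 15-20 MB in a single column. Below is a minimal sketch of only the splitting step, in Python. The names `chunk_columns` and `reassemble`, the `file name:index` column naming, and the 1 MiB chunk size are all assumptions made for illustration. None of them is a Cassandra, Hector, or Thrift API or limit, and the actual insert call is left to whichever client you use.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MiB per column value; an illustrative choice, not a Cassandra limit


def chunk_columns(file_name, data, chunk_size=CHUNK_SIZE):
    """Split file content into (column_name, column_value) pairs.

    Hypothetical helper: the row key would still be the folder name,
    and each pair would become one column in that row.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    columns = []
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # The zero-padded index keeps chunks of one file in order
        # when column names are compared byte by byte.
        name = "%s:%08d" % (file_name, index)
        columns.append((name, data[offset:offset + chunk_size]))
    return columns


def reassemble(columns):
    """Rebuild the file content from its chunk columns, in name order."""
    return b"".join(value for _, value in sorted(columns))
```

Each (name, value) pair can then be written as its own column, in small batches, so that no single request carries a whole file.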

On Wed, Apr 28, 2010 at 10:24 PM, Jussi P?├Âri
<jussi@androidconsulting.com> wrote:

> new try, previous went to wrong place...
>
> Hi all,
>
> I'm trying to run a scenario of adding files from a specific folder to
> Cassandra. I have 64 files (about 15-20 MB per file), about 1 GB of data
> overall.
> I'm able to insert around 40 files, but after that Cassandra goes into
> some kind of GC loop and the client finally gets a timeout.
> It does not run out of memory (OOM), it just jams.
>
> Here are the last entries in the log file:
>  INFO [GC inspection] 2010-04-28 10:07:55,297 GCInspector.java (line 110) GC
> for ParNew: 232 ms, 25731128 reclaimed leaving 553241120 used; max is
> 4108386304
>  INFO [GC inspection] 2010-04-28 10:09:02,331 GCInspector.java (line 110)
> GC for ParNew: 2844 ms, 238909856 reclaimed leaving 1435582832 used; max is
> 4108386304
>  INFO [GC inspection] 2010-04-28 10:09:49,421 GCInspector.java (line 110)
> GC for ParNew: 30666 ms, 11185824 reclaimed leaving 1679795336 used; max is
> 4108386304
>  INFO [GC inspection] 2010-04-28 10:11:18,090 GCInspector.java (line 110)
> GC for ParNew: 895 ms, 17921680 reclaimed leaving 1589308456 used; max is
> 4108386304
>
>
>
> I think that I must have something wrong in my configurations or in how I
> use cassandra, because here people are inserting 10 times more stuff and it
> works.
>
> The column family I am using:
> <ColumnFamily CompareWith="BytesType" Name="Standard1"/>
> Basically, I insert with the folder name as the key, the file name as the
> column name, and the file content as the value.
> I tried with Hector (mainly) and directly with Thrift (insert and
> batch_mutate).
>
> In my case, the data does not need to be readable immediately after insert,
> but I don't know if that helps in any way.
>
>
> My environment:
> Mac and Linux, tested on both
> java 1.6.0_17
> Cassandra 0.6.1
>
>
>
>  <RpcTimeoutInMillis>60000</RpcTimeoutInMillis>
>  <CommitLogRotationThresholdInMB>32</CommitLogRotationThresholdInMB>
>  <RowWarningThresholdInMB>512</RowWarningThresholdInMB>
>  <SlicedBufferSizeInKB>32</SlicedBufferSizeInKB>
>  <FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
>  <FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
>  <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
>  <MemtableThroughputInMB>64</MemtableThroughputInMB>
>  <BinaryMemtableThroughputInMB>256</BinaryMemtableThroughputInMB>
>  <MemtableOperationsInMillions>0.1</MemtableOperationsInMillions>
>  <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
>  <ConcurrentReads>8</ConcurrentReads>
>  <ConcurrentWrites>32</ConcurrentWrites>
>  <CommitLogSync>batch</CommitLogSync>
>  <!-- CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS -->
>  <CommitLogSyncBatchWindowInMS>1.0</CommitLogSyncBatchWindowInMS>
>  <GCGraceSeconds>500</GCGraceSeconds>
>
> JVM_OPTS=" \
>        -server \
>        -Xms3G \
>        -Xmx3G \
>        -XX:PermSize=512m \
>        -XX:MaxPermSize=800m \
>        -XX:MaxNewSize=256m \
>        -XX:NewSize=128m \
>        -XX:TargetSurvivorRatio=90 \
>        -XX:+AggressiveOpts \
>        -XX:+UseParNewGC \
>        -XX:+UseConcMarkSweepGC \
>        -XX:+CMSParallelRemarkEnabled \
>        -XX:+HeapDumpOnOutOfMemoryError \
>        -XX:SurvivorRatio=128 \
>        -XX:MaxTenuringThreshold=0 \
>        -XX:+DisableExplicitGC \
>        -Dcom.sun.management.jmxremote.port=8080 \
>        -Dcom.sun.management.jmxremote.ssl=false \
>        -Dcom.sun.management.jmxremote.authenticate=false"
>
>
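
The GCInspector entries quoted above report ParNew pauses of 232, 2844, 30666 and 895 ms, and the longest of these is already about half of the configured 60000 ms RpcTimeoutInMillis. To watch how pause time and heap usage develop over a longer log, a small parser can help. This is a rough sketch in Python that assumes the log format is exactly the one shown in the quoted lines. The names `parse_gc_lines` and `GC_LINE` are made up for this example.

```python
import re

# Matches the GCInspector message as it appears in the quoted log lines.
GC_LINE = re.compile(
    r"GC\s+for\s+(?P<collector>\w+):\s+(?P<pause_ms>\d+)\s+ms,\s+"
    r"(?P<reclaimed>\d+)\s+reclaimed\s+leaving\s+(?P<used>\d+)\s+used;\s+"
    r"max\s+is\s+(?P<max>\d+)"
)


def parse_gc_lines(text):
    """Return one dict per GC entry found in the text.

    Mail clients wrap long log lines and prefix them with "> ",
    so quote markers are stripped and the lines are joined first.
    """
    joined = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    events = []
    for match in GC_LINE.finditer(joined):
        event = {"collector": match.group("collector")}
        for key in ("pause_ms", "reclaimed", "used", "max"):
            event[key] = int(match.group(key))
        # Fraction of the maximum heap still in use after the collection.
        event["heap_fraction"] = event["used"] / event["max"]
        events.append(event)
    return events
```

For example, `[e for e in parse_gc_lines(log_text) if e["pause_ms"] > 1000]` lists the collections that paused for more than one second, where `log_text` holds the contents of the log file.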
