incubator-cassandra-user mailing list archives

From Nicolas Labrot <nith...@gmail.com>
Subject Re: Cassandra tuning for running test on a desktop
Date Thu, 22 Apr 2010 08:47:55 GMT
Yes, I think so. I have read that wiki entry and the JIRA ticket. I will use
different row keys until it is fixed.

Thanks,

Nicolas
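
The "different row keys" workaround mentioned above can be sketched like this
(a toy Python illustration; `doc_id`, `node_id` and the shard count are my own
assumptions, not from the thread):

```python
# Toy sketch of splitting one large logical row across several row
# keys, so no single Cassandra row has to fit entirely in memory.
# doc_id/node_id and NUM_SHARDS are illustrative assumptions.
NUM_SHARDS = 16

def sharded_row_key(doc_id: str, node_id: int) -> str:
    """Bucket a document's XML nodes into a bounded set of rows."""
    return f"{doc_id}:{node_id % NUM_SHARDS}"

# All nodes of one document land in at most NUM_SHARDS distinct rows.
keys = {sharded_row_key("doc42", n) for n in range(10_000)}
print(len(keys))  # 16
```

Reads then fan out over the shard keys; the trade-off is up to NUM_SHARDS
extra lookups per document.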


On Thu, Apr 22, 2010 at 4:47 AM, Stu Hood <stu.hood@rackspace.com> wrote:

> Nicolas,
>
> Were all of those super column writes going to the same row?
> http://wiki.apache.org/cassandra/CassandraLimitations
>
> Thanks,
> Stu
>
> -----Original Message-----
> From: "Nicolas Labrot" <nithril@gmail.com>
> Sent: Wednesday, April 21, 2010 11:54am
> To: user@cassandra.apache.org
> Subject: Re: Cassandra tuning for running test on a desktop
>
> I do not have a website ;)
>
> I'm testing the viability of Cassandra for storing XML documents and making
> fast search queries. 4000 XML files (80MB of XML) produce, with my data model
> (one SC per XML node), 1,000,000 SCs, which makes Cassandra go OOM with an
> Xmx of 1GB. By contrast, an XML database like eXist handles the 4000 XML
> documents without any problem and with an acceptable amount of memory.
>
> What I like about Cassandra is its simplicity and its scalability. eXist is
> not able to scale with data; the only viable alternative is MarkLogic, which
> costs an arm and a leg... :)
>
> I will install Linux and buy some more memory to continue my tests.
>
> Could a Cassandra developer give me the technical reason for this OOM?
>
>
>
>
>
> On Wed, Apr 21, 2010 at 5:13 PM, Mark Greene <greenemj@gmail.com> wrote:
>
> > Maybe, maybe not. Presumably if you are running an RDBMS with any
> > reasonable amount of traffic nowadays, it's sitting on a machine with at
> > least 4-8GB of memory.
> >
> >
> > On Wed, Apr 21, 2010 at 10:48 AM, Nicolas Labrot <nithril@gmail.com
> >wrote:
> >
> >> Thanks Mark.
> >>
> >> Cassandra is maybe too much for my need ;)
> >>
> >>
> >>
> >> On Wed, Apr 21, 2010 at 4:45 PM, Mark Greene <greenemj@gmail.com>
> wrote:
> >>
> >>> Hit send too early...
> >>>
> >>> That being said, a lot of people running Cassandra in production are
> >>> using 4-6GB max heaps on 8GB machines. I don't know if that helps, but
> >>> hopefully it gives you some perspective.
> >>>
> >>>
> >>> On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene <greenemj@gmail.com
> >wrote:
> >>>
> >>>> RAM doesn't necessarily need to be proportional, but I would say the
> >>>> number of nodes does. You can't just throw a bazillion inserts at one
> >>>> node. The main benefit of Cassandra is that if you start hitting your
> >>>> capacity, you add more machines and distribute the keys across them.
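
The key-distribution idea can be sketched like this (a toy illustration in
the spirit of a hash-based partitioner; real Cassandra assigns token ranges
on a ring, not a simple modulo, and the node names are made up):

```python
import hashlib

# Toy sketch: an MD5 hash of the row key picks the owning node, so
# inserts spread across machines as nodes are added. Cassandra's
# actual partitioner assigns token ranges; this modulo scheme is
# only illustrative.
nodes = ["node1", "node2", "node3"]

def node_for(row_key: str) -> str:
    digest = int(hashlib.md5(row_key.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]

# With many keys, every node ends up owning a share of the data.
owners = {node_for(f"doc{i}") for i in range(1000)}
```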
> >>>>
> >>>>
> >>>> On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot <nithril@gmail.com
> >wrote:
> >>>>
> >>>>> So does it mean the RAM needed is proportional to the amount of data
> >>>>> handled?
> >>>>>
> >>>>> Or does Cassandra need a minimum amount of RAM when the dataset is big?
> >>>>>
> >>>>> I must confess this OOM behaviour is strange.
> >>>>>
> >>>>>
> >>>>> On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones <MJones@imagehawk.com
> >wrote:
> >>>>>
> >>>>>>  On my 4GB machine I'm giving it 3GB and having no trouble with 60+
> >>>>>> million 500-byte columns
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> *From:* Nicolas Labrot [mailto:nithril@gmail.com]
> >>>>>> *Sent:* Wednesday, April 21, 2010 7:47 AM
> >>>>>> *To:* user@cassandra.apache.org
> >>>>>> *Subject:* Re: Cassandra tuning for running test on a desktop
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> I have tried 1400M, and Cassandra OOMs too.
> >>>>>>
> >>>>>> Is there another solution? My data isn't very big.
> >>>>>>
> >>>>>> It seems that it is the merge (compaction) of the db.
> >>>>>>
> >>>>>>  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene <greenemj@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>> Try increasing Xmx. 1G is probably not enough for the amount of
> >>>>>> inserts you are doing.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot <nithril@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> For my first message I will first thank the Cassandra contributors
> >>>>>> for their great work.
> >>>>>>
> >>>>>> I have a parameter issue with Cassandra (I hope it's just a parameter
> >>>>>> issue). I'm using Cassandra 0.6.1 with the Hector client on my
> >>>>>> desktop. It's a simple dual core with 4GB of RAM on WinXP. I have
> >>>>>> kept the default JVM options inside cassandra.bat (Xmx1G).
> >>>>>>
> >>>>>> I'm trying to insert 3 million SCs with 6 columns each into 1 CF
> >>>>>> (named Super1). The insertion gets to 1 million SCs (without
> >>>>>> slowdown) and then Cassandra crashes with an OOM. (I store an average
> >>>>>> of 100 bytes per SC, with a max of 10kB.)
> >>>>>> I have aggressively decreased all the memory parameters without any
> >>>>>> regard for consistency (my config is here [1]); the cache is turned
> >>>>>> off but Cassandra still goes OOM. I have attached the last lines of
> >>>>>> the Cassandra log [2].
> >>>>>>
> >>>>>> What can I do to fix my issue? Is there another solution than
> >>>>>> increasing Xmx?
> >>>>>>
> >>>>>> Thanks for your help,
> >>>>>>
> >>>>>> Nicolas
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> [1]
> >>>>>>   <Keyspaces>
> >>>>>>     <Keyspace Name="Keyspace1">
> >>>>>>       <ColumnFamily Name="Super1"
> >>>>>>                     ColumnType="Super"
> >>>>>>                     CompareWith="BytesType"
> >>>>>>                     CompareSubcolumnsWith="BytesType" />
> >>>>>>
> >>>>>> <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
> >>>>>>       <ReplicationFactor>1</ReplicationFactor>
> >>>>>>
> >>>>>> <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
> >>>>>>     </Keyspace>
> >>>>>>   </Keyspaces>
> >>>>>>   <CommitLogRotationThresholdInMB>32</CommitLogRotationThresholdInMB>
> >>>>>>
> >>>>>>   <DiskAccessMode>auto</DiskAccessMode>
> >>>>>>   <RowWarningThresholdInMB>64</RowWarningThresholdInMB>
> >>>>>>   <SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
> >>>>>>   <FlushDataBufferSizeInMB>16</FlushDataBufferSizeInMB>
> >>>>>>   <FlushIndexBufferSizeInMB>4</FlushIndexBufferSizeInMB>
> >>>>>>   <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
> >>>>>>
> >>>>>>   <MemtableThroughputInMB>16</MemtableThroughputInMB>
> >>>>>>   <BinaryMemtableThroughputInMB>32</BinaryMemtableThroughputInMB>
> >>>>>>   <MemtableOperationsInMillions>0.01</MemtableOperationsInMillions>
> >>>>>>   <MemtableObjectCountInMillions>0.01</MemtableObjectCountInMillions>
> >>>>>>   <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
> >>>>>>   <ConcurrentReads>4</ConcurrentReads>
> >>>>>>   <ConcurrentWrites>8</ConcurrentWrites>
> >>>>>> </Storage>
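
If I read the 0.6 settings above correctly, MemtableOperationsInMillions =
0.01 means a flush roughly every 10,000 column operations, which would match
the rapid flush cycle in the log below; a back-of-envelope check (my reading
of the setting is an assumption):

```python
# Back-of-envelope from the config in [1], assuming the 0.6 meaning
# of MemtableOperationsInMillions (operations per memtable flush).
ops_per_flush = 0.01 * 1_000_000        # 10,000 operations
total_ops = 3_000_000 * 6               # 3M SCs x 6 columns each
flushes = total_ops / ops_per_flush
print(flushes)  # 1800.0 flushes over the whole insert run
```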
> >>>>>>
> >>>>>>
> >>>>>> [2]
> >>>>>>  INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=5417524)
> >>>>>>  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
> >>>>>>  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
> >>>>>>  INFO 13:36:42,062 Completed flushing d:\cassandra\data\Keyspace1\Super1-711-Data.db
> >>>>>>  INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=6065637)
> >>>>>>  INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
> >>>>>>  INFO 13:36:45,796 Writing Memtable(Super1)@15578910
> >>>>>>  INFO 13:36:46,109 Completed flushing d:\cassandra\data\Keyspace1\Super1-712-Data.db
> >>>>>>  INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed leaving 922392600 used; max is 1174208512
> >>>>>>  INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=6722241)
> >>>>>>  INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
> >>>>>>  INFO 13:36:54,593 Writing Memtable(Super1)@24468872
> >>>>>>  INFO 13:36:55,421 Completed flushing d:\cassandra\data\Keyspace1\Super1-713-Data.db
> >>>>>> java.lang.OutOfMemoryError: Java heap space
> >>>>>>  INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed leaving 971904520 used; max is 1174208512
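
Reading the two ConcurrentMarkSweep lines in [2]: even after multi-second
collections the heap stays nearly full, the usual signature of a live set
that simply exceeds the configured 1GB heap. A quick check on the logged
numbers:

```python
# Heap utilization after each CMS collection, from the log in [2].
max_heap = 1174208512
used_after_gc = [922392600, 971904520]  # "leaving ... used" values
ratios = [round(u / max_heap, 2) for u in used_after_gc]
print(ratios)  # [0.79, 0.83]
```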
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
>
>
