From: Ying Tang
Date: Wed, 1 Dec 2010 21:31:14 +0800
Subject: Re: When to call the major compaction ?
To: user@cassandra.apache.org

I'm confused, please ignore the mail above.
Here is my confusion:
   After 0.6.6/0.7, minor compaction and major compaction can both clean out
   rows 'tagged' with tombstones and generate a new sstable without tombstones.
   And the tombstones remain in memory, waiting to be removed by the JVM GC.
Am I right?

On Wed, Dec 1, 2010 at 9:10 PM, Ying Tang wrote:

> 1. So after 0.6.6/0.7, minor compaction and major compaction can both
> clean out rows 'tagged' with tombstones, but this kind of clean-out doesn't
> mean removing them from the disk permanently.
>    The real removal is done by the JVM GC?
> 2. The intent of compaction is to merge multiple sstables into one, clean
> out the tombstones, and put the non-tombstoned rows into a new ordered
> sstable?
>
> On Wed, Dec 1, 2010 at 7:30 PM, Sylvain Lebresne wrote:
>
>> On Wed, Dec 1, 2010 at 12:11 PM, Ying Tang wrote:
>> > And I have another question: what's the difference between minor
>> > compaction and major compaction?
>>
>> A major compaction is a compaction that compacts *all* the SSTables of a
>> given column family (compaction compacts one CF at a time).
>>
>> Before https://issues.apache.org/jira/browse/CASSANDRA-1074 (introduced in
>> 0.6.6 and recent 0.7 betas/rcs), major compactions were the only ones that
>> removed the tombstones (see
>> http://wiki.apache.org/cassandra/DistributedDeletes), and this is the
>> reason major compaction exists. Now, with #1074, minor compactions should
>> remove most if not all tombstones, so major compactions are no longer
>> useful, or much less so (it may depend on your workload though, as a minor
>> compaction can't always delete the tombstones).
>>
>> --
>> Sylvain
>>
>> > On 12/1/10, Chen Xinli wrote:
>> >> 2010/12/1 Ying Tang
>> >>
>> >>> Every time Cassandra creates a new sstable, does it call
>> >>> CompactionManager.submitMinorIfNeeded? And if the number of memtables
>> >>> is beyond MinimumCompactionThreshold, the minor compaction will be
>> >>> called.
>> >>> And there is also a method named CompactionManager.submitMajor, and
>> >>> the call relationship is:
>> >>>
>> >>> NodeCmd --> NodeProbe --> StorageService.forceTableCompaction -->
>> >>> Table.forceCompaction --> CompactionManager.performMajor -->
>> >>> CompactionManager.submitMajor
>> >>>
>> >>> ColumnFamilyStore.forceMajorCompaction -->
>> >>> CompactionManager.performMajor --> CompactionManager.submitMajor
>> >>>
>> >>> HintedHandOffManager --> CompactionManager.submitMajor
>> >>>
>> >>> So I have 3 questions:
>> >>> 1. Once a new sstable has been created,
>> >>> CompactionManager.submitMinorIfNeeded will be called, and a
>> >>> minorCompaction may be called.
>> >>>     But when will the majorCompaction be called? Just from NodeCmd?
>> >>
>> >> Yes, majorCompaction must be called manually from NodeCmd.
>> >>
>> >>> 2. Which jobs will minorCompaction and majorCompaction do?
>> >>>     Will minorCompaction delete the data that has been marked as
>> >>> deleted?
>> >>>     And how about the major compaction?
>> >>
>> >> Compaction only marks sstables as deleted. Deletion will be done when
>> >> there is a full GC, or when the node is restarted.
>> >>
>> >>> 3. When is GC called? Every time compaction is called?
>> >>
>> >> GC has nothing to do with compaction; you may be confusing the two
>> >> concepts.
>> >>
>> >>> --
>> >>> Best regards,
>> >>>
>> >>> Ivy Tang
>> >>
>> >> --
>> >> Best Regards,
>> >> Chen Xinli
>> >
>> > --
>> > Best regards,
>> >
>> > Ivy Tang
>
> --
> Best regards,
>
> Ivy Tang

--
Best regards,

Ivy Tang
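
For anyone reading this thread later, here is a rough Python sketch of the behaviour Sylvain describes: a compaction merges several SSTables of one column family, keeps the newest version of each row, and can drop a tombstone once it is old enough. This is a toy model, not Cassandra's code. The names `Cell`, `compact`, `others`, and `gc_grace_seconds`, and the dict standing in for an SSTable, are all assumptions made up for illustration. The rule that a tombstone is kept while the same key still exists in an SSTable outside the compaction is a simplified reading of why a minor compaction "can't always delete the tombstones".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None marks a tombstone (a deleted row)
    timestamp: int        # write time in seconds


# Toy stand-in for an SSTable: row key -> newest cell stored in that table.
SSTable = Dict[str, Cell]


def compact(inputs: List[SSTable], others: List[SSTable],
            now: int, gc_grace_seconds: int) -> SSTable:
    """Merge `inputs` into one new table, sorted by row key.

    `others` are the tables NOT taking part in this compaction.
    A major compaction passes every table as input, so `others` is empty.
    """
    # Step 1: for each key, keep only the most recent cell across the inputs.
    merged: SSTable = {}
    for table in inputs:
        for key, cell in table.items():
            current = merged.get(key)
            if current is None or cell.timestamp > current.timestamp:
                merged[key] = cell

    # Step 2: write the survivors out in key order, purging old tombstones.
    result: SSTable = {}
    for key in sorted(merged):
        cell = merged[key]
        is_tombstone = cell.value is None
        expired = now - cell.timestamp > gc_grace_seconds
        # Assumed rule: the tombstone may still hide older data elsewhere.
        still_needed = any(key in table for table in others)
        if is_tombstone and expired and not still_needed:
            continue  # safe to drop the tombstone for good
        result[key] = cell
    return result


old = {"a": Cell("1", 10), "b": Cell("2", 10)}
mid = {"a": Cell(None, 20)}  # row "a" was deleted later
new = {"c": Cell("3", 30)}

# Major compaction: all three tables, so the tombstone for "a" can go.
major = compact([old, mid, new], [], now=1000, gc_grace_seconds=100)
print(list(major))  # ['b', 'c']

# Minor compaction of two tables: "a" still lives in `old`, tombstone stays.
minor = compact([mid, new], [old], now=1000, gc_grace_seconds=100)
print(list(minor))  # ['a', 'c']
```

In this model the only difference between the two calls is which tables are passed as inputs. That matches the point made in the thread: once minor compactions can purge tombstones too, a major compaction mostly matters for workloads where the deleted keys are spread across many SSTables.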