Return-Path:
Delivered-To: apmail-cassandra-user-archive@www.apache.org
Received: (qmail 94968 invoked from network); 1 Dec 2010 13:11:11 -0000
Received: from unknown (HELO mail.apache.org) (140.211.11.3)
        by 140.211.11.9 with SMTP; 1 Dec 2010 13:11:11 -0000
Received: (qmail 8125 invoked by uid 500); 1 Dec 2010 13:11:06 -0000
Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org
Received: (qmail 7897 invoked by uid 500); 1 Dec 2010 13:11:05 -0000
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: user@cassandra.apache.org
Delivered-To: mailing list user@cassandra.apache.org
Received: (qmail 7586 invoked by uid 99); 1 Dec 2010 13:11:04 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
        by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 01 Dec 2010 13:11:04 +0000
X-ASF-Spam-Status: No, hits=3.7 required=10.0
        tests=FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL
X-Spam-Check-By: apache.org
Received-SPF: pass (nike.apache.org: domain of ivytang0812@gmail.com designates 209.85.214.172 as permitted sender)
Received: from [209.85.214.172] (HELO mail-iw0-f172.google.com) (209.85.214.172)
        by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 01 Dec 2010 13:10:57 +0000
Received: by iwn40 with SMTP id 40so8806048iwn.31
        for ; Wed, 01 Dec 2010 05:10:36 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=gamma;
        h=domainkey-signature:received:mime-version:received:in-reply-to
         :references:from:date:message-id:subject:to:content-type;
        bh=Dim1gpWhmET6UoECGV1kKN+MxL5dttYHzNj4oAv9+DM=;
        b=bI2ZMz7i5bHk1yYrwRtoj6/XQpa4/ibKcmOcuqpFya7s9Wf3eCFrs0/1rxmJ7M5kbU
         nePk+gc5dElJbtSONtzp1VkFFxxzkx3Ol80EgjFIp8EBHN6mx1Gm/FozE9WGL1guWQ/9
         ZYJj6CO6jPEJ/Z/w+qvdYp2PXfDu0M6YWej+Y=
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=mime-version:in-reply-to:references:from:date:message-id:subject:to
         :content-type;
        b=TkXZHSJUeZAC0S1nhn+3HLzZFT0NwvI//O9QvEE2UMEU4YALAk9jO3RQtfzoyPqVYJ
         qPXufuJLFVsWc9pie0MSr3ZhKv4AT+rjuTNAp46I9XZxBY0KyjAQHMYLnLZIEmadzOz+
         BjGaKprGbaDApFUvPgju3ldBeZcmcGmGqbInQ=
Received: by 10.231.19.8 with SMTP id y8mr1142339iba.111.1291209036310;
        Wed, 01 Dec 2010 05:10:36 -0800 (PST)
MIME-Version: 1.0
Received: by 10.231.184.147 with HTTP; Wed, 1 Dec 2010 05:10:15 -0800 (PST)
In-Reply-To:
References:
From: Ying Tang
Date: Wed, 1 Dec 2010 21:10:15 +0800
Message-ID:
Subject: Re: When to call the major compaction ?
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=002215048103f0c8ed04965907b2
X-Virus-Checked: Checked by ClamAV on apache.org

--002215048103f0c8ed04965907b2
Content-Type: text/plain; charset=ISO-8859-1

1. So from 0.6.6/0.7 onward, both minor and major compaction can clean out
   rows tagged with tombstones, but this kind of clean-out doesn't mean the
   data is removed from disk permanently.
   Is the real removal done by the JVM GC?
2. Is the intent of compaction to merge multiple sstables into one, clean
   out the tombstones, and write the non-tombstoned rows into a new ordered
   sstable?

On Wed, Dec 1, 2010 at 7:30 PM, Sylvain Lebresne <sylvain@yakaz.com> wrote:
> On Wed, Dec 1, 2010 at 12:11 PM, Ying Tang <ivytang0812@gmail.com> wrote:
> > And I have another question: what's the difference between minor
> > compaction and major compaction?
>
> A major compaction is a compaction that compacts *all* the SSTables of a
> given column family (compaction compacts one CF at a time).
>
> Before https://issues.apache.org/jira/browse/CASSANDRA-1074
> (introduced in 0.6.6 and recent 0.7 betas/rcs), major compactions were
> the only ones that removed the tombstones
> (see http://wiki.apache.org/cassandra/DistributedDeletes),
> and this is the reason major compaction exists.
> Now, with #1074, minor compactions should remove most if not all
> tombstones, so major compactions are no longer useful, or much less so
> (it may depend on your workload though, as minor compactions can't
> always delete the tombstones).
>
> --
> Sylvain
>
> >
> > On 12/1/10, Chen Xinli <chen.daqi@gmail.com> wrote:
> >> 2010/12/1 Ying Tang <ivytang0812@gmail.com>
> >>
> >>> Every time cassandra creates a new sstable, it will call
> >>> CompactionManager.submitMinorIfNeeded? And if the number of
> >>> memtables is beyond MinimumCompactionThreshold, the minor
> >>> compaction will be called.
> >>> And there is also a method named CompactionManager.submitMajor,
> >>> and the call relationship is:
> >>>
> >>> NodeCmd --> NodeProbe --> StorageService.forceTableCompaction -->
> >>> Table.forceCompaction --> CompactionManager.performMajor -->
> >>> CompactionManager.submitMajor
> >>>
> >>> ColumnFamilyStore.forceMajorCompaction -->
> >>> CompactionManager.performMajor --> CompactionManager.submitMajor
> >>>
> >>> HintedHandOffManager --> CompactionManager.submitMajor
> >>>
> >>> So I have 3 questions:
> >>> 1. Once a new sstable has been created,
> >>>    CompactionManager.submitMinorIfNeeded will be called, and a
> >>>    minor compaction may be triggered.
> >>>    But when will the major compaction be called? Just from NodeCmd?
> >>>
> >>
> >> Yes, major compaction must be called manually from NodeCmd.
> >>
> >>
> >>> 2. Which jobs do minor compaction and major compaction do?
> >>>    Will minor compaction delete the data that has been marked as
> >>>    deleted?
> >>>    And how about the major compaction?
> >>>
> >>
> >> Compaction only marks sstables as deleted. The deletion itself is
> >> done when there is a full GC, or when the node is restarted.
> >>
> >>
> >>> 3. When is GC called? Every time compaction is called?
> >>>
> >>
> >> GC has nothing to do with compaction; you may be confusing the two
> >> concepts.
> >>
> >>
> >>> --
> >>> Best regards,
> >>>
> >>> Ivy Tang
> >>>
> >>
> >> --
> >> Best Regards,
> >> Chen Xinli
> >>
> >
> > --
> > Best regards,
> >
> > Ivy Tang
> >

--
Best regards,

Ivy Tang

--002215048103f0c8ed04965907b2
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--002215048103f0c8ed04965907b2--
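
The tombstone behaviour discussed in this thread can be illustrated with a
small Python sketch. It is not Cassandra's code: the names Cell and compact,
the dict-based "sstable", and the integer timestamps are all hypothetical,
and the purge rule (drop a tombstone only once it is older than the GC grace
period and its key appears in no SSTable outside the compaction) is my
reading of CASSANDRA-1074 and the DistributedDeletes wiki page, stated here
as an assumption. A major compaction is modelled as the case where no
SSTables are left outside the compaction.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored value; value=None marks a tombstone."""
    value: Optional[str]
    timestamp: int  # write time in seconds (illustrative unit)


SSTable = Dict[str, Cell]  # toy model: row key -> newest cell in that file


def compact(inputs: Iterable[SSTable], others: Iterable[SSTable],
            now: int, gc_grace: int) -> SSTable:
    """Merge `inputs` into one key-ordered table, purging safe tombstones.

    `others` are SSTables NOT taking part in this compaction. For a major
    compaction it is empty, because every SSTable of the column family is
    an input.
    """
    others = list(others)

    # Step 1: merge, keeping only the newest cell for each key.
    merged: Dict[str, Cell] = {}
    for table in inputs:
        for key, cell in table.items():
            current = merged.get(key)
            if current is None or cell.timestamp > current.timestamp:
                merged[key] = cell

    # Step 2: write keys in sorted order, skipping purgeable tombstones.
    result: SSTable = {}
    for key in sorted(merged):
        cell = merged[key]
        expired = cell.value is None and cell.timestamp + gc_grace <= now
        # Assumption: a tombstone must be kept while the key may still have
        # older data in an SSTable outside this compaction.
        present_elsewhere = any(key in table for table in others)
        if expired and not present_elsewhere:
            continue
        result[key] = cell
    return result


old = {"a": Cell("1", 10), "b": Cell("2", 10)}
new = {"a": Cell(None, 20)}                      # delete of row "a"
third = {"a": Cell("0", 5), "c": Cell("3", 5)}   # still holds old data for "a"

minor = compact([old, new], others=[third], now=1000, gc_grace=100)
major = compact([old, new, third], others=[], now=1000, gc_grace=100)

print(list(minor))  # ['a', 'b']  tombstone for "a" kept: "a" is also in `third`
print(list(major))  # ['b', 'c']  tombstone for "a" purged
```

In this toy model the minor compaction has to keep the tombstone for "a",
which matches the remark above that minor compactions can't always delete
tombstones. Removing the replaced SSTable files from disk is a separate step
that the sketch does not model.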