cassandra-commits mailing list archives

From "Vijay (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
Date Fri, 23 Dec 2011 22:32:30 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175595#comment-13175595 ]

Vijay edited comment on CASSANDRA-3623 at 12/23/11 10:30 PM:
-------------------------------------------------------------

Hot methods before the patch (unpatched trunk):
Excl. User CPU    Name
   sec.      %
1480.474 100.00   <Total>
756.717  51.11   crc32
387.767  26.19   <static>@0x54999 (<snappy-1.0.4.1-libsnappyjava.so>)
 54.814   3.70   org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(java.lang.String, org.apache.cassandra.io.compress.CompressionMetadata, boolean)
 46.676   3.15   org.apache.cassandra.io.util.RandomAccessReader.<init>(java.io.File, int, boolean)
 45.697   3.09   Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
 39.417   2.66   memcpy
 36.931   2.49   <static>@0xd8e9 (<libpthread-2.5.so>)
 23.272   1.57   CompactibleFreeListSpace::block_size(const HeapWord*) const
 22.766   1.54   SpinPause
 12.593   0.85   BlockOffsetArrayNonContigSpace::block_start_unsafe(const void*) const
  9.304   0.63   CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
  8.468   0.57   CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)
  8.051   0.54   ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
  5.400   0.36   madvise
  4.619   0.31   CardTableModRefBS::process_chunk_boundaries(Space*, DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, unsigned long)
  1.584   0.11   CardTableModRefBS::dirty_card_range_after_reset(MemRegion, bool, int)
  1.551   0.10   SweepClosure::do_blk_careful(HeapWord*)


Hot methods after the patch:
Excl. User CPU    Name
   sec.      %
537.681 100.00   <Total>
529.719  98.52   <static>@0x54999 (<snappy-1.0.4.1-libsnappyjava.so>)
4.168   0.78   memcpy
0.143   0.03   <Unknown>
0.121   0.02   send
0.121   0.02   sun.misc.Unsafe.park(boolean, long)
0.110   0.02   sun.misc.Unsafe.unpark(java.lang.Object)
0.088   0.02   Interpreter
0.077   0.01   org.apache.cassandra.utils.EstimatedHistogram.max()
0.077   0.01   recv
0.066   0.01   SpinPause
0.055   0.01   org.apache.cassandra.utils.EstimatedHistogram.mean()
0.044   0.01   java.lang.Object.wait(long)
0.044   0.01   org.apache.cassandra.utils.EstimatedHistogram.min()
0.044   0.01   __pthread_cond_signal
0.044   0.01   vtable stub
0.033   0.01   java.lang.Object.notify()
0.033   0.01   java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
0.033   0.01   org.apache.cassandra.io.compress.CompressedMappedFileDataInput.read()
0.033   0.01   PhaseLive::compute(unsigned)
0.033   0.01   poll
0.022   0.00   Arena::contains(const void*) const
0.022   0.00   CompactibleFreeListSpace::free() const
0.022   0.00   I2C/C2I adapters
0.022   0.00   IndexSetIterator::advance_and_next()
0.022   0.00   java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader)
0.022   0.00   java.lang.Long.getChars(long, int, char[])
0.022   0.00   java.nio.Bits.swap(int)



Response times before the patch (with crc check chance set to 0):
Epoch       Rds/s  RdLat     Wrts/s  WrtLat  %user  %sys  %idle  %iowait  %steal  md0r/s  w/s   rMB/s  wMB/s  NetRxKb  NetTxKb  Rd99th       Rd95th       Wr99th   Wr95th   Compacts
1324587443  15     186.305   0       0.000   27.85  0.02  71.83  0.24     0.05    3.89    0.00  0.12   0.00   41       45       545.791 ms   454.826 ms   0.00 ms  0.00 ms  Pen/0
1324587455  15     1142.712  0       0.000   39.55  0.13  57.61  2.50     0.21    118.30  0.30  2.20   0.00   34       36       8409.007 ms  8409.007 ms  0.00 ms  0.00 ms  Pen/0
1324587467  10     171.808   0       0.000   23.83  0.04  76.05  0.04     0.05    4.80    0.00  0.14   0.00   127      33       454.826 ms   315.852 ms   0.00 ms  0.00 ms  Pen/0
1324587478  10     182.775   0       0.000   20.43  0.04  79.47  0.01     0.05    1.60    0.40  0.04   0.00   30       37       379.022 ms   379.022 ms   0.00 ms  0.00 ms  Pen/0
1324587490  13     190.893   0       0.000   27.58  0.03  72.20  0.14     0.06    3.20    0.50  0.09   0.00   39       42       545.791 ms   379.022 ms   0.00 ms  0.00 ms  Pen/0
1324587503  28     358.719   0       0.000   52.24  0.08  46.20  1.40     0.09    159.40  0.00  3.16   0.00   196      71       3379.391 ms  943.127 ms   0.00 ms  0.00 ms  Pen/0
1324587517  13     194.281   0       0.000   16.68  0.02  83.23  0.04     0.02    2.40    0.30  0.07   0.00   38       41       785.939 ms   545.791 ms   0.00 ms  0.00 ms  Pen/0
1324587535  36     662.410   0       0.000   58.34  0.08  41.42  0.06     0.10    3.60    0.20  0.11   0.00   173      81       3379.391 ms  2816.159 ms  0.00 ms  0.00 ms  Pen/0
1324587547  22     189.838   0       0.000   37.68  0.05  62.03  0.16     0.09    5.32    0.49  0.16   0.00   56       63       454.826 ms   379.022 ms   0.00 ms  0.00 ms  Pen/0

Response times after the patch:
Epoch       Rds/s  RdLat    Wrts/s  WrtLat  %user  %sys  %idle  %iowait  %steal  md0r/s  w/s   rMB/s  wMB/s  NetRxKb  NetTxKb  Rd99th      Rd95th      Wr99th   Wr95th   Compacts
1324665227  18     97.724   0       0.000   21.49  0.02  78.40  0.05     0.04    4.00    0.40  0.12   0.00   167      45       152.321 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665239  26     107.279  0       0.000   29.57  0.04  70.18  0.16     0.05    8.70    0.00  0.22   0.00   56       60       219.342 ms  152.321 ms  0.00 ms  0.00 ms  Pen/0
1324665251  27     105.965  0       0.000   28.37  0.05  70.97  0.54     0.08    6.49    0.60  0.11   0.00   70       73       182.785 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665262  21     103.396  0       0.000   22.84  0.03  77.08  0.01     0.04    0.80    0.10  0.03   0.00   43       46       126.934 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665274  27     104.916  0       0.000   32.78  0.04  67.06  0.06     0.06    7.70    0.10  0.14   0.00   161      64       182.785 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665286  21     105.094  0       0.000   21.33  0.01  78.53  0.09     0.04    3.49    0.30  0.10   0.00   47       51       182.785 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665297  21     104.898  0       0.000   22.95  0.01  76.91  0.10     0.03    4.40    0.00  0.12   0.00   46       48       182.785 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665309  25     104.844  0       0.000   27.31  0.03  72.53  0.09     0.05    4.00    0.60  0.12   0.00   199      71       152.321 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665321  26     106.604  0       0.000   32.63  0.05  66.99  0.27     0.06    5.40    0.10  0.11   0.00   54       57       219.342 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665332  21     104.086  0       0.000   24.66  0.01  75.19  0.10     0.04    3.30    0.00  0.10   0.00   146      51       152.321 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665344  24     108.079  0       0.000   29.26  0.04  70.50  0.15     0.06    3.10    0.40  0.09   0.00   56       59       219.342 ms  152.321 ms  0.00 ms  0.00 ms  Pen/0
1324665356  32     105.465  0       0.000   32.67  0.04  66.97  0.25     0.08    8.80    0.00  0.11   0.00   60       63       182.785 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665368  15     103.112  0       0.000   16.61  0.03  83.33  0.01     0.03    0.80    0.40  0.02   0.00   48       53       126.934 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0
1324665379  24     104.599  0       0.000   25.87  0.03  74.05  0.01     0.05    2.59    0.10  0.08   0.00   51       54       182.785 ms  126.934 ms  0.00 ms  0.00 ms  Pen/0


Looks like we get roughly 50% better read performance with this: steady-state read latency drops from ~186 ms to ~105 ms, and the worst 99th percentile falls from 8.4 s to under 220 ms.

Pavel, you are right: the biggest gain came from reducing the memcpy.
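The before-profile shows the per-call CompressedRandomAccessReader/RandomAccessReader constructors (and crc32) burning CPU; after the patch they disappear because segments come from a shared mapping. A minimal sketch of that idea, not the actual patch (class and method names here are illustrative): map the file once with FileChannel.map and hand out duplicate() views per segment, so a read is a memory access rather than a read() syscall on a freshly opened file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedSegmentDemo {
    // Map the whole file once; getSegment() then just hands out an
    // independent view (own position/limit) over the shared pages,
    // instead of opening a new file descriptor per call.
    static ByteBuffer getSegment(MappedByteBuffer mapped, int offset, int length) {
        ByteBuffer segment = mapped.duplicate();
        segment.position(offset).limit(offset + length);
        return segment;
    }

    static byte demo() throws IOException {
        Path file = Files.createTempFile("segment", ".db");
        try {
            Files.write(file, new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                ByteBuffer segment = getSegment(mapped, 4, 4);
                return segment.get();   // a memory access, not a read() syscall
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());   // 5
    }
}
```

Since duplicate() shares the underlying pages, handing out many segments costs no extra I/O or buffers, which is why the constructor hotspots vanish from the profile.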


                
> use MMapedBuffer in CompressedSegmentedFile.getSegment
> ------------------------------------------------------
>
>                 Key: CASSANDRA-3623
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Vijay
>            Assignee: Vijay
>              Labels: compression
>             Fix For: 1.1
>
>         Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 0001-MMaped-Compression-segmented-file.patch,
0002-tests-for-MMaped-Compression-segmented-file-v2.patch
>
>
> CompressedSegmentedFile.getSegment seems to open a new file and doesn't use the mmap, hence higher CPU on the nodes and higher latencies on reads.
> This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
> // TODO refactor this to separate concept of "buffer to avoid lots of read() syscalls" and "compression buffer"
> but I think a separate class for the buffer would be better.
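The quoted TODO is the heart of it: once the chunk is reachable through a mapping, the mapping itself plays the role of the "buffer to avoid lots of read() syscalls", and the decompressor can read straight out of the mapped buffer into the destination, skipping the intermediate copy. A hedged sketch of that shape, using java.util.zip's Inflater/Deflater as a stand-in for Snappy (the ByteBuffer-based Inflater API needs Java 11+; names are illustrative, not the actual patch):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class MappedChunkDecompress {
    // Decompress a chunk straight from the file mapping into the caller's
    // buffer: the mapping replaces the "read buffer", and `out` is the
    // only "compression buffer" left -- no memcpy in between.
    static String decompressChunk(Path file, int uncompressedLen)
            throws IOException, DataFormatException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            Inflater inflater = new Inflater();
            inflater.setInput(mapped);                 // reads mapped pages directly (Java 11+)
            ByteBuffer out = ByteBuffer.allocate(uncompressedLen);
            inflater.inflate(out);                     // writes the result directly
            inflater.end();
            out.flip();
            return new String(out.array(), 0, out.limit());
        }
    }

    static String demo() throws IOException, DataFormatException {
        // Write one DEFLATE-compressed chunk to disk, then round-trip it.
        byte[] plain = "hello mmap".getBytes();
        Deflater deflater = new Deflater();
        deflater.setInput(plain);
        deflater.finish();
        byte[] compressed = new byte[64];
        int n = deflater.deflate(compressed);
        deflater.end();
        Path file = Files.createTempFile("chunk", ".z");
        try {
            Files.write(file, Arrays.copyOf(compressed, n));
            return decompressChunk(file, plain.length);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());   // hello mmap
    }
}
```

This mirrors the profile result above: with the input side mapped, the only remaining hot spots are the codec itself and a much smaller memcpy.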

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
