accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3140) Compaction did not run during RW test
Date Thu, 18 Sep 2014 16:43:34 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139148#comment-14139148 ]

Christopher Tubbs commented on ACCUMULO-3140:
---------------------------------------------

I think this can be easily done without a change in the RPC.

If {{minorCompactionInProgress == true}}, the {{compactAll}} just needs to store a field,
{{lastFlushIdForHeldCompaction?}}, in the Tablet that remembers the flushId currently in progress.
To avoid the starvation concern, it only needs to wait until that specific one is finished
(incremented) before it can proceed. It doesn't need to wait on all in-progress minor compactions
(the solution above which can cause starvation), nor does it need to wait on a specific global
one (which requires the RPC change).

It just needs to wait on, at most, the one minor compaction that is currently in progress. I think
that'd be significantly better than either the above solution or changing all the RPC code (even for 1.7.0).
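
A minimal Java sketch of that idea, not Accumulo's actual Tablet code. It assumes the flush ID is a
counter that is incremented when a minor compaction finishes. Only {{minorCompactionInProgress}},
{{compactAll}} and {{lastFlushIdForHeldCompaction}} come from the comment above; the class name and
the other methods are hypothetical.

```java
// Hypothetical sketch; these are not Accumulo's real classes or method signatures.
final class TabletSketch {
    private long flushId = 0;                        // assumed: incremented when a minor compaction finishes
    private boolean minorCompactionInProgress = false;
    private long lastFlushIdForHeldCompaction = -1;  // flush ID in progress when compactAll was last held

    synchronized void beginMinorCompaction() {
        minorCompactionInProgress = true;
    }

    synchronized void finishMinorCompaction() {
        flushId++;
        minorCompactionInProgress = false;
        notifyAll(); // wake any compactAll call held on this flush
    }

    synchronized long lastFlushIdForHeldCompaction() {
        return lastFlushIdForHeldCompaction;
    }

    /**
     * Waits only for the minor compaction that is in progress at call time (if any),
     * then returns the flush ID it observed. Real code would start the major
     * compaction at that point.
     */
    synchronized long compactAll() {
        if (minorCompactionInProgress) {
            final long heldAt = flushId;
            lastFlushIdForHeldCompaction = heldAt;
            // Proceed as soon as this specific flush is done, even if another
            // minor compaction has already started; that avoids starvation.
            while (flushId <= heldAt) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while held", e);
                }
            }
        }
        return flushId;
    }
}
```

The loop condition compares flush IDs rather than re-checking {{minorCompactionInProgress}}, so a
held {{compactAll}} is released by the one flush it recorded and is not pushed back by later ones.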

(Also, as I mentioned to [~kturner] yesterday, the first condition, "Tablet has no files"
is not a requirement for this bug to manifest. It could just as easily manifest, say, by compacting
only 3 out of an expected 4 files, if the fourth file had just finished being flushed.)

> Compaction did not run during RW test
> -------------------------------------
>
>                 Key: ACCUMULO-3140
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3140
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.6.0
>         Environment: 1.5.2 RC1, Hadoop 2.3.0, Zookeeper 3.4.5, CentOS 6, 20 node EC2
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>             Fix For: 1.5.3, 1.6.1, 1.7.0
>
>
> Saw the following failure while running RW test against 1.5.2 RC1 
> {noformat}
> java.lang.Exception: Error running node Shard.xml
>         at org.apache.accumulo.test.randomwalk.Module.visit(Module.java:285)
>         at org.apache.accumulo.test.randomwalk.Framework.run(Framework.java:63)
>         at org.apache.accumulo.test.randomwalk.Framework.main(Framework.java:122)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.start.Main$1.run(Main.java:107)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.Exception: Error running node Verify
>         at org.apache.accumulo.test.randomwalk.Module.visit(Module.java:285)
>         at org.apache.accumulo.test.randomwalk.Module.visit(Module.java:254)
>         ... 8 more
> Caused by: java.lang.Exception: index rebuild mismatch 000050 100z:bda1000000000000 [] 1410899561685 false 000050 100z:9d20000000000000 [] 1410892435393 false ST_index_ip_10_1_2_29_ec2_internal_3328_1410892364707 ST_index_ip_10_1_2_29_ec2_internal_3328_1410892364707_tmp
>         at org.apache.accumulo.test.randomwalk.shard.VerifyIndex.visit(VerifyIndex.java:55)
>         at org.apache.accumulo.test.randomwalk.Module.visit(Module.java:254)
>         ... 9 more
> {noformat}
> Determined that document ID {{9d20000000000000}} existed in the index, but not the document table.  I found in the RW logs that a filtering compaction with the pattern {noformat}^[0-9a-f][d].*{noformat} should have removed this document from the index.  However, the compaction did not run on the relevant tablet {{1w;000050;00004c}}.  The test shortly after ran a filtering compaction with the pattern {noformat}^[0-9a-f][1].*{noformat}, which did cause a corresponding compaction.  Below are the tserver and RW logs interleaved by time.  Document {{9d20000000000000}} was indexed in shard {{000050}}.
> {noformat}
> TSERVER 2014-09-16 18:32:50,125 [tabletserver.Tablet] TABLET_HIST: 1w<;00004c split 1w;000050;00004c 1w<;000050
> TSERVER 2014-09-16 18:32:50,126 [tabletserver.Tablet] TABLET_HIST: 1w;000050;00004c opened
> TSERVER 2014-09-16 18:32:57,288 [tabletserver.TabletServer] INFO : Adding 1 logs for extent 1w;000050;00004c as alias 187
> RWLOG   16 18:33:55,294 [shard.Insert] DEBUG: Inserted document 9d20000000000000
> TSERVER 2014-09-16 18:35:02,985 [tabletserver.MinorCompactor] DEBUG: Begin minor compaction /accumulo/tables/1w/t-00001mf/F0000476.rf_tmp 1w;000050;00004c
> TSERVER 2014-09-16 18:35:04,049 [tabletserver.Compactor] DEBUG: Compaction 1w;000050;00004c 83,164 read | 81,599 written | 128,936 entries/sec |  0.645 secs
> TSERVER 2014-09-16 18:35:04,053 [tabletserver.Tablet] DEBUG: Logs for memory compacted: 1w;000050;00004c 10.1.2.26+9997/1bf8ebed-e73e-460b-b54f-0b29b3d3c19c
> TSERVER 2014-09-16 18:35:04,501 [tabletserver.Tablet] TABLET_HIST: 1w;000050;00004c MinC [memory] -> /t-00001mf/F0000476.rf
> TSERVER 2014-09-16 18:35:04,501 [tabletserver.Tablet] DEBUG: MinC finish lock 0.00 secs 1w;000050;00004c
> RWLOG   16 18:35:14,641 [shard.CompactFilter] DEBUG: Filtered documents using compaction iterators ^[0-9a-f][d].* 32451 19802
> TSERVER 2014-09-16 18:35:41,433 [tabletserver.Tablet] DEBUG: Starting MajC 1w;000050;00004c (USER) [/t-00001mf/F0000476.rf] --> /t-00001mf/A000048e.rf_tmp  [name:RegExFilter, priority:21, class:org.apache.accumulo.core.iterators.user.RegExFilter, properties:{matchSubstring=false, negate=true, colqRegex=^[0-9a-f][1].*, orFields=false}]
> TSERVER 2014-09-16 18:35:41,960 [tabletserver.Compactor] DEBUG: Compaction 1w;000050;00004c 81,599 read | 73,110 written | 187,583 entries/sec |  0.435 secs
> TSERVER 2014-09-16 18:35:42,079 [tabletserver.Tablet] TABLET_HIST: 1w;000050;00004c MajC [/t-00001mf/F0000476.rf] --> /t-00001mf/A000048e.rf
> RWLOG   16 18:35:43,854 [shard.CompactFilter] DEBUG: Filtered documents using compaction iterators ^[0-9a-f][1].* 18648 10103
> {noformat}
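
For reference, a small Java check of the two filter patterns quoted in the description against
document ID {{9d20000000000000}}. It uses only java.util.regex and assumes full-match semantics
(the logged properties show matchSubstring=false) with matching entries dropped (negate=true). The
class and method names are hypothetical and are not part of Accumulo.

```java
import java.util.regex.Pattern;

// Hypothetical helper; it mirrors the logged filter settings, not Accumulo's RegExFilter code.
final class FilterPatternCheck {
    /** True if a negated, full-match filter on colqRegex would drop this column qualifier. */
    static boolean wouldFilterOut(String colqRegex, String columnQualifier) {
        return Pattern.compile(colqRegex).matcher(columnQualifier).matches();
    }
}
```

Under those assumptions, wouldFilterOut("^[0-9a-f][d].*", "9d20000000000000") is true and
wouldFilterOut("^[0-9a-f][1].*", "9d20000000000000") is false. That is consistent with the
description: only the first compaction would have removed the document, and that one did not run
on the tablet.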



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
