accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3777) Minor compaction fails forever after table deleted
Date Sun, 10 May 2015 03:58:00 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537015#comment-14537015 ]

Josh Elser commented on ACCUMULO-3777:
--------------------------------------

Curious. At a glance, the code looks kosher:

{panel:title=Tablet.java}
{code}
    // close map files
    getTabletResources().close();
{code}
{panel}

Closing the tablet resources should remove the extent from {{tabletReports}}. If [~kturner] still has the logs around,
this deserves more investigation into how we got into this state to begin with.
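
To make the suspected failure mode concrete, here is a minimal sketch, not the actual Accumulo code. Every name in it ({{StaleReportSketch}}, {{reportsByExtent}}, {{closeTablet}}, and so on) is hypothetical, and it assumes the memory manager looks up per-table configuration for every extent that still has a report. Under that assumption, a report left behind for a deleted table makes every pass fail the same way, and removing the report on close lets the next pass succeed.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are real Accumulo classes or fields.
class StaleReportSketch {
  // Table ids that currently exist (stand-in for the real table lookup).
  final Set<String> liveTables = ConcurrentHashMap.newKeySet();
  // Per-tablet memory reports keyed by extent, loosely modeled on tabletReports.
  final Map<String,String> reportsByExtent = new ConcurrentHashMap<>();

  // Record (or refresh) the memory report for a tablet of the given table.
  void reportMemory(String extent, String tableId) {
    reportsByExtent.put(extent, tableId);
  }

  // Assumed behavior of closing a tablet's resources: drop its report.
  void closeTablet(String extent) {
    reportsByExtent.remove(extent);
  }

  void deleteTable(String tableId) {
    liveTables.remove(tableId);
  }

  // One pass of the memory manager: consults table config for every reported extent.
  void manageMemoryOnce() {
    for (String tableId : reportsByExtent.values()) {
      if (!liveTables.contains(tableId)) {
        // Same message shape as the exception in the issue description.
        throw new IllegalArgumentException("Table with id " + tableId + " does not exist");
      }
    }
  }

  public static void main(String[] args) {
    StaleReportSketch s = new StaleReportSketch();
    s.liveTables.add("1l");
    s.reportMemory("1l;m<", "1l");
    s.deleteTable("1l"); // table removed, but the tablet's report was never dropped

    for (int pass = 0; pass < 3; pass++) {
      try {
        s.manageMemoryOnce();
      } catch (IllegalArgumentException e) {
        System.out.println("Memory manager failed " + e.getMessage());
      }
    }

    s.closeTablet("1l;m<"); // removing the stale report ends the failure loop
    s.manageMemoryOnce();
    System.out.println("pass succeeded after stale report was removed");
  }
}
```

If the real close path behaves like {{closeTablet}} here, the open question is which code path left the report in place (or re-added it) after the table was deleted.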

> Minor compaction fails forever after table deleted
> --------------------------------------------------
>
>                 Key: ACCUMULO-3777
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3777
>             Project: Accumulo
>          Issue Type: Bug
>         Environment: Hadoop 2.7.0, ZK 3.4.6, Accumulo 83d1b8388ad807d678c9a3a922e5025faa9a5933, 20 node m3.large EC2 cluster
>            Reporter: Keith Turner
>              Labels: 1.7.0_QA
>             Fix For: 1.7.0
>
>
> Was running the RW test and saw an issue where a minor compaction thread went haywire after a table was deleted.
> Was continually seeing this exception:
> {noformat}
> 2015-05-06 16:16:35,374 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
> java.lang.IllegalArgumentException: Table with id 1l does not exist
>         at org.apache.accumulo.core.client.impl.Tables.getNamespaceId(Tables.java:239)
>         at org.apache.accumulo.server.conf.TableParentConfiguration.getNamespaceId(TableParentConfiguration.java:38)
>         at org.apache.accumulo.server.conf.NamespaceConfiguration.getPath(NamespaceConfiguration.java:88)
>         at org.apache.accumulo.server.conf.NamespaceConfiguration.get(NamespaceConfiguration.java:101)
>         at org.apache.accumulo.server.conf.ZooCachePropertyAccessor.get(ZooCachePropertyAccessor.java:110)
>         at org.apache.accumulo.server.conf.TableConfiguration.get(TableConfiguration.java:99)
>         at org.apache.accumulo.core.conf.AccumuloConfiguration.getTimeInMillis(AccumuloConfiguration.java:252)
>         at org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager.getMinCIdleThreshold(LargestFirstMemoryManager.java:142)
>         at org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager.getMemoryManagementActions(LargestFirstMemoryManager.java:175)
>         at org.apache.accumulo.tserver.TabletServerResourceManager$MemoryManagementFramework.manageMemory(TabletServerResourceManager.java:408)
>         at org.apache.accumulo.tserver.TabletServerResourceManager$MemoryManagementFramework.access$400(TabletServerResourceManager.java:318)
>         at org.apache.accumulo.tserver.TabletServerResourceManager$MemoryManagementFramework$2.run(TabletServerResourceManager.java:346)
>         at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> From the master logs:
> {noformat}
> 2015-05-06 16:16:35,014 [tableOps.CleanUp] DEBUG: Deleted table 1l
> {noformat}
> It seems this went on for about an hour, until something whacked the tserver:
> {noformat}
> [ec2-user@worker5 logs]$ grep 'Memory manager failed Table with id 1l does not exist' tserver_worker5.log | head -3
> 2015-05-06 16:16:35,123 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
> 2015-05-06 16:16:35,374 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
> 2015-05-06 16:16:35,625 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
> [ec2-user@worker5 logs]$ grep 'Memory manager failed Table with id 1l does not exist' tserver_worker5.log | tail -3
> 2015-05-06 17:15:06,141 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
> 2015-05-06 17:15:06,392 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
> 2015-05-06 17:15:06,642 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
> [ec2-user@worker5 logs]$ grep "Lost tablet server lock" tserver_worker5.log
> 2015-05-06 17:15:06,685 [tserver.TabletServer] ERROR: Lost tablet server lock (reason = LOCK_DELETED), exiting.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
