accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ACCUMULO-3096) Scans stuck and seeing error message about constraint violation
Date Fri, 29 Aug 2014 20:31:53 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Newton updated ACCUMULO-3096:
----------------------------------

    Summary: Scans stuck and seeing error message about constraint violation  (was: Scans stuck and seeing error message about contratint violation)

> Scans stuck and seeing error message about constraint violation
> ---------------------------------------------------------------
>
>                 Key: ACCUMULO-3096
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3096
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>             Fix For: 1.6.1, 1.7.0
>
>
> Just helped someone debug an issue. Their scans were getting stuck on a certain tserver (determined which tserver by turning on debug in the shell). On the tserver, there was a constant stream of messages about a metadata table constraint violation because {{Bulk load transaction no longer running}}.
> The following code in {{Tablet.importMapFiles()}} 
> {code:java}
>           synchronized (timeLock) {
>             if (bulkTime > persistedTime)
>               persistedTime = bulkTime;
>             MetadataTableUtil.updateTabletDataFile(tid, extent, paths, tabletTime.getMetadataValue(persistedTime), creds, tabletServer.getLock());
>           }
> {code}
> Ended up calling the following code in {{MetadataTableUtil}}.  
> {code:java}
>   public static void update(Credentials credentials, ZooLock zooLock, Mutation m, KeyExtent extent) {
>     Writer t = extent.isMeta() ? getRootTable(credentials) : getMetadataTable(credentials);
>     if (zooLock != null)
>       putLockID(zooLock, m);
>     while (true) {
>       try {
>         t.update(m);
>         return;
>       } catch (AccumuloException e) {
>         log.error(e, e);
>       } catch (AccumuloSecurityException e) {
>         log.error(e, e);
>       } catch (ConstraintViolationException e) {
>         log.error(e, e);
>       } catch (TableNotFoundException e) {
>         log.error(e, e);
>       }
>       UtilWaitThread.sleep(1000);
>     }
>   }
> {code}
> So when the constraint failed, it retried forever. It did this while holding {{timeLock}}, which in turn prevented compactions from completing, which eventually gummed up scans.
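
One way to avoid this kind of hang, sketched here in plain Java, is to bound the number of retries and to let a failure that can never succeed on retry (such as a constraint violation) propagate immediately. This is only an illustration, not the fix that was applied in Accumulo: {{BoundedRetry}}, {{Attempt}} and {{PermanentFailureException}} are hypothetical names invented for this sketch, and no Accumulo APIs are used.

```java
// Hypothetical sketch: none of these types exist in Accumulo.
class BoundedRetry {

    /** Stand-in for a failure that retrying cannot fix, e.g. a constraint violation. */
    public static class PermanentFailureException extends Exception {
        public PermanentFailureException(String message) {
            super(message);
        }
    }

    /** Stand-in for the write being attempted (t.update(m) in the issue). */
    public interface Attempt {
        void run() throws Exception;
    }

    /**
     * Runs the attempt at most maxAttempts times and returns how many attempts
     * were used on success. A PermanentFailureException is rethrown at once;
     * any other exception is retried after sleepMillis, and the last one is
     * rethrown when the attempts are exhausted.
     */
    public static int retry(Attempt attempt, int maxAttempts, long sleepMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                attempt.run();
                return i;
            } catch (PermanentFailureException e) {
                // Retrying cannot help, so surface the error to the caller.
                throw e;
            } catch (Exception e) {
                // Possibly transient: remember it and try again.
                last = e;
            }
            if (i < maxAttempts) {
                Thread.sleep(sleepMillis);
            }
        }
        throw last;
    }
}
```

A caller would ideally run such a loop outside any lock like {{timeLock}}, or at least give up and release the lock once the attempts are exhausted, so that compactions and scans are not blocked behind a write that can never succeed.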



--
This message was sent by Atlassian JIRA
(v6.2#6252)
