accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-2964) Unexpected ThriftSecurityException from BatchScanner
Date Tue, 01 Jul 2014 15:19:24 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048958#comment-14048958 ]

Josh Elser commented on ACCUMULO-2964:
--------------------------------------

Son of a gun, I just saw this again last night. This time, it came from a Scanner in the JUnit
test code. The test saw the error after the first tserver had died completely, but before the
restarted tserver logged that it was starting.

Client error:
{noformat}
java.lang.RuntimeException: org.apache.accumulo.core.client.AccumuloSecurityException: Error DEFAULT_SECURITY_ERROR for user root - Unknown security exception
	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startScan_result$startScan_resultStandardScheme.read(TabletClientService.java:6548)
	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startScan_result$startScan_resultStandardScheme.read(TabletClientService.java:6525)
	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startScan_result.read(TabletClientService.java:6448)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startScan(TabletClientService.java:228)
	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startScan(TabletClientService.java:204)
	at org.apache.accumulo.core.client.impl.ThriftScanner.getBatchFromServer(ThriftScanner.java:99)
	at org.apache.accumulo.core.metadata.MetadataLocationObtainer.lookupTablet(MetadataLocationObtainer.java:100)
	at org.apache.accumulo.core.client.impl.TabletLocatorImpl.lookupTabletLocation(TabletLocatorImpl.java:465)
	at org.apache.accumulo.core.client.impl.TabletLocatorImpl._locateTablet(TabletLocatorImpl.java:622)
	at org.apache.accumulo.core.client.impl.TabletLocatorImpl.locateTablet(TabletLocatorImpl.java:440)
	at org.apache.accumulo.core.client.impl.TabletLocatorImpl.lookupTabletLocation(TabletLocatorImpl.java:462)
	at org.apache.accumulo.core.client.impl.TabletLocatorImpl._locateTablet(TabletLocatorImpl.java:622)
	at org.apache.accumulo.core.client.impl.TabletLocatorImpl.locateTablet(TabletLocatorImpl.java:440)
	at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:226)
	at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:84)
	at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:177)
	at org.apache.accumulo.test.replication.MultiInstanceReplicationIT.dataReplicatedToCorrectTable(MultiInstanceReplicationIT.java:375)
{noformat}

Error out of the Scanner:
{noformat}
2014-07-01 08:40:59,251 [impl.ThriftScanner] WARN : Security Violation in scan request to host:58142: ThriftSecurityException(user:root, code:null)
{noformat}

Snippet from newly started tserver log (note that the above warning from the scanner came
before the tserver said it started the thrift server):
{noformat}
2014-07-01 08:40:59,181 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKAuthorizor
2014-07-01 08:40:59,184 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKAuthenticator
2014-07-01 08:40:59,187 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKPermHandler
2014-07-01 08:40:59,321 [conf.Property] DEBUG: Loaded class : org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager
2014-07-01 08:40:59,324 [tserver.TabletServer] INFO : Tablet server starting on 0.0.0.0
2014-07-01 08:40:59,339 [util.FileSystemMonitor] INFO : Filesystem monitor started
2014-07-01 08:40:59,343 [server.GarbageCollectionLogger] DEBUG: gc ParNew=0.03(+0.03) secs ConcurrentMarkSweep=0.04(+0.04) secs freemem=42,054,712(+42,054,712) totalmem=47,579,136
2014-07-01 08:40:59,347 [trace.ZooTraceClient] DEBUG: Scanning trace hosts in zookeeper: /accumulo/bc72352d-a904-4102-8ab1-e536aea49c01/tracers
2014-07-01 08:40:59,347 [trace.ZooTraceClient] DEBUG: Trace hosts: []
2014-07-01 08:40:59,368 [tserver.TabletServer] DEBUG: org.apache.accumulo.tserver.TabletServer$ThriftClientHandler created
2014-07-01 08:40:59,519 [tserver.TabletServer] INFO : address = host:53325
2014-07-01 08:40:59,544 [tserver.TabletServer] DEBUG: Obtained tablet server lock /accumulo/bc72352d-a904-4102-8ab1-e536aea49c01/tservers/host:53325/zlock-0000000000
2014-07-01 08:40:59,572 [tserver.TabletServer] INFO : Started replication service on host:58591
{noformat}

I'm now wondering whether this is simply a side effect of the thrift-0.9.1 change.
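
For tests that poll across a tserver restart, the sketch below shows one way to tolerate a
transient security error until the root cause is understood. It is a hypothetical workaround,
not code from Accumulo: the class RestartTolerantPoll, the stand-in exception
UnknownSecurityException, and all parameters are assumptions made for illustration. A real
test would catch Accumulo's own exception types and inspect the error code instead.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Accumulo: polls an operation across a server restart.
class RestartTolerantPoll {

    // Stand-in for a security error that carries no specific code (an assumption for this sketch).
    static class UnknownSecurityException extends Exception {
        UnknownSecurityException(String message) {
            super(message);
        }
    }

    // Calls op until it succeeds or maxAttempts is reached.
    // Connection failures are always retried (expected while the server is down).
    // Unknown security errors are retried at most maxSecurityRetries times, so a
    // genuine, persistent permission problem is still reported to the caller.
    static <T> T poll(Callable<T> op, int maxAttempts, long sleepMillis, int maxSecurityRetries)
            throws Exception {
        int securityErrors = 0;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (attempt > 1) {
                Thread.sleep(sleepMillis); // back off between attempts
            }
            try {
                return op.call();
            } catch (ConnectException e) {
                last = e; // connection refused: server not yet listening
            } catch (UnknownSecurityException e) {
                securityErrors++;
                if (securityErrors > maxSecurityRetries) {
                    throw e; // seen too often to be a restart artifact
                }
                last = e;
            }
        }
        throw new Exception("gave up after " + maxAttempts + " attempts", last);
    }
}
```

Bounding the security retries separately is deliberate: it keeps the test from masking a real
authentication failure while still riding out a single spurious error during the restart window.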

> Unexpected ThriftSecurityException from BatchScanner
> ----------------------------------------------------
>
>                 Key: ACCUMULO-2964
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2964
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, tserver
>            Reporter: Josh Elser
>            Priority: Minor
>             Fix For: 1.7.0
>
>
> This is something I've only seen a handful of times when writing/running tests that stop
> and restart tservers. When the tserver is restarted, a thread (typically running in the
> master) is trying to read a table, so that thread keeps polling until the tserver comes up.
> Very infrequently, the client gets a {{ThriftSecurityException}} with a code of
> {{DEFAULT_SECURITY_ERROR}} and a message of {{Unknown security exception}}. There is no
> additional information in the client log (from the thrift call inside the batchscanner),
> and the tserver log contains no error messages at all.
> The error that the client saw:
> {noformat}
> 2014-07-01 04:18:18,971 [impl.TabletServerBatchReaderIterator] DEBUG: Server : host:58090 msg : null
> ThriftSecurityException(user:!SYSTEM, code:null)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result$startMultiScan_resultStandardScheme.read(TabletClientService.java:10045)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result$startMultiScan_resultStandardScheme.read(TabletClientService.java:10022)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result.read(TabletClientService.java:9961)
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:313)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:293)
>         at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:632)
>         at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:592)
>         at org.apache.accumulo.core.metadata.MetadataLocationObtainer.lookupTablets(MetadataLocationObtainer.java:181)
>         at org.apache.accumulo.core.client.impl.TabletLocatorImpl.processInvalidated(TabletLocatorImpl.java:667)
>         at org.apache.accumulo.core.client.impl.TabletLocatorImpl.binRanges(TabletLocatorImpl.java:337)
>         at org.apache.accumulo.core.client.impl.TabletLocatorImpl.processInvalidated(TabletLocatorImpl.java:660)
>         at org.apache.accumulo.core.client.impl.TabletLocatorImpl._locateTablet(TabletLocatorImpl.java:610)
>         at org.apache.accumulo.core.client.impl.TabletLocatorImpl.locateTablet(TabletLocatorImpl.java:440)
>         at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:226)
>         at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:84)
>         at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:177)
>         at org.apache.accumulo.master.replication.DistributedWorkQueueWorkAssigner.createWork(DistributedWorkQueueWorkAssigner.java:161)
>         at org.apache.accumulo.master.replication.DistributedWorkQueueWorkAssigner.assignWork(DistributedWorkQueueWorkAssigner.java:140)
>         at org.apache.accumulo.master.replication.WorkDriver.run(WorkDriver.java:97)
> {noformat}
> The interesting part is that when the client saw this message, the new TabletServer was
> already started, and the old tabletserver appears to have been dead for 20s. So, the client
> in the master had been polling for 20s and getting a ConnectException (connection refused),
> which is expected. I don't know why we got this exception only after that length of time.
> The infrequency with which I see this makes me wonder if the random ports in the new
> tabletserver are somehow re-grabbing the old tserver's thrift client service port and
> something is unexpectedly being interpreted as this ThriftSecurityException? That's the
> only thing that seems remotely possible to me.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
