hadoop-mapreduce-issues mailing list archives

From "Varun Vasudev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6654) Possible NPE in JobHistoryEventHandler#handleEvent
Date Mon, 01 Aug 2016 14:16:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402100#comment-15402100 ]

Varun Vasudev commented on MAPREDUCE-6654:
------------------------------------------

Thanks for the patch [~djp].

1) There's a compilation error in the latest patch.
2) The current patch drops any events that arrive before {code}setupEventWriter(event.getJobID(), previousAMStartedEvent);{code} succeeds. It might be useful to keep track of the number and type of events lost, for example a simple map from event type to count that gets printed to the log either at the end of the job or once the call succeeds. What do you think?
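
Something along these lines is what I have in mind (just a rough sketch; the field and method names are made up for illustration, not part of the patch):
{code:java}
// Hypothetical bookkeeping inside JobHistoryEventHandler (needs java.util.EnumMap
// and java.util.Map imports). Counts the events we had to drop, keyed by type.
private final Map<EventType, Integer> droppedEventCounts =
    new EnumMap<EventType, Integer>(EventType.class);

// Called whenever an event is dropped because the event writer is not ready yet.
private void recordDroppedEvent(HistoryEvent historyEvent) {
  Integer count = droppedEventCounts.get(historyEvent.getEventType());
  droppedEventCounts.put(historyEvent.getEventType(),
      count == null ? 1 : count + 1);
}

// Called once setupEventWriter() finally succeeds, or at the end of the job.
private void logDroppedEvents() {
  if (!droppedEventCounts.isEmpty()) {
    LOG.warn("History events dropped before the event writer was ready: "
        + droppedEventCounts);
    droppedEventCounts.clear();
  }
}
{code}
recordDroppedEvent() would be invoked from handleEvent() in the branch where setupEventWriter() fails, and logDroppedEvents() once the writer is finally set up or from serviceStop().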

> Possible NPE in JobHistoryEventHandler#handleEvent
> --------------------------------------------------
>
>                 Key: MAPREDUCE-6654
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6654
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Xiao Chen
>            Assignee: Junping Du
>            Priority: Critical
>         Attachments: MAPREDUCE-6654-v2.1.patch, MAPREDUCE-6654-v2.patch, MAPREDUCE-6654.patch
>
>
> I have seen an NPE thrown from {{JobHistoryEventHandler#handleEvent}}:
> {noformat}
> 2016-03-14 16:42:15,231 INFO [Thread-69] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: java.lang.NullPointerException
> java.lang.NullPointerException
> 	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:570)
> 	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:382)
> 	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
> 	at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
> 	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
> 	at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
> 	at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1651)
> 	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1147)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:573)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:620)
> {noformat}
> In the version where this exception is thrown, the [line|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java#L586] is:
> {code:java}mi.writeEvent(historyEvent);{code}
> IMHO, this may be caused by an exception in a previous step. Specifically, in a Kerberized environment, the connection to KMS failed while creating the event writer, which calls out to KMS to decrypt the EEK. Exception below:
> {noformat} 
> 2016-03-14 16:41:57,559 ERROR [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error JobHistoryEventHandler in handleEvent: EventType: AM_STARTED
> java.net.SocketTimeoutException: Read timed out
> 	at java.net.SocketInputStream.socketRead0(Native Method)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:152)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:122)
> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> 	at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> 	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
> 	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
> 	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
> 	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:520)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:505)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:779)
> 	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:185)
> 	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:181)
> 	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
> 	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:181)
> 	at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
> 	at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1420)
> 	at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1522)
> 	at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1507)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:407)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:400)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:400)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:343)
> 	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:917)
> 	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:898)
> 	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:795)
> 	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.createEventWriter(JobHistoryEventHandler.java:428)
> 	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.setupEventWriter(JobHistoryEventHandler.java:468)
> 	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:553)
> 	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$1.run(JobHistoryEventHandler.java:326)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> We should handle this scenario better and not throw an NPE.


