hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "EOFException" by SteveLoughran
Date Wed, 05 Jun 2013 12:32:27 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "EOFException" page has been changed by SteveLoughran:

add an article on EOFExceptions

New page:
= EOFException =

You can get an EOFException ({{{java.io.EOFException}}}) in two main ways:

== EOFException during FileSystem operations ==

Unless it is caused by a network issue (see below), an {{{EOFException}}} means that the
program working with a file in HDFS or another supported FileSystem has tried to read or seek
beyond the end of the file.

There is an obvious solution here: don't do that.
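
As a minimal sketch of "not doing that", the following plain {{{java.io}}} code checks a requested range against the known file length before reading. It does not use the Hadoop FileSystem API; the class {{{BoundedReadSketch}}} and the method {{{readRange}}} are hypothetical names chosen for illustration, and the caller is assumed to know the file length already (for example from file metadata).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper, for illustration only: not part of Hadoop.
final class BoundedReadSketch {

    /**
     * Reads exactly `length` bytes starting `offset` bytes ahead of the stream's
     * current position, after checking the request against `fileLength`.
     */
    static byte[] readRange(DataInputStream in, long fileLength, long offset, int length)
            throws IOException {
        // Reject the request up front instead of letting the read run off the end.
        if (offset < 0 || length < 0 || offset + length > fileLength) {
            throw new IllegalArgumentException(
                "Range [" + offset + ", " + (offset + length) + ") is outside a file of "
                + fileLength + " bytes");
        }
        // skip() may skip fewer bytes than asked, so loop until the offset is reached.
        long toSkip = offset;
        while (toSkip > 0) {
            long skipped = in.skip(toSkip);
            if (skipped <= 0) {
                throw new EOFException("Stream ended before offset " + offset);
            }
            toSkip -= skipped;
        }
        byte[] buffer = new byte[length];
        // readFully still throws EOFException if the stream is shorter than fileLength claimed.
        in.readFully(buffer);
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30, 40, 50};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        byte[] middle = readRange(in, data.length, 1, 3);
        System.out.println("Read " + middle.length + " bytes without hitting EOF");
    }
}
```

The check turns an out-of-range request into an immediate, descriptive error. An {{{EOFException}}} can still occur if the length the caller supplied is stale or wrong.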

== EOFException during Network operations ==

You can see an EOFException during network operations, including RPC calls between applications
talking to HDFS, the JobTracker, YARN services or other Hadoop components. 

It can mean one of the following:

=== Unexpected Server shutdown ===

The far end of the network link shut down during the RPC operation. 

 1. Verify that the server at the far end of the network operation is running; restart it if not.
 1. If the service is an HA component, failover may have occurred without the client
detecting it and retrying its operation. Try restarting the application.

=== Protocol Mismatch ===

There is a protocol mismatch between client and server, which means that the server sent
less data than the client expected. This is rare in the core Hadoop components, as the RPC
mechanisms use versioned protocols precisely to prevent versioning problems. It is more
likely in a third-party component, or a module talking to a remote filesystem.

 1. Retry the operation; it may work this time.
 1. Look at the stack trace and see whether it occurs in a Hadoop class ({{{org.apache.hadoop}}},
especially an RPC one) or in something else.
 1. If it happens in one of the Hadoop remote filesystems (s3, s3n, ftp, etc.) or in the Apache
HTTP libraries, it usually means the far end has finished early. Try again.
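
The stack-trace check in the list above can also be done programmatically. The sketch below is one possible approach, not an official Hadoop utility: it skips JDK frames (an {{{EOFException}}} is usually thrown from a {{{java.io}}} class) and reports whether the first remaining frame is in {{{org.apache.hadoop}}}, or more specifically in the {{{org.apache.hadoop.ipc}}} RPC package. The class and method names are hypothetical, and the list of JDK prefixes is an assumption that may need extending.

```java
import java.io.EOFException;

// Hypothetical helper, for illustration only: not part of Hadoop.
final class EofOriginSketch {

    // Assumed set of prefixes that identify JDK-internal frames.
    private static final String[] JDK_PREFIXES = {"java.", "javax.", "sun.", "jdk."};

    private static boolean isJdkClass(String className) {
        for (String prefix : JDK_PREFIXES) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the class name of the first non-JDK frame, or null if there is none. */
    static String firstNonJdkFrame(Throwable t) {
        for (StackTraceElement frame : t.getStackTrace()) {
            if (!isJdkClass(frame.getClassName())) {
                return frame.getClassName();
            }
        }
        return null;
    }

    /** True if the exception surfaced in a Hadoop class. */
    static boolean originatesInHadoop(Throwable t) {
        String cls = firstNonJdkFrame(t);
        return cls != null && cls.startsWith("org.apache.hadoop.");
    }

    /** True if the exception surfaced in Hadoop's RPC package. */
    static boolean originatesInHadoopRpc(Throwable t) {
        String cls = firstNonJdkFrame(t);
        return cls != null && cls.startsWith("org.apache.hadoop.ipc.");
    }

    public static void main(String[] args) {
        // Synthetic stack trace, built by hand purely to exercise the helper.
        EOFException e = new EOFException("synthetic example");
        e.setStackTrace(new StackTraceElement[] {
            new StackTraceElement("java.io.DataInputStream", "readInt", "DataInputStream.java", 1),
            new StackTraceElement("org.apache.hadoop.ipc.Client$Connection", "run", "Client.java", 1),
        });
        System.out.println("In Hadoop: " + originatesInHadoop(e));
        System.out.println("In Hadoop RPC: " + originatesInHadoopRpc(e));
    }
}
```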

== Attention: Developers of RPC clients ==

If your users see this exception often, it is time to make your client applications use the
{{{org.apache.hadoop.io.retry}}} package, so that they are more resilient to outages.
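
To illustrate the idea only, here is a hand-rolled retry loop in plain Java. It is not the {{{org.apache.hadoop.io.retry}}} API, which offers policy-driven proxies through classes such as {{{RetryPolicies}}} and {{{RetryProxy}}}. The class {{{EofRetrySketch}}}, the method {{{callWithRetries}}} and its parameters are hypothetical. The sketch retries only on {{{EOFException}}} with a fixed sleep between attempts, and assumes the wrapped operation is idempotent and therefore safe to repeat.

```java
import java.io.EOFException;
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only: not the org.apache.hadoop.io.retry API.
final class EofRetrySketch {

    /**
     * Runs `operation`, retrying up to `maxAttempts` times in total when it fails
     * with an EOFException. Any other exception propagates immediately.
     */
    static <T> T callWithRetries(Callable<T> operation, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        EOFException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (EOFException e) {
                // The far end may have shut down or failed over; wait, then try again.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        // Every attempt failed with EOFException: surface the last one.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated remote call that fails twice before succeeding.
        String result = callWithRetries(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new EOFException("simulated early close");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

A fixed sleep keeps the sketch short; a production client would more likely want a capped exponential backoff and a policy for non-idempotent calls.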
