hadoop-hdfs-user mailing list archives

From Ed Serrano <ed.serr...@gmail.com>
Subject Re: webhdfs read error after successful pig job
Date Fri, 14 Jun 2013 16:49:50 GMT
You might want to investigate whether your issue is always on the same node.
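
One way to check this, as a rough sketch: a WebHDFS OPEN request to the namenode normally answers with an HTTP 307 redirect whose Location header names the datanode that will serve the data. The Python below issues the request without following the redirect and records which datanode was chosen, so failures can be tallied per node. The host name, port 50070, the file path, and the helper names are assumptions for illustration, not details from the original report. If the namenode fails before redirecting (as the "Could not reach the block" error suggests it may), the datanode comes back as None.

```python
import http.client
from collections import Counter
from urllib.parse import quote, urlencode, urlsplit


def open_url_path(hdfs_path, user):
    """Build the request path for a WebHDFS OPEN call."""
    query = urlencode({"op": "OPEN", "user.name": user})
    return "/webhdfs/v1" + quote(hdfs_path) + "?" + query


def datanode_from_location(location):
    """Extract the datanode host name from a redirect Location header."""
    return urlsplit(location).hostname


def redirect_target(namenode_host, namenode_port, hdfs_path, user, timeout=10):
    """Ask the namenode where it would send an OPEN, without following the redirect.

    Returns (http_status, datanode_host_or_None).
    """
    # http.client never follows redirects on its own, which is what we want here.
    conn = http.client.HTTPConnection(namenode_host, namenode_port, timeout=timeout)
    try:
        conn.request("GET", open_url_path(hdfs_path, user))
        resp = conn.getresponse()
        resp.read()  # drain the (usually empty) body
        location = resp.getheader("Location")
        datanode = datanode_from_location(location) if location else None
        return resp.status, datanode
    finally:
        conn.close()


def tally_by_datanode(results):
    """Count outcomes per datanode; results holds (path, datanode, ok) tuples."""
    counts = Counter()
    for _path, datanode, ok in results:
        counts[(datanode, "ok" if ok else "failed")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical namenode address and output file; adjust for your cluster.
    status, node = redirect_target(
        "namenode.example.com", 50070, "/user/ubuntu/out/part-r-00000", "ubuntu"
    )
    print(status, node)
```

Running redirect_target over each of the job's output files and feeding the outcomes to tally_by_datanode should show whether the failures cluster on one datanode. With replication set to 1, each block lives on a single node, so a problem on that node would make the same files fail every time.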

On Fri, Jun 14, 2013 at 11:43 AM, Adam Silberstein <adam@trifacta.com> wrote:

> Hi,
> I'm having some trouble with webhdfs read after running a Pig job that
> completed successfully.
> Here are some details:
> -I am using Hadoop CDH-4.1.3 and the compatible Pig that goes with this
> (0.10.0 I think)
> -The Pig job writes out about 10 files.  I'm programmatically attempting
> to read each of these with webhdfs soon after pig notifies me the job is
> complete.  The reads often all succeed.  And even in the failure case, most
> of the reads still succeed, but one may fail.
> -I wondered if I was facing a race condition where Pig was reporting
> success before the file was truly ready to read.  However, when I run
> WebHDFS read with curl even hours later, the request hangs.  In contrast, I
> can run 'cat' from the DFS command line and the file is output correctly.
> -I ran fsck over the problem file and it reported back as totally normal.
> -I looked at the namenode to see why my curl request hangs.  I get this
> error:
> ERROR org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:ubuntu (auth:SIMPLE)
> cause:java.io.IOException: Could not reach the block containing the data.
> Please try again
> (I'm guessing the permissions aren't really the important thing here; the
> underlying cause, not reaching the block, seems more likely).
> -I have a 4 node cluster with replication set to 1.
> If anyone has seen this, has diagnostic tips, or best of all, a solution,
> please let me know!
> Thanks,
> Adam


*Ed Serrano*
Mobile: 972-897-5443
