hadoop-hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-217) Stream closed exception
Date Tue, 13 Jan 2009 23:26:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663535#action_12663535 ]

Joydeep Sen Sarma commented on HIVE-217:
----------------------------------------

For the change that passes the reporter reference into the operator structure: the interface
change looks good. However, why don't we just store the reporter reference in the base Operator
class rather than in FileSinkOperator specifically? If we run into other cases where we have
to add progress indicators, this will make it easier.

+1 otherwise. 
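To make the suggestion concrete, here is a minimal sketch of what holding the Reporter in the
base Operator class could look like. The class shapes, the setReporter/reportProgress names,
and the simplified process signature are illustrative assumptions for this sketch, not the code
in the attached HIVE-217.patch.

    // Sketch only: simplified stand-ins for the real Hive operator classes.
    import org.apache.hadoop.mapred.Reporter;

    public abstract class Operator {
        // Reporter held once in the base class, available to every subclass.
        protected transient Reporter reporter;

        public void setReporter(Reporter reporter) {
            this.reporter = reporter;
        }

        // Long-running subclasses call this to signal liveness to the framework.
        protected void reportProgress() {
            if (reporter != null) {
                reporter.progress();
            }
        }

        public abstract void process(Object row) throws Exception;
    }

    // With the field in the base class, FileSinkOperator (or any other operator
    // that needs a progress indicator) no longer needs its own reporter field.
    class FileSinkOperatorSketch extends Operator {
        @Override
        public void process(Object row) throws Exception {
            // ... serialize and write the row to the output stream ...
            reportProgress();
        }
    }

In this shape, whatever sets up the operator tree would call setReporter once per operator (or
propagate it down from the root), instead of threading the reporter into FileSinkOperator alone.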

> Stream closed exception
> -----------------------
>
>                 Key: HIVE-217
>                 URL: https://issues.apache.org/jira/browse/HIVE-217
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>         Environment: Hive from trunk, hadoop 0.18.2, ~20 machines
>            Reporter: Johan Oskarsson
>            Priority: Critical
>             Fix For: 0.2.0
>
>         Attachments: HIVE-217.log, HIVE-217.patch
>
>
> When running a query similar to the following:
> "insert overwrite table outputtable select a, b, cast(sum(counter) as INT) from tablea
join tableb on (tablea.username=tableb.username) join tablec on (tablec.userid = tablea.userid)
join tabled on (tablec.id=tabled.id) where insertdate >= 'somedate' and insertdate <=
'someotherdate' group by a, b;"
> Where one table is ~40gb or so and the others are a couple of hundred mb. The error happens
in the first mapred job that processes the 40gb.
> I get the following exception (see attached file for full stack trace):
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Stream closed.
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:162)
> It happens in one reduce task and is reproducible; running the same query gives the same error.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

