hive-dev mailing list archives

From "Elliot West (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-11073) ORC FileDump utility ignore errors when writing output
Date Mon, 22 Jun 2015 16:57:00 GMT
Elliot West created HIVE-11073:
----------------------------------

             Summary: ORC FileDump utility ignore errors when writing output
                 Key: HIVE-11073
                 URL: https://issues.apache.org/jira/browse/HIVE-11073
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.0
            Reporter: Elliot West
            Assignee: Elliot West
            Priority: Minor


The Hive command line provides the {{--orcfiledump}} utility for dumping the data contained in
ORC files, specifically when the {{-d}} option is used. It is often useful to pipe the extracted
data into other commands and utilities that transform or page it, making it more manageable
for the CLI user. A classic example is {{less}}.

When such a command pipeline is constructed today, the underlying implementation in {{org.apache.hadoop.hive.ql.io.orc.FileDump#printJsonData}}
is oblivious to errors that occur when writing to its output stream. Such errors are commonplace
when a user issues {{Ctrl+C}} to kill the leaf process. In this event the leaf process
terminates immediately, but the Hive CLI process continues to execute until the full contents
of the ORC file have been read.

If {{FileDump}} checks for output stream errors, the process will terminate as soon
as the destination process exits (i.e. when the user kills {{less}}), and control will be returned
to the user as expected.
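
As a rough illustration of the idea, and not the actual {{FileDump}} code or the eventual patch, the sketch below shows one way a dump loop could notice that its output has gone away. It assumes the rows are written through a {{java.io.PrintStream}}, which swallows {{IOException}}s and only reports them through {{checkError()}}. The names {{DumpSketch}}, {{ClosingStream}} and {{dumpRows}} are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

public class DumpSketch {

    /**
     * Hypothetical stand-in for a pipe whose reader has exited:
     * accepts `limit` bytes, then fails every further write.
     */
    static final class ClosingStream extends OutputStream {
        private final int limit;
        private int written;

        ClosingStream(int limit) {
            this.limit = limit;
        }

        @Override
        public void write(int b) throws IOException {
            if (written >= limit) {
                throw new IOException("Broken pipe");
            }
            written++;
        }
    }

    /**
     * Writes up to totalRows placeholder rows and stops at the first
     * reported write error. Returns the number of rows attempted.
     */
    static int dumpRows(PrintStream out, int totalRows) {
        int attempted = 0;
        for (int i = 0; i < totalRows; i++) {
            out.println("{\"row\":" + i + "}");
            attempted++;
            // PrintStream never throws IOException; checkError() flushes
            // the stream and returns true once any write has failed.
            if (out.checkError()) {
                break;
            }
        }
        return attempted;
    }

    public static void main(String[] args) {
        PrintStream broken = new PrintStream(new ClosingStream(64), true);
        int attempted = dumpRows(broken, 1_000_000);
        System.err.println("stopped after " + attempted + " of 1000000 rows");
    }
}
```

Because {{checkError()}} flushes on every call, checking after each row trades some throughput for a prompt exit; checking every N rows would be a reasonable compromise.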



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
