hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-360) Generalize the FileFormat Interface in Hive
Date Tue, 24 Mar 2009 08:11:53 GMT
Generalize the FileFormat Interface in Hive
-------------------------------------------

                 Key: HIVE-360
                 URL: https://issues.apache.org/jira/browse/HIVE-360
             Project: Hadoop Hive
          Issue Type: Improvement
            Reporter: Zheng Shao


Currently the FileFormat support in Hive is not generalized: we use "if ... else" to support
TextFileFormat and SequenceFileFormat. There is no way to support a third format without changing
the "if ... else" structure. We should define an interface for the FileFormat that Hive needs.

The OutputFileFormat interface that Hive requires will contain one more method than the Hadoop
OutputFileFormat: a method that creates a file with a specific name.
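
To make the proposal concrete, here is a minimal sketch of such an interface. All names in it (RowWriter, NamedFileOutputFormat, createWriter, InMemoryTextFormat) are hypothetical and are not taken from Hive or Hadoop. The writer is a simplified stand-in so that the example compiles without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a record writer; not the Hadoop RecordWriter API.
interface RowWriter {
    void write(String row);
    void close();
}

// Hypothetical interface: the one extra method Hive needs beyond the Hadoop
// output format, namely creating a file under a name chosen by the caller.
interface NamedFileOutputFormat {
    RowWriter createWriter(String finalPath, boolean isCompressed);
}

// Toy implementation that logs calls in memory instead of touching a file system.
class InMemoryTextFormat implements NamedFileOutputFormat {
    final List<String> log = new ArrayList<>();

    @Override
    public RowWriter createWriter(String finalPath, boolean isCompressed) {
        // Record the exact file name that the caller asked for.
        log.add("open:" + finalPath + (isCompressed ? ":compressed" : ""));
        return new RowWriter() {
            @Override public void write(String row) { log.add("row:" + row); }
            @Override public void close() { log.add("close"); }
        };
    }
}
```

The point of the sketch is that the caller, not the format, decides the file name. A real version would presumably take a Hadoop Path and job configuration instead of plain strings.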

Hive.g:409 (Hive.g already supports the custom file format, but DDLSemanticAnalyzer.java does
not recognize it yet):
{code}
KW_STORED KW_AS KW_INPUTFORMAT inFmt=StringLiteral KW_OUTPUTFORMAT outFmt=StringLiteral
{code}

Please add the handling of TOK_TABLEFILEFORMAT here:
DDLSemanticAnalyzer.java:223
{code}
        case HiveParser.TOK_TBLSEQUENCEFILE:
        ...
{code}
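
As a rough illustration of the requested handling, the sketch below adds a TOK_TABLEFILEFORMAT branch next to the existing sequence-file branch. It does not use the real Hive parser classes: AstNodeSketch and TableFormatResolver are hypothetical stand-ins, token kinds are plain strings, and the child order (input format literal first, output format literal second) is an assumption based on the grammar rule quoted above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parsed AST node: a token kind plus child texts.
class AstNodeSketch {
    final String token;
    final List<String> children;

    AstNodeSketch(String token, String... children) {
        this.token = token;
        this.children = Arrays.asList(children);
    }
}

class TableFormatResolver {
    // Returns {inputFormatClassName, outputFormatClassName} for a storage clause.
    static String[] resolve(AstNodeSketch node) {
        switch (node.token) {
            case "TOK_TBLSEQUENCEFILE":
                return new String[] {
                    "org.apache.hadoop.mapred.SequenceFileInputFormat",
                    "org.apache.hadoop.mapred.SequenceFileOutputFormat" };
            case "TOK_TABLEFILEFORMAT":
                // Assumed child order: inFmt literal, then outFmt literal.
                return new String[] {
                    unquote(node.children.get(0)),
                    unquote(node.children.get(1)) };
            default:
                throw new IllegalArgumentException("Unsupported storage clause: " + node.token);
        }
    }

    // Strip the surrounding quote characters that a string literal token carries.
    static String unquote(String literal) {
        return literal.substring(1, literal.length() - 1);
    }
}
```

With this shape, a user-supplied pair of class names flows through the same code path as the built-in formats.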

Please handle custom output formats here by adding a new interface and casting the
user-provided file format to that interface, instead of doing "if ... else".
FileSinkOperator.java:129-174:
{code}
      if(outputFormat instanceof IgnoreKeyTextOutputFormat) {
        finalPath = new Path(Utilities.toTempPath(conf.getDirName()),
                             Utilities.getTaskId(hconf) + Utilities.getFileExtension(jc, isCompressed));
      ...
{code}
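
The cast-to-interface approach could look roughly like the sketch below. It declares its own small stand-in interface so that it compiles alone. FileSinkFormat, fileExtension, the two format classes and FileSinkSketch are all hypothetical names, and the extensions they return are illustrative values, not what Hive produces.

```java
// Hypothetical interface that every supported output format would implement.
interface FileSinkFormat {
    // Extension appended to the task id; the values below are illustrative only.
    String fileExtension(boolean isCompressed);
}

class PlainTextSinkFormat implements FileSinkFormat {
    @Override
    public String fileExtension(boolean isCompressed) {
        return isCompressed ? ".gz" : "";
    }
}

// A third format plugs in without any change to FileSinkSketch.
class ThirdPartySinkFormat implements FileSinkFormat {
    @Override
    public String fileExtension(boolean isCompressed) {
        return ".custom";
    }
}

class FileSinkSketch {
    // One cast replaces the per-format "if ... else" chain.
    static String finalPath(Object outputFormat, String tmpDir, String taskId, boolean isCompressed) {
        if (!(outputFormat instanceof FileSinkFormat)) {
            throw new IllegalArgumentException(
                "Output format does not implement FileSinkFormat: " + outputFormat.getClass().getName());
        }
        FileSinkFormat format = (FileSinkFormat) outputFormat;
        return tmpDir + "/" + taskId + format.fileExtension(isCompressed);
    }
}
```

The single remaining instanceof check only validates the user-provided class. Format-specific behavior lives behind the interface, so adding a format means adding a class rather than another branch.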


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

