Mailing-List: contact hive-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hive-dev@hadoop.apache.org
Message-ID: <1609387958.1262286209392.JavaMail.jira@brutus.apache.org>
Date: Thu, 31 Dec 2009 19:03:29 +0000 (UTC)
From: "Namit Jain (JIRA)" <jira@apache.org>
To: hive-dev@hadoop.apache.org
Subject: [jira] Created: (HIVE-1023) typedbytes: datatypes should be derived
 from data
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

typedbytes: datatypes should be derived from data
-------------------------------------------------

                 Key: HIVE-1023
                 URL: https://issues.apache.org/jira/browse/HIVE-1023
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain
            Assignee: Namit Jain


FROM (
  FROM src
  SELECT TRANSFORM(src.key, src.value)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'
    RECORDWRITER 'org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordWriter'
    USING '/bin/cat'
    AS (tkey, tvalue)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'
    RECORDREADER 'org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordReader'
) tmap
INSERT OVERWRITE TABLE dest1 SELECT tkey, tvalue;

In the query above, the output columns are interpreted as strings; that is, it is assumed that the script is returning string data.
It would be useful if the record reader and the deserializer could be decoupled.
The record reader (TypedBytesRecordReader) would read the typed data (independent of the output schema)
and then convert it according to the output schema.
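
To make the proposal concrete, here is a minimal sketch in Python (not Hive code) of the two steps described above: first decode each value using the type code carried in the data itself, then convert it to whatever the output schema asks for. The type codes follow the Hadoop streaming typedbytes encoding for primitive types. The names read_typed_value, convert, read_row and the CONVERTERS table are hypothetical and exist only for this illustration. The sketch also assumes one value per output column and ignores container types and any record delimiters that the real reader handles.

```python
import io
import struct

# Type codes for primitive values in the Hadoop typedbytes encoding.
BYTES, BYTE, BOOL, INT, LONG, FLOAT, DOUBLE, STRING = range(8)


def read_typed_value(stream):
    """Decode one value from its own type code; return None at end of stream."""
    code_byte = stream.read(1)
    if not code_byte:
        return None
    code = code_byte[0]
    if code == BYTE:
        return struct.unpack(">b", stream.read(1))[0]
    if code == BOOL:
        return stream.read(1) != b"\x00"
    if code == INT:
        return struct.unpack(">i", stream.read(4))[0]
    if code == LONG:
        return struct.unpack(">q", stream.read(8))[0]
    if code == FLOAT:
        return struct.unpack(">f", stream.read(4))[0]
    if code == DOUBLE:
        return struct.unpack(">d", stream.read(8))[0]
    if code in (BYTES, STRING):
        # Both are stored as a 4-byte big-endian length followed by the payload.
        (length,) = struct.unpack(">i", stream.read(4))
        raw = stream.read(length)
        return raw.decode("utf-8") if code == STRING else raw
    raise ValueError("unsupported typed bytes code: %d" % code)


# Hypothetical mapping from output-schema type names to Python converters.
CONVERTERS = {"string": str, "int": int, "bigint": int, "double": float}


def convert(value, target_type):
    """Convert an already-decoded value to the type the output schema requests."""
    return CONVERTERS[target_type](value)


def read_row(stream, schema):
    """Read one value per schema column and convert each to its declared type."""
    row = []
    for target_type in schema:
        value = read_typed_value(stream)
        if value is None:
            raise EOFError("stream ended before the row was complete")
        row.append(convert(value, target_type))
    return row
```

With this split, the same bytes can be read under different output schemas: a script that emits an int followed by a string could be read as (int, string) or as (string, string) without the decoding step knowing which one was requested.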

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.