hive-user mailing list archives

From Szehon Ho <sze...@cloudera.com>
Subject Re: data get truncated.
Date Mon, 10 Mar 2014 20:25:53 GMT
No, the key is not ignored. You can declare a separate key column if
you don't want it to be in your 'value'. Say you want to create a table
with two fields separated by some separator (say '\t' in your case?); then
you would do:

CREATE TABLE TEST(key INT, value STRING) ROW FORMAT DELIMITED FIELDS
TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;

Exact DDL details are at:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL.
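
As a rough illustration of what the delimiters in that DDL mean, here is a
minimal Python sketch (not Hive code) that splits raw text into rows on '\n'
and into a key and a value on '\t'. The function name parse_rows and the
sample record are hypothetical: the record is a shortened version of the one
quoted in this thread, so 167 is only an illustrative integer key and not the
real length. Mapping a non-numeric key to None mirrors Hive returning NULL for
a value it cannot convert to INT; skipping blank lines is a simplification.

```python
def parse_rows(data, field_sep="\t", line_sep="\n"):
    """Split raw text into (key, value) rows for a two-column delimited layout."""
    rows = []
    for line in data.split(line_sep):
        if not line:
            # Simplification: this sketch skips empty pieces, including the
            # one left after the trailing line terminator.
            continue
        fields = line.split(field_sep)
        try:
            key = int(fields[0])
        except ValueError:
            key = None  # stands in for NULL when the key is not an integer
        # Only the first two fields are used, matching the two declared columns.
        value = fields[1] if len(fields) > 1 else None
        rows.append((key, value))
    return rows


# Hypothetical sample: one record, tab between key and value, newline at the end.
sample = "167\t1105|11748184969223627771|172.31.2.71|0|sta1\n"
rows = parse_rows(sample)
print(rows)
```

The '|' characters stay inside the value because only '\t' is treated as the
field separator here.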

Hope that helps.
Szehon




On Mon, Mar 10, 2014 at 12:38 PM, Kim Chew <kchew534@gmail.com> wrote:

> So I have generated my input file in SequenceFile format like this
>      <Length of value>     <value+"\n">
> <Length of value> is typed IntWritable
> <value+"\n"> is typed Text
>
> For example,
> 167     1105|11748184969223627771|172.31.2.71|0|sta1|...
>
> And I create my table like this,
>
> CREATE TABLE if not exists KIM_TEST_SEQ (
>     value string)
> ROW FORMAT DELIMITED LINES TERMINATED BY '\n'
> STORED AS SEQUENCEFILE;
>
> If I understand correctly, <Length of value> is the key and is ignored
> when reading, and <value> is the row object. However, <value> is still being
> truncated. Is my schema correct?
>
> Thanks.
>
> Kim
>
>
> On Fri, Mar 7, 2014 at 5:53 PM, Szehon Ho <szehon@cloudera.com> wrote:
>
>> Hi, did you try specifying the row and field delimiters on create table?
>>
>> Thanks
>> Szehon
>>
>>
>>
>> On Fri, Mar 7, 2014 at 5:27 PM, Kim Chew <kchew534@gmail.com> wrote:
>>
>>> I have an input file in Sequence File format which has the format,
>>>      <length of value><value>
>>> which has the type <IntWritable><Text>
>>>
>>> Then I created a table,
>>>
>>> CREATE TABLE if not exists TEST (
>>>     value string)
>>> STORED AS SEQUENCEFILE;
>>>
>>> and then I loaded the input file into the table. However, when I do a query,
>>>
>>> select value from TEST;
>>>
>>> I found that 'value' is truncated. For example, the value in my input
>>> file is
>>>
>>>
>>> 14031|11748184969223627771|172.31.2.71|0|sta1|1365546305|976912181|10.196.121.204|172.26.4.10|HTTP|NURL|0|1|1|-420|1|PST|PDT|
>>>
>>> but what is returned from the query is,
>>>
>>> 031|11748184969223627771|172.31.2.71|0|sta1|13655463
>>>
>>> What went wrong?
>>>
>>> TIA
>>>
>>> Kim
>>>
>>>
>>
>
