hive-user mailing list archives

From Kim Chew <kchew...@gmail.com>
Subject Re: data get truncated.
Date Mon, 10 Mar 2014 20:39:48 GMT
Thanks Szehon. Mine is stored as a SEQUENCEFILE, not a TEXTFILE.
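
For reference, here is a minimal sketch of how the DDL suggested below might look with SEQUENCEFILE storage and '|' as the field separator, as in the sample row. The table name and column names are hypothetical (they are guesses from the first five fields of the sample), and this only shows the delimiter syntax; it is not a confirmed fix for the truncation.

```sql
-- Sketch only: table and column names are hypothetical, inferred from the sample row.
CREATE TABLE IF NOT EXISTS KIM_TEST_SEQ_FIELDS (
    record_id  INT,     -- e.g. 14031
    session_id STRING,  -- e.g. 11748184969223627771 (exceeds the BIGINT range)
    host_ip    STRING,  -- e.g. 172.31.2.71
    flag       INT,     -- e.g. 0
    station    STRING   -- e.g. sta1
)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '\n'
STORED AS SEQUENCEFILE;
```

The second field is declared as STRING because the sample value 11748184969223627771 is larger than the maximum BIGINT (9223372036854775807).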

Kim


On Mon, Mar 10, 2014 at 1:25 PM, Szehon Ho <szehon@cloudera.com> wrote:

> No, there is no ignoring of the key; you can declare a different key column
> if you don't want it to be in your 'value'.  Say you want to create a table
> with two fields separated by some separator (say '\t' in your case?), then
> you would do:
>
> CREATE TABLE TEST(key INT, value STRING) ROW FORMAT DELIMITED FIELDS
> TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
>
> Exact DDL details are at:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL.
>
> Hope that helps.
> Szehon
>
>
>
>
> On Mon, Mar 10, 2014 at 12:38 PM, Kim Chew <kchew534@gmail.com> wrote:
>
>> So I have generated my input file in SequenceFile format like this
>>      <Length of value>     <value+"\n">
>> <Length of value> is typed IntWritable
>> <value+"\n"> is typed Text
>>
>> For example,
>> 167     1105|11748184969223627771|172.31.2.71|0|sta1|...
>>
>> And I create my table like this,
>>
>> CREATE TABLE if not exists KIM_TEST_SEQ (
>>     value string)
>> ROW FORMAT DELIMITED LINES TERMINATED BY '\n'
>> STORED AS SEQUENCEFILE;
>>
>> If I understand correctly, <Length of value> is the key and is ignored
>> when reading, and <value> is the row object. However, <value> is still
>> being truncated. Is my schema correct?
>>
>> Thanks.
>>
>> Kim
>>
>>
>> On Fri, Mar 7, 2014 at 5:53 PM, Szehon Ho <szehon@cloudera.com> wrote:
>>
>>> Hi, did you try specifying the row and field delimiters in CREATE TABLE?
>>>
>>> Thanks
>>> Szehon
>>>
>>>
>>>
>>> On Fri, Mar 7, 2014 at 5:27 PM, Kim Chew <kchew534@gmail.com> wrote:
>>>
>>>> I have an input file in Sequence File format which has the format,
>>>>      <length of value><value>
>>>> which has the type <IntWritable><Text>
>>>>
>>>> Then I created a table,
>>>>
>>>> CREATE TABLE if not exists TEST (
>>>>     value string)
>>>> STORED AS SEQUENCEFILE;
>>>>
>>>> and then I loaded the input file into the table. However, when I do a query,
>>>>
>>>> select value from TEST;
>>>>
>>>> I found that 'value' is truncated. For example, the value in my input
>>>> file is
>>>>
>>>>
>>>> 14031|11748184969223627771|172.31.2.71|0|sta1|1365546305|976912181|10.196.121.204|172.26.4.10|HTTP|NURL|0|1|1|-420|1|PST|PDT|
>>>>
>>>> but what the query returns is,
>>>>
>>>> 031|11748184969223627771|172.31.2.71|0|sta1|13655463
>>>>
>>>> What went wrong?
>>>>
>>>> TIA
>>>>
>>>> Kim
>>>>
>>>>
>>>
>>
>
