hive-user mailing list archives

From Abhijit Pol <a...@rocketfuelinc.com>
Subject getting all null values
Date Fri, 18 Sep 2009 21:55:15 GMT
For one of my Hive tables, I switched from TextFile to SequenceFile format.
This is how I created the new table:

CREATE EXTERNAL TABLE IMPRESSIONS ( A STRING, B STRING)
PARTITIONED BY (DATA_DATE STRING COMMENT 'yyyyMMdd (e.g. 20090801) on which log records are collected')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS SEQUENCEFILE
LOCATION '/user/hadoop/warehouse/facts/impressions/';

This external table is populated by our custom ETL job, which writes its
output using MultipleSequenceFileOutputFormat.

When I issue a simple query like: SELECT * FROM IMPRESSIONS;
this is what I get for every record:
NULL    NULL    20090715
NULL    NULL    20090715
NULL    NULL    20090715
....

But if I run: hadoop dfs -text /user/hadoop/warehouse/facts/impressions/data_date=20090715/* | less
I get the expected output.

Previously I was using MultipleTextFileOutputFormat to feed the TextFile
version of this table, and it worked well.

Any hints?

Thanks,
Abhi
