hive-user mailing list archives

From Ashutosh Chauhan <hashut...@apache.org>
Subject Re: How to skip the malformatted records while loading data
Date Wed, 24 Aug 2011 22:09:26 GMT
One possibility is to filter out the NULLs, something like the following:

hive> select * from tb where id is not NULL or pref is not NULL or zip is not NULL;

This is not the most efficient approach, but it will work.
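
A stricter variant, sketched here on the assumption that a row only counts as well-formed when every column is populated, joins the checks with `and`. It uses `is not NULL` because comparing against NULL with `=` or `!=` yields NULL rather than true. The table name `tb_clean` is hypothetical and only illustrates keeping the filtered rows for later queries.

```sql
-- Keep only rows where all three columns parsed successfully.
-- For the sample data.csv in the quoted message, only the
-- 32,aaa,4200002 row has no NULL column.
SELECT *
FROM tb
WHERE id IS NOT NULL
  AND pref IS NOT NULL
  AND zip IS NOT NULL;

-- Optional: materialize the clean rows once so later queries
-- do not have to repeat the filter (tb_clean is a hypothetical name).
CREATE TABLE tb_clean (id int, pref string, zip string);

INSERT OVERWRITE TABLE tb_clean
SELECT *
FROM tb
WHERE id IS NOT NULL
  AND pref IS NOT NULL
  AND zip IS NOT NULL;
```

The trade-off is that this also drops any row in which a column is legitimately NULL, so it only suits tables where every field is required.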

2011/8/18 XieXianshan <xiexs@cn.fujitsu.com>

> Hi everyone,
>
> Is there an option to ignore malformed records while loading data
> into a Hive table?
> Or an option to ignore bad rows while querying data?
>
> For instance:
> 1. Specify a row format explicitly for a new table.
> hive> create table tb (id int, pref string, zip string) row format
> delimited fields terminated by ',' lines terminated by '\n';
>
> 2. Load data into the table from a CSV file that contains bad records.
> hive> load data local inpath 'data.csv' overwrite into table tb;
>
> The data.csv might look like:
> 32,aaa,4200002
> <--Blank line
> 33:bbb:4200003 <--Invalid field delimiter ":"
> aa,ccc,4200004 <--Non-int number "aa"
>
> 3. Select data
> hive> select * from tb;
> OK
> 32 aaa 4200002
> NULL NULL NULL
> NULL NULL NULL
> NULL ccc 4200004
> Time taken: 0.196 seconds
>
> I have tried setting mapred.skip.map.max.skip.records, but it does not
> seem to work.
>
> Thanks in advance.
>
> Regards,
> Xie