hadoop-pig-dev mailing list archives

From ANKUR GOEL <ankur.g...@corp.aol.com>
Subject Data types for Map key value pairs
Date Thu, 15 Jan 2009 14:34:48 GMT
Hi All,

I have a custom loader that returns a set of fields after reading a log
line. One of the fields returned is of type DataType.Map. My question is:
how can I set the data types for this map's (key, value) pairs? In my
script I try to generate a record from the keys and values of this map and
get the following error:

java.io.IOException: Unknown type Unknown
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:178)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)

Here is my script:

raw = LOAD 'myfile' USING myUdf.MyCustomLoader();
filtered = FILTER raw BY (ARG_MAP#'key' is not null);

-- The MySplit UDF splits the value in the map, which is "|" separated,
-- puts the splits into another map, and returns it. The returned map has
-- String (key, value) pairs. Each split is keyed by 'field0', 'field1',
-- ..., 'fieldn', where n is the number of splits.
entry = FOREACH filtered GENERATE A, B, myUdf.MySplit(ARG_MAP#'key', '|') as FIELDS;

-- Here another tuple is generated from the entries of FIELDS.
result = FOREACH entry GENERATE A, B, FIELDS#'field0' as CLIENT_ID,
    FIELDS#'field1' as CHANNEL_ID, FIELDS#'field2' as OTHER_ID;

store result into 'location' using PigStorage();
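
For reference, the splitting step that MySplit is described as doing can be sketched in plain Java. This is only an illustration under assumptions, not the actual UDF: the class name SplitToMapSketch, the method splitToMap, and the sample input are hypothetical, and the sketch deliberately leaves out Pig's UDF classes so that it runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for the splitting logic; not the MySplit UDF itself.
public class SplitToMapSketch {

    // Splits `value` on the literal `delimiter` and keys each piece as
    // "field0", "field1", ... in order of appearance.
    public static Map<String, String> splitToMap(String value, String delimiter) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (value == null) {
            return fields; // nothing to split, return an empty map
        }
        // String.split takes a regex and "|" is a regex metacharacter,
        // so the delimiter is quoted to be matched literally.
        // The limit of -1 keeps trailing empty fields instead of dropping them.
        String[] parts = value.split(Pattern.quote(delimiter), -1);
        for (int i = 0; i < parts.length; i++) {
            fields.put("field" + i, parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample input is made up for illustration.
        Map<String, String> fields = splitToMap("c42|news|x7", "|");
        System.out.println(fields); // prints {field0=c42, field1=news, field2=x7}
    }
}
```

Both keys and values here are java.lang.String, which matches the "String (key, value) pairs" described in the script comments. Whether the Pig runtime accepts those value types as-is is exactly the open question in this message.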

Any help here is appreciated.

