pig-user mailing list archives

From Zach Bailey <zach.bai...@dataclip.com>
Subject Re: Eval UDF passing parameters
Date Tue, 07 Dec 2010 19:47:50 GMT

 You can pass parameters via the UDF constructor. Pig hands constructor arguments to the UDF as strings, so declare them as String and parse them yourself. For example:

public MyUDF(String includeAge, String includeGender)

then you would initialize it like so in your pig script, quoting each argument:

define MY_UDF_ONLY_AGE com.package.MyUDF('true', 'false');

and use it like:

data_with_age = FOREACH data GENERATE user_id, MY_UDF_ONLY_AGE(user_id);
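
To make the constructor idea concrete, here is a rough plain-Java sketch of the pattern. It is not a real Pig UDF: it leaves out the Pig dependency so it can run standalone, whereas an actual UDF would extend org.apache.pig.EvalFunc and work with Pig tuples. The class name UserInfoUdfSketch, the FAKE_DB map, and the sample user are all made up for illustration, and the map only stands in for the DB lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java stand-in for MyUDF. A real Pig UDF would extend
// org.apache.pig.EvalFunc and take/return Pig tuples instead.
class UserInfoUdfSketch {
    private final boolean includeAge;
    private final boolean includeGender;

    // Stub replacing the DB lookup: user id -> {age, gender}. Made-up data.
    private static final Map<String, Object[]> FAKE_DB = new HashMap<>();
    static {
        FAKE_DB.put("u1", new Object[] {34, "F"});
    }

    // Arguments arrive as strings, as they would from a DEFINE statement,
    // and are parsed once when the instance is created.
    UserInfoUdfSketch(String includeAge, String includeGender) {
        this.includeAge = Boolean.parseBoolean(includeAge);
        this.includeGender = Boolean.parseBoolean(includeGender);
    }

    // Returns the user id followed by only the fields this instance
    // was configured to include.
    List<Object> exec(String userId) {
        List<Object> out = new ArrayList<>();
        out.add(userId);
        Object[] info = FAKE_DB.get(userId);
        if (info == null) {
            return out; // unknown user: no extra fields
        }
        if (includeAge) {
            out.add(info[0]);
        }
        if (includeGender) {
            out.add(info[1]);
        }
        return out;
    }
}
```

With this shape, each combination of fields is just another define with different arguments (for example one alias for age only and another for age and gender), so one class covers every case.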


On Tuesday, December 7, 2010 at 2:44 PM, Dexin Wang wrote:

> Hi,
> This might be a dumb question. Is it possible to pass anything other than
> the input tuple to a UDF Eval function?
> Basically in my UDF, I need to do some user info lookup. So the input will
> be:
> (userid,f1,f2)
> with this UDF, I want to convert it to something like
> (userid,age,gender,location,f1,f2)
> where in the UDF I do a DB lookup on the userid and returns user's info
> (age, gender, etc). But I don't necessarily want to pass back the same user
> info fields, e.g. sometimes I only want age.
> I hope there is a way for me to tell the UDF that I only want "age", and
> sometimes "age, location", etc.
> What's the best way to achieve this without having to write a separate UDF
> for every case?
> Thanks.
> Dexin
