pig-user mailing list archives

From jiang licht <licht_ji...@yahoo.com>
Subject Re: Custom load function?
Date Wed, 10 Mar 2010 03:56:25 GMT
Before I read the example, here is a simple thing I would like to implement but am not
sure how: I have a set of files scattered across different folders in a Hadoop cluster.
Instead of issuing a separate "load" for each file, I want to put the full path names of
these files in a list file, and then have a load function that takes the name of that list
file as an argument and loads all the files it names ...

Thanks,

Michael
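
A minimal sketch of one way to approach the request above without writing a custom loader: read the list file in a small driver and collapse it into a single location string. The function name `build_load_location` and the sample paths are hypothetical. The sketch also assumes the loader in use accepts a comma-separated list of paths as its location, which depends on the Pig version and the load function and should be checked before relying on it.

```python
def build_load_location(lines):
    """Join the usable lines of a path list into one comma-separated string.

    Blank lines and lines starting with '#' are skipped, so the list file
    can carry comments. Paths are not validated against the file system.
    """
    paths = []
    for line in lines:
        path = line.strip()
        if not path or path.startswith("#"):
            continue
        paths.append(path)
    return ",".join(paths)


# Hypothetical contents of a list file, one full path per line.
sample_list = [
    "# files to load\n",
    "/logs/a/part-00000\n",
    "\n",
    "/archive/b/part-00007\n",
]

location = build_load_location(sample_list)
print(location)  # /logs/a/part-00000,/archive/b/part-00007
```

The resulting string could then be handed to a script through Pig's parameter substitution, for example as a parameter referenced inside a single "load" statement. A custom load function that reads the list file itself would keep everything inside Pig, at the cost of more code.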

--- On Tue, 3/9/10, jiang licht <licht_jiang@yahoo.com> wrote:


From: jiang licht <licht_jiang@yahoo.com>
Subject: Re: Custom load function?
To: pig-user@hadoop.apache.org
Date: Tuesday, March 9, 2010, 8:10 PM


Thanks Dmitriy. I will read the example and see if I still have questions.

Thanks,

Michael

--- On Tue, 3/9/10, Dmitriy Ryaboy <dvryaboy@gmail.com> wrote:

From: Dmitriy Ryaboy <dvryaboy@gmail.com>
Subject: Re: Custom load function?
To: pig-user@hadoop.apache.org
Date: Tuesday, March 9, 2010, 7:51 PM

I am not sure what your question is.

Your function can work in any way that you want. You don't even have to read
files; you can connect to databases or make up your own data. See the HBase
LoadFunc for an example. What exactly are you having trouble with?

-D

On Tue, Mar 9, 2010 at 5:43 PM, jiang licht <licht_jiang@yahoo.com> wrote:

> Not just processing different formats or performing pre-processing, as
> discussed in the Pig UDF manual. What I want is a function that can decide
> where to find the files to load, then load those files and generate the
> desired tuples (embedded Pig solves this in a different way).
>
> Thanks,
>
> Michael