hadoop-pig-dev mailing list archives

From Jianyong Dai <jiany...@yahoo-inc.com>
Subject Re: Consider cleaning up backend code
Date Fri, 23 Apr 2010 00:49:25 GMT
+1 for removing. This interface does not bring us any value when we 
decide to move closer to hadoop. Writing a backend is almost writing 
half of Pig. I don't think this interface is attractive to most 
developers. Instead, I +1 for Milind's idea to make intermediate 
artifacts available, or provide some hook for users to peek/morph the 
plan at different stages. This opens the door for developers to 
visualize/debug/improve Pig without knowing every detail of Pig.

Daniel

Alan Gates wrote:
> A couple of years ago we had this concept that Pig as is should be  
> able to run on other backends (like say Dryad if it were open  
> source).  So we built this whole backend interface and (mostly) kept  
> Hadoop specific objects out of the front end.
>
> Recently we have modified that stance and said that this implementation  
> of Pig is Hadoop specific.  Pig Latin itself will still stay Hadoop  
> independent.  So the ability to have multiple backends is fine.  But  
> the ability to have non-Hadoop backends is not really interesting now.
>
> So I at least see the proposal here as getting rid of generic code  
> that tries to hide the fact that we are working on top of Hadoop  
> (things like DataStorage and ExecutionEngine).
>
> Alan.
>
> On Apr 22, 2010, at 4:14 PM, Arun C Murthy wrote:
>
>   
>> I read it as getting rid of concepts parallel to hadoop in
>> src/org/apache/pig/backend/hadoop/datastorage.
>>
>> Is that true?
>>
>> thanks,
>> Arun
>>
>> On Apr 22, 2010, at 1:34 PM, Dmitriy Ryaboy wrote:
>>
>>     
>>> I kind of dig the concept of being able to plug in a different backend,
>>> though I definitely think we should get rid of the dead localmode code. Can
>>> you give an example of how this will simplify the codebase? Is it more than
>>> just GenericClass foo = new SpecificClass(), and the associated extra files?
>>>
>>> -D
>>>
>>> On Thu, Apr 22, 2010 at 1:25 PM, Arun C Murthy <acm@yahoo-inc.com>  
>>> wrote:
>>>
>>>       
>>>> +1
>>>>
>>>> Arun
>>>>
>>>>
>>>> On Apr 22, 2010, at 11:35 AM, Richard Ding wrote:
>>>>
>>>>> Pig has an abstraction layer (interfaces and abstract classes) to
>>>>> support multiple execution engines. After PIG-1053, Hadoop is the only
>>>>> execution engine supported by Pig. I wonder if we should remove this
>>>>> layer of code, and make Hadoop THE execution engine for Pig. This will
>>>>> simplify the backend code a lot.
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Richard
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>
>   

