hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-30) Get rid of DataBag and always use BigDataBag
Date Wed, 12 Dec 2007 18:28:43 GMT

    [ https://issues.apache.org/jira/browse/PIG-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551056 ]

Olga Natkovich commented on PIG-30:
-----------------------------------

A couple of other issues I observed with BigDataBag:

- Should check memory availability periodically, not on every add
- Try to buffer in memory first. Currently we always write to disk after the first spill
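
As an illustration only, the two points above could be sketched roughly as follows. This is not Pig's BigDataBag code: the class PeriodicSpillBag, its fields, the check interval, and the heapNearlyFull() probe with its 10% threshold are all hypothetical names and values chosen for the example. The bag probes memory once every checkInterval adds instead of on every add, and after a spill it goes back to buffering new tuples in memory rather than writing each one to disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not the actual BigDataBag implementation.
class PeriodicSpillBag {
    private final int checkInterval;          // number of adds between memory checks
    private final BooleanSupplier lowMemory;  // returns true when memory is tight
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> spillFiles = new ArrayList<>();
    private int addsSinceCheck = 0;
    private int memoryChecks = 0;

    PeriodicSpillBag(int checkInterval, BooleanSupplier lowMemory) {
        this.checkInterval = checkInterval;
        this.lowMemory = lowMemory;
    }

    // One possible probe (assumed threshold): under 10% of the max heap is left.
    static boolean heapNearlyFull() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used < rt.maxMemory() / 10;
    }

    void add(String tuple) {
        buffer.add(tuple);  // always buffer in memory first, even after a spill
        if (++addsSinceCheck >= checkInterval) {
            addsSinceCheck = 0;
            memoryChecks++;  // the probe runs here, not on every add
            if (lowMemory.getAsBoolean()) {
                spill();
            }
        }
    }

    // Write the buffered tuples to a temp file, one per line, and free the buffer.
    private void spill() {
        try {
            Path file = Files.createTempFile("bag-spill", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, buffer);
            spillFiles.add(file);
            buffer.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int memoryChecks() { return memoryChecks; }
    int inMemorySize() { return buffer.size(); }
    int spillCount()   { return spillFiles.size(); }
}
```

A caller might construct it as new PeriodicSpillBag(1000, PeriodicSpillBag::heapNearlyFull). Passing the probe in as a parameter is a choice made for this sketch so the spill behavior can be exercised without actually exhausting the heap.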


> Get rid of DataBag and always use BigDataBag
> --------------------------------------------
>
>                 Key: PIG-30
>                 URL: https://issues.apache.org/jira/browse/PIG-30
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>            Reporter: Benjamin Reed
>            Assignee: Alan Gates
>
> We should never use DataBag directly; instead, we should always use BigDataBag. I think
> we already do this. The problem is that the logic in BigDataBag is hard to follow and it is
> made more complicated because it subclasses DataBag. We should merge these two classes together.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

