pig-dev mailing list archives

From "Pi Song (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-167) Experiment : A proper bag memory manager.
Date Mon, 24 Mar 2008 22:27:24 GMT

    [ https://issues.apache.org/jira/browse/PIG-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581699#action_12581699 ]

Pi Song commented on PIG-167:

For lock contention, you're right.

> Experiment : A proper bag memory manager.
> -----------------------------------------
>                 Key: PIG-167
>                 URL: https://issues.apache.org/jira/browse/PIG-167
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Pi Song
>         Attachments: diagram.gif, MemManager0.patch
> According to PIG-164, I think we still have room for improvement:-
> 1) Alan said
> {quote}
> "It rests on the assumption that data bags generally live about the same amount of time,
> thus there won't be a long lived databag at the head of the list blocking the cleaning of
> many stale references later in the list."
> {quote}
> By looking at a line of code in SpillableMemoryManager
> {noformat}
> Collections.sort(spillables, new Comparator<WeakReference<Spillable>>() {
> {noformat}
> - Alan's assumption may no longer hold once the memory manager has tried to spill,
> because the sort reorders the list.
> - I don't understand why the list has to be sorted so that spilling starts from the
> smallest bags. Most file systems are not good at handling small files (especially ext2/ext3).
> 2) We use a LinkedList to maintain the WeakReferences. A LinkedList normally consumes
> about twice as much memory as an array would (for the pointers). Would it be better to
> change the LinkedList to an array or an ArrayList?
> 3) In SpillableMemoryManager, handleNotification, which does an I/O-intensive job,
> shares the same lock with registerSpillable. This doesn't seem efficient.
> 4) Sometimes I noticed that a bag currently in use got spilled and read back over
> and over again. The memory manager should consider spilling bags that are not currently
> in use first.
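
To make points 1 to 3 of the description concrete, here is a rough sketch of one possible arrangement. It is not the code in MemManager0.patch or in SpillableMemoryManager. The class SpillSketch, the method spillUntil, and the nested Spillable interface are hypothetical stand-ins invented for illustration. The idea: keep the weak references in an ArrayList, hold the lock only long enough to prune stale references and snapshot the live bags, then do the I/O-heavy spilling outside the lock, largest bags first.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SpillSketch {
    /** Hypothetical stand-in for a spillable bag; not Pig's actual interface. */
    public interface Spillable {
        long getMemorySize();   // estimated bytes currently held in memory
        long spill();           // writes to disk, returns bytes freed
    }

    // ArrayList instead of LinkedList: one backing array, no per-node objects.
    private final List<WeakReference<Spillable>> spillables = new ArrayList<>();

    /** Cheap operation: only appends under the lock. */
    public void register(Spillable s) {
        synchronized (spillables) {
            spillables.add(new WeakReference<>(s));
        }
    }

    /**
     * Spills bags until at least bytesToFree bytes are released or no bags remain.
     * The lock is held only while pruning and snapshotting, never during I/O.
     */
    public long spillUntil(long bytesToFree) {
        List<Spillable> live = new ArrayList<>();
        synchronized (spillables) {
            Iterator<WeakReference<Spillable>> it = spillables.iterator();
            while (it.hasNext()) {
                Spillable s = it.next().get();
                if (s == null) {
                    it.remove();        // referent was garbage collected
                } else {
                    live.add(s);        // strong reference keeps it alive while we work
                }
            }
        }

        // Largest first, so fewer and bigger spill files are written.
        // Assumes sizes stay stable while sorting.
        live.sort((a, b) -> Long.compare(b.getMemorySize(), a.getMemorySize()));

        long freed = 0;
        for (Spillable s : live) {
            if (freed >= bytesToFree) {
                break;
            }
            freed += s.spill();         // slow I/O happens outside the lock
        }
        return freed;
    }
}
```

With this shape, a thread calling register never waits on a spill in progress. Point 4 (preferring bags that are not in use) would need some extra signal from the bags themselves, which this sketch does not attempt.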

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
