accumulo-dev mailing list archives

From "John Vines (Resolved) (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (ACCUMULO-146) Accumulo Output Format needs better fix for empty files (see Accumulo-55)
Date Thu, 15 Mar 2012 10:09:37 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Vines resolved ACCUMULO-146.
---------------------------------

    Resolution: Fixed
    
> Accumulo Output Format needs better fix for empty files (see Accumulo-55)
> -------------------------------------------------------------------------
>
>                 Key: ACCUMULO-146
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-146
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: John Vines
>            Assignee: John Vines
>            Priority: Minor
>             Fix For: 1.5.0
>
>
> In conjunction with Accumulo-52, large numbers of empty files can cause problems. In short,
when a reducer receives no records because of the partitioner used, its output file is
still created. We do not want empty files lingering around, and we especially do not want them
bulk imported. The fix should be as simple as either not creating the file until a write to it
is attempted (more complex) or deleting the file at close time if no records were
written (simpler, but with more overhead from creating and then deleting the file).
> Due to the complexity of the patch for the first approach, I do not think it should be applied
before the 1.4 version. The fix should simply delete the file after closing it if nothing was written to it.
> EDIT: As of 1.4, we delete empty files on close() in the RecordWriter. I would like
to implement a more robust version that does not create a file until the first write. I will
do this for version 1.5 so as not to worry about breaking things.
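
The "do not create a file until the first write" approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: LazyFileWriter is a hypothetical name, it writes plain text lines to a local path via java.nio, and it is not the actual Accumulo RecordWriter or the patch applied for this issue.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of lazy file creation; not Accumulo's RecordWriter. */
final class LazyFileWriter implements Closeable {
    private final Path path;
    private BufferedWriter out; // stays null until the first record arrives

    LazyFileWriter(Path path) {
        this.path = path;
    }

    void write(String record) throws IOException {
        if (out == null) {
            // First write: only now is the file actually created on disk.
            out = Files.newBufferedWriter(path, StandardCharsets.UTF_8);
        }
        out.write(record);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        if (out != null) {
            out.close();
        }
        // If no record was written, no file was ever created,
        // so there is nothing to delete at close time.
    }
}
```

Compared with deleting an empty file in close(), this avoids the create-then-delete round trip for writers that never receive a record, at the cost of a null check on every write.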

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
