accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)"
Subject [jira] [Resolved] (ACCUMULO-418) Make RFiles splittable
Date Mon, 22 Apr 2019 19:56:00 GMT


Christopher Tubbs resolved ACCUMULO-418.
    Resolution: Won't Fix

This issue is quite old. Numerous changes made since it was created may help mitigate it,
and external solutions are also possible:

1. There is now an RFile API.
2. A user could create their own InputFormat with InputSplit types that accept a file name
and a range.
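
As a rough illustration of option 2, the sketch below shows a split descriptor that carries a
file name and a row range. The class name RFileRangeSplit and its fields are hypothetical and
are not part of Accumulo or Hadoop. It mirrors the write/readFields shape of Hadoop's Writable
contract but deliberately implements no Hadoop interface, so it compiles with the JDK alone.
A null row is assumed here to mean "unbounded".

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Objects;

/** Hypothetical split descriptor: one RFile path plus an optional row range. */
final class RFileRangeSplit {
    private String file = "";
    private String startRow; // null is assumed to mean "from the beginning of the file"
    private String endRow;   // null is assumed to mean "to the end of the file"

    /** No-arg constructor so a framework could create an instance and then call readFields. */
    RFileRangeSplit() {}

    RFileRangeSplit(String file, String startRow, String endRow) {
        this.file = Objects.requireNonNull(file, "file");
        this.startRow = startRow;
        this.endRow = endRow;
    }

    String getFile() { return file; }
    String getStartRow() { return startRow; }
    String getEndRow() { return endRow; }

    /** Serializes the split; same method shape as Hadoop's Writable.write. */
    void write(DataOutput out) throws IOException {
        out.writeUTF(file);
        writeNullable(out, startRow);
        writeNullable(out, endRow);
    }

    /** Restores the split; same method shape as Hadoop's Writable.readFields. */
    void readFields(DataInput in) throws IOException {
        file = in.readUTF();
        startRow = readNullable(in);
        endRow = readNullable(in);
    }

    // A leading boolean records whether a value follows, so null survives the round trip.
    private static void writeNullable(DataOutput out, String s) throws IOException {
        out.writeBoolean(s != null);
        if (s != null) {
            out.writeUTF(s);
        }
    }

    private static String readNullable(DataInput in) throws IOException {
        return in.readBoolean() ? in.readUTF() : null;
    }
}
```

In an actual MapReduce job, a class like this would extend
org.apache.hadoop.mapreduce.InputSplit and implement Writable, and a matching RecordReader
could open the named file through the RFile API restricted to the split's range. Those
pieces are left out here because they depend on the Hadoop and Accumulo libraries.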

In any case, if this is still an issue that somebody wishes to pursue, please open an issue
on GitHub, where we now track issues:

> Make RFiles splittable
> ----------------------
>                 Key: ACCUMULO-418
>                 URL:
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: master, tserver
>         Environment: All
>            Reporter: Ivan Bella
>            Priority: Major
>              Labels: RFile, hadoop, mapreduce
>   Original Estimate: 72h
>  Remaining Estimate: 72h
> There are times when iterating over RFiles is useful in map-reduce jobs.  I know that
> RFiles logically can be split on the block boundary, however there is no easy way to do this
> currently as there is no RFile RecordReader or InputFormat provided.

This message was sent by Atlassian JIRA
