accumulo-notifications mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4195) Generalized configuration object for Accumulo rfile interaction
Date Mon, 25 Apr 2016 16:37:13 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256556#comment-15256556
] 

ASF GitHub Bot commented on ACCUMULO-4195:
------------------------------------------

Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/95#discussion_r60945053
  
    --- Diff: core/src/main/java/org/apache/accumulo/core/file/FileOperations.java ---
    @@ -48,38 +48,295 @@ public static FileOperations getInstance() {
         return new DispatchingFileFactory();
       }
     
    +  //
    +  // Abstract methods (to be implemented by subclasses)
    +  //
    +
    +  protected abstract long getFileSize(GetFileSizeOperation options) throws IOException;
    +
    +  protected abstract FileSKVWriter openWriter(OpenWriterOperation options) throws IOException;
    +
    +  protected abstract FileSKVIterator openIndex(OpenIndexOperation options) throws IOException;
    +
    +  protected abstract FileSKVIterator openScanReader(OpenScanReaderOperation options) throws IOException;
    +
    +  protected abstract FileSKVIterator openReader(OpenReaderOperation options) throws IOException;
    +
    +  //
    +  // File operations
    +  //
    +
       /**
    -   * Open a reader that will not be seeked giving an initial seek location. This is useful for file operations that only need to scan data within a range and do
    -   * not need to seek. Therefore file metadata such as indexes does not need to be kept in memory while the file is scanned. Also seek optimizations like bloom
    -   * filters do not need to be loaded.
    +   * Construct an operation object allowing one to query the size of a file. <br>
    +   * Syntax:
        *
    +   * <pre>
    +   * long size = fileOperations.getFileSize().ofFile(filename, fileSystem, fsConfiguration).withTableConfiguration(tableConf).execute();
    +   * </pre>
        */
    +  public GetFileSizeOperation getFileSize() {
    +    return new GetFileSizeOperation();
    +  }
     
    -  public abstract FileSKVIterator openReader(String file, Range range, Set<ByteSequence> columnFamilies, boolean inclusive, FileSystem fs, Configuration conf,
    -      RateLimiter readLimiter, AccumuloConfiguration tableConf) throws IOException;
    +  /**
    +   * Construct an operation object allowing one to create a writer for a file. <br>
    +   * Syntax:
    +   *
    +   * <pre>
    +   * FileSKVWriter writer = fileOperations.openWriter()
    +   *     .ofFile(...)
    +   *     .withTableConfiguration(...)
    +   *     .withRateLimiter(...) // optional
    +   *     .withCompression(...) // optional
    +   *     .execute();
    +   * </pre>
    +   */
    +  public OpenWriterOperation openWriter() {
    +    return new OpenWriterOperation();
    +  }
    +
    +  /**
    +   * Construct an operation object allowing one to create an index iterator for a file. <br>
    +   * Syntax:
    +   *
    +   * <pre>
    +   * FileSKVIterator iterator = fileOperations.openIndex()
    +   *     .ofFile(...)
    +   *     .withTableConfiguration(...)
    +   *     .withRateLimiter(...) // optional
    +   *     .withBlockCache(...) // optional
    +   *     .execute();
    +   * </pre>
    +   */
    +  public OpenIndexOperation openIndex() {
    +    return new OpenIndexOperation();
    +  }
     
    -  public abstract FileSKVIterator openReader(String file, Range range, Set<ByteSequence> columnFamilies, boolean inclusive, FileSystem fs, Configuration conf,
    -      RateLimiter readLimiter, AccumuloConfiguration tableConf, BlockCache dataCache, BlockCache indexCache) throws IOException;
    +  /**
    +   * Construct an operation object allowing one to create a "scan" reader for a file. Scan readers do not have any optimizations for seeking beyond their
    +   * initial position. This is useful for file operations that only need to scan data within a range and do not need to seek. Therefore file metadata such as
    +   * indexes does not need to be kept in memory while the file is scanned. Also seek optimizations like bloom filters do not need to be loaded. <br>
    +   * Syntax:
    +   *
    +   * <pre>
    +   * FileSKVIterator scanner = fileOperations.openScanReader()
    +   *     .ofFile(...)
    +   *     .overRange(...)
    +   *     .withTableConfiguration(...)
    +   *     .withRateLimiter(...) // optional
    +   *     .withBlockCache(...) // optional
    +   *     .execute();
    +   * </pre>
    +   */
    +  public OpenScanReaderOperation openScanReader() {
    +    return new OpenScanReaderOperation();
    +  }
     
       /**
    -   * Open a reader that fully support seeking and also enable any optimizations related to seeking, like bloom filters.
    +   * Construct an operation object allowing one to create a reader for a file. A reader constructed in this manner fully supports seeking, and also enables any
    +   * optimizations related to seeking (e.g. Bloom filters). <br>
    +   * Syntax:
        *
    +   * <pre>
    +   * FileSKVIterator scanner = fileOperations.openReader()
    +   *     .ofFile(...)
    +   *     .withTableConfiguration(...)
    +   *     .withRateLimiter(...) // optional
    +   *     .withBlockCache(...) // optional
    +   *     .seekToBeginning(...) // optional
    +   *     .execute();
    +   * </pre>
    +   */
    +  public OpenReaderOperation openReader() {
    +    return new OpenReaderOperation();
    +  }
    +
    +  //
    +  // Operation objects.
    +  //
    +
    +  /**
    +   * Options common to all FileOperations.
    +   */
    +  protected static class FileAccessOperation<SubclassType extends FileAccessOperation<SubclassType>> {
    +    private AccumuloConfiguration tableConfiguration;
    +
    +    private String filename;
    +    private FileSystem fs;
    +    private Configuration fsConf;
    +
    +    /** Specify the table configuration defining access to this file. */
    +    @SuppressWarnings("unchecked")
    +    public SubclassType withTableConfiguration(AccumuloConfiguration tableConfiguration) {
    +      this.tableConfiguration = tableConfiguration;
    +      return (SubclassType) this;
    +    }
    +
    +    /** Specify the file this operation should apply to. */
    +    @SuppressWarnings("unchecked")
    +    public SubclassType ofFile(String filename, FileSystem fs, Configuration fsConf) {
    +      this.filename = filename;
    +      this.fs = fs;
    +      this.fsConf = fsConf;
    +      return (SubclassType) this;
    +    }
    +
    +    public String getFilename() {
    +      return filename;
    +    }
    +
    +    public FileSystem getFileSystem() {
    +      return fs;
    +    }
    +
    +    public Configuration getConfiguration() {
    +      return fsConf;
    +    }
    +
    +    public AccumuloConfiguration getTableConfiguration() {
    --- End diff --
    
    I'm wondering what the (heavy) use of generics here is buying us compared to the
writeup that Keith shared. I can see what you were thinking, but the changes
might be simpler if we just tied each option to the subsequent option (e.g. NeedsFile
to NeedsTableConfiguration). I'm not super-stuck on this; it just feels very... verbose.
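
    To illustrate the alternative mentioned above, here is a minimal sketch of a builder in which each required option returns an interface exposing only the next option, so no self-referential generic parameter is needed. Only the names NeedsFile and NeedsTableConfiguration come from this comment; everything else (StagedFileOperation, ReadyToExecute, withRateLimit, the String stand-ins for the file and table configuration types, and the string returned by execute()) is hypothetical and chosen to keep the example self-contained. It is not the code from Keith's writeup or from the pull request.

```java
import java.util.Objects;

// Hypothetical sketch of a "staged" builder. Plain Strings stand in for the
// real file, file system, and table configuration types.
final class StagedFileOperation {

  /** Stage 1: the only available call is naming the file. */
  interface NeedsFile {
    NeedsTableConfiguration ofFile(String filename);
  }

  /** Stage 2: the table configuration must be supplied next. */
  interface NeedsTableConfiguration {
    ReadyToExecute withTableConfiguration(String tableConf);
  }

  /** Final stage: optional settings may be chained before execute(). */
  interface ReadyToExecute {
    ReadyToExecute withRateLimit(long bytesPerSecond);

    String execute();
  }

  /** One private class implements every stage; callers only see the interfaces. */
  private static final class Builder implements NeedsFile, NeedsTableConfiguration, ReadyToExecute {
    private String filename;
    private String tableConf;
    private long bytesPerSecond = 0L; // 0 stands for "unlimited" in this sketch

    @Override
    public NeedsTableConfiguration ofFile(String filename) {
      this.filename = Objects.requireNonNull(filename);
      return this;
    }

    @Override
    public ReadyToExecute withTableConfiguration(String tableConf) {
      this.tableConf = Objects.requireNonNull(tableConf);
      return this;
    }

    @Override
    public ReadyToExecute withRateLimit(long bytesPerSecond) {
      this.bytesPerSecond = bytesPerSecond;
      return this;
    }

    @Override
    public String execute() {
      // A real implementation would open the file; here we only echo the settings.
      return filename + "|" + tableConf + "|" + bytesPerSecond;
    }
  }

  /** Entry point: hands back the first stage only. */
  static NeedsFile openReader() {
    return new Builder();
  }

  private StagedFileOperation() {}
}
```

    With this shape, a call such as StagedFileOperation.openReader().withTableConfiguration(...) does not compile, because NeedsFile exposes only ofFile. The trade-off is that required options must be supplied in a fixed order, and each operation with a different set of required options needs its own chain of stage interfaces.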


> Generalized configuration object for Accumulo rfile interaction
> ---------------------------------------------------------------
>
>                 Key: ACCUMULO-4195
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4195
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Josh Elser
>            Assignee: Shawn Walker
>             Fix For: 1.8.0
>
>
> Taken from https://github.com/apache/accumulo/pull/90/files#r59489073
> On [~ShawnWalker]'s PR for ACCUMULO-4187 which adds rate-limiting on major compactions,
we noted that many of the changes were related to passing an extra argument (RateLimiter)
around through all of the code which is related to file interaction.
> It would be nice to move to a centralized configuration object instead of having to add
a new argument every time some new feature is added to the file-path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
