accumulo-notifications mailing list archives

From "Matt Peterson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4078) Exclude special volumes
Date Sat, 19 Dec 2015 00:33:46 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065050#comment-15065050 ]

Matt Peterson commented on ACCUMULO-4078:
-----------------------------------------

Josh,

Thanks for reviewing this ticket.

{quote}
What's your intended effect from such a change? Remember that WALs are per-server, not per-table.
{quote}
I'd like to exclude the WAL from the special volume for all tables.  Let's assume that the
special instance is appropriately sized for only that one table.  The PerTableVolumeChooser
will default to the random chooser for WAL files.  The random chooser will include all
volumes, including the one that I want to exclude.
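
To illustrate the behavior I'm after for calls that carry no table id (such as WAL placement), here is a minimal sketch. The class name, the volume URIs, and the method signature are all hypothetical; this is not Accumulo's VolumeChooser interface, only the filtering idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical chooser: picks randomly, but never returns the excluded volume.
class ExcludingRandomChooser {
    private final String excluded;
    private final Random random;

    ExcludingRandomChooser(String excluded, Random random) {
        this.excluded = excluded;
        this.random = random;
    }

    // Choose uniformly among the options that are not the excluded volume.
    String choose(List<String> options) {
        List<String> allowed = new ArrayList<>();
        for (String volume : options) {
            if (!volume.equals(excluded)) {
                allowed.add(volume);
            }
        }
        if (allowed.isEmpty()) {
            throw new IllegalStateException("no volume left after exclusion");
        }
        return allowed.get(random.nextInt(allowed.size()));
    }

    public static void main(String[] args) {
        // Made-up volume URIs for illustration only.
        List<String> volumes = Arrays.asList(
            "hdfs://nn1/accumulo", "hdfs://nn2/accumulo", "hdfs://special/accumulo");
        ExcludingRandomChooser walChooser =
            new ExcludingRandomChooser("hdfs://special/accumulo", new Random());
        for (int i = 0; i < 5; i++) {
            System.out.println(walChooser.choose(volumes));
        }
    }
}
```

A plain random chooser over the same list would return the special volume roughly one time in three, which is the leak described above.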

{quote}
I don't understand what you mean by "use the sticky dir". Why would we not want to use the
volume chooser implementation when choosing a temp directory? This seems like an inconsistency
since we do this for all other file-based operations.
{quote}
Sorry, I realize now that the language was unclear.  By "sticky dir" I was referring to the
srv:dir entry for that tablet.  The thought was that there is no need to choose a volume for
the temp dir, in the same way that Accumulo doesn't choose a volume each time it writes a new
file for a tablet.  It chooses a volume when first creating the tablet and then continues
to send its files there.
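
As a rough sketch of that idea, the temp directory could be derived from the tablet's existing directory rather than from a fresh chooser call. This assumes the srv:dir value is available as a full URI that already includes the volume; the helper name, the example URI, and the temp directory name are hypothetical.

```java
import java.net.URI;

class StickyTempDir {
    // Hypothetical helper: build a temp directory beneath the tablet's
    // already-chosen directory, so no volume chooser is consulted.
    static URI tempDirFor(URI tabletDir, String tmpName) {
        String base = tabletDir.toString();
        if (!base.endsWith("/")) {
            base += "/";
        }
        return URI.create(base + tmpName + "/");
    }

    public static void main(String[] args) {
        // Made-up tablet directory for illustration only.
        URI tabletDir = URI.create("hdfs://nn1:8020/accumulo/tables/2/t-0001");
        System.out.println(tempDirFor(tabletDir, "tmp_idx"));
    }
}
```

The temp files then stay on whatever volume the tablet already uses, so a special volume sees temp data only for its own table.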

{quote}
How does the PerTableVolumeChooser fall short for what you'd like to do? I could see a static
VolumeChooser implementation configured for the table in question that always chooses a specific
volume. Every other table would use the default VolumeChooser (an implementation which ignores
that special volume?). Maybe PreferredVolumeChooser already gives you the ability to limit
the volumes you want to use (in an inclusive way, instead of exclusive).
By "failsafe" you mean a default chooser? I can see the value in mimicking how we do site-wide
and per-table configuration values. When a per-table implementation doesn't exist (or is not
relevant), it defers to the site-wide chooser. The approach we have now makes the provided
implementations a bit more brittle since you inherit some default functionality (which I think
is what you meant).
{quote}
Yes, by "failsafe" I meant a default chooser, to be used when none is configured.  Additionally,
it would be used when a chooser is configured but cannot be used due to an error in its instantiation
or configuration.

There are cases with both the PreferredVolumeChooser and the PerTableVolumeChooser in which
each will default to the RandomVolumeChooser, even if a site-wide default configuration is
used.

If the PerTableVolumeChooser is used as the general volume chooser and PreferredVolumeChooser
is used as the site-wide table volume chooser, then this use case is nearly satisfied.  The
special table would be configured to include only the special volume.  Most tables would inherit
the site-wide configuration, which excludes the special volume.  With this configuration,
which I believe is the one you've suggested, there are a few cases when the special volume
will be used: 
1) .choose is called without a table id (e.g. for WAL files), so the random chooser may place
files on the special volume (PreferredVolumeChooser and PerTableVolumeChooser behavior)
2) a table overrides the site-wide configuration but that override is misconfigured and fails,
so its files are placed by the RandomVolumeChooser, which may pick the special volume
(PerTableVolumeChooser behavior)

PreferredVolumeChooser and PerTableVolumeChooser could use a configurable default instead of
always falling back to the RandomVolumeChooser.  So there would be a general.volume.chooser
and a general.volume.chooser.fallback.
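
A minimal sketch of how the two properties could be resolved, under stated assumptions: the Chooser interface, the FirstVolumeChooser class, and the resolve method are hypothetical stand-ins rather than Accumulo code, and general.volume.chooser.fallback is the property proposed here, not an existing one.

```java
import java.util.HashMap;
import java.util.Map;

class ChooserResolver {
    // Hypothetical stand-in for a volume chooser interface.
    interface Chooser {
        String choose(String[] options);
    }

    // Trivial chooser used here as the configured fallback.
    static class FirstVolumeChooser implements Chooser {
        public String choose(String[] options) {
            return options[0];
        }
    }

    // Try the primary property first, then the proposed fallback property.
    // A missing or unloadable class moves on to the next key.
    static Chooser resolve(Map<String, String> conf) {
        String[] keys = {"general.volume.chooser", "general.volume.chooser.fallback"};
        for (String key : keys) {
            String className = conf.get(key);
            if (className == null) {
                continue;
            }
            try {
                return (Chooser) Class.forName(className).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException | ClassCastException e) {
                System.err.println("Could not load " + key + "=" + className + ": " + e);
            }
        }
        throw new IllegalStateException("no usable volume chooser configured");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Deliberately broken primary setting (class does not exist).
        conf.put("general.volume.chooser", "com.example.MissingChooser");
        conf.put("general.volume.chooser.fallback", FirstVolumeChooser.class.getName());

        Chooser chooser = resolve(conf);
        System.out.println(chooser.choose(new String[] {"hdfs://nn1/accumulo", "hdfs://nn2/accumulo"}));
    }
}
```

With this shape, a misconfigured primary chooser falls through to an administrator-chosen fallback, which can itself exclude the special volume, instead of an unconditional RandomVolumeChooser.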

> Exclude special volumes
> -----------------------
>
>                 Key: ACCUMULO-4078
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4078
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.7.0
>            Reporter: Matt Peterson
>            Priority: Minor
>              Labels: newbie
>             Fix For: 1.8.0
>
>
> A few improvements to the VolumeChooser are desired for a use case that seems general
> enough to warrant an update to Accumulo.
> Use Case:
> A new volume is added, with limited capacity, to be dedicated for a specific table.
> All other tables or files should be excluded from this new volume.
> Suggested Improvements:
> 1. Update the signature for VolumeManager.choose to take a VolumeChooserEnvironment instead
> of Optional<String>.  This will allow future parameters for volume selection without
> repeatedly changing the VolumeManager interface.
> 2. It's not currently possible to specify preferred volumes for the write-ahead logs.
> 3. In several places including PreferredVolumeChooser, PerTableVolumeChooser and VolumeManagerImpl,
> the failsafe chooser is the RandomVolumeChooser which will include the instance volume that
> needs to be excluded.  It would be useful to have a configurable failsafe in this situation.
> 4. The volume chooser is called in FileUtils for temp directory creation but it could
> instead use the sticky dir to create the temp directory, not needing the volume chooser at
> all.
> The above suggestions could become sub-tickets.  Note that the improvements listed have
> been implemented for my particular instance and I hope to submit them as a patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
