accumulo-dev mailing list archives

From "Eric Newton (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-578) consider using hdfs for the walog
Date Mon, 21 May 2012 11:30:40 GMT


Eric Newton commented on ACCUMULO-578:

The loggers implemented another feature I forgot about: atomically determining which
logs were not open, and therefore available for collection.  We need a new way to determine

h3. One approach:
# write references to the log into the !METADATA table, along with a tserver id
# open the file and begin using it
# when the tablet server closes the log, it removes the tserver id
# gc doesn't collect files with good tserver ids or METADATA table references
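
The four steps above can be sketched as a small in-memory model. This is only an illustration, not Accumulo code: the {{MetadataModel}} and {{LogRecord}} classes and their method names are hypothetical, a plain dict stands in for the !METADATA table, and a "good" tserver id is assumed to mean the id of a tablet server that is currently live.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class LogRecord:
    # Id of the tablet server that has the log open; None once it is closed.
    tserver_id: Optional[str]
    # Tablets that still reference this log (hypothetical representation).
    tablet_refs: Set[str] = field(default_factory=set)


class MetadataModel:
    """In-memory stand-in for the log entries in the !METADATA table."""

    def __init__(self) -> None:
        self.logs: Dict[str, LogRecord] = {}
        # Assumption: some external mechanism tells us which tservers are live.
        self.live_tservers: Set[str] = set()

    def register_log(self, path: str, tserver_id: str) -> None:
        # Step 1: record the log and its owner before the file is used.
        self.logs[path] = LogRecord(tserver_id)

    def add_reference(self, path: str, tablet: str) -> None:
        # Step 2 (in use): a tablet now depends on this log.
        self.logs[path].tablet_refs.add(tablet)

    def remove_reference(self, path: str, tablet: str) -> None:
        self.logs[path].tablet_refs.discard(tablet)

    def close_log(self, path: str) -> None:
        # Step 3: closing the log removes the tserver id.
        self.logs[path].tserver_id = None

    def collectable(self) -> List[str]:
        # Step 4: skip logs held by a live tserver or still referenced.
        return sorted(
            path
            for path, rec in self.logs.items()
            if rec.tserver_id not in self.live_tservers and not rec.tablet_refs
        )
```

In this model a log becomes collectable only after it has been closed (or its owner is no longer live) and every tablet reference has been removed, which is the condition the gc would check.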

h3. Bonus points:
# move the majority of log file collection to the tablet servers: they know when their tablets have lost references to the log
# master/tservers can sort the logs during recovery, and delete the unsorted copy
# tablet servers can gc the recovery log, after a check of the METADATA table
#* Need to avoid problems like ACCUMULO-598
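
A rough sketch of the first and third bonus points, where a tablet server tracks its own log references and collects a log only after checking the METADATA table. Everything here is an assumption made for illustration: the {{LogTracker}} class, its method names, the {{still_referenced}} callback standing in for the METADATA check, and the idea that a tablet drops its references when it is flushed.

```python
from typing import Callable, Dict, List, Set


class LogTracker:
    """Hypothetical per-tserver bookkeeping for write-ahead logs."""

    def __init__(self, still_referenced: Callable[[str], bool]) -> None:
        # Callback standing in for "check the METADATA table for this log".
        self._still_referenced = still_referenced
        self._refs: Dict[str, Set[str]] = {}  # log path -> tablets using it
        self._open: Set[str] = set()          # logs this tserver has open

    def open_log(self, path: str) -> None:
        self._open.add(path)
        self._refs.setdefault(path, set())

    def record_write(self, path: str, tablet: str) -> None:
        # A tablet that wrote to the log holds a reference to it.
        self._refs[path].add(tablet)

    def close_log(self, path: str) -> None:
        self._open.discard(path)

    def tablet_flushed(self, tablet: str) -> List[str]:
        # Assumption: a flushed tablet no longer needs any of its logs.
        for tablets in self._refs.values():
            tablets.discard(tablet)
        return self.collect()

    def collect(self) -> List[str]:
        # A log is collectable when it is closed, no local tablet uses it,
        # and the METADATA check finds no remaining reference.
        deletable = sorted(
            path
            for path, tablets in self._refs.items()
            if path not in self._open
            and not tablets
            and not self._still_referenced(path)
        )
        for path in deletable:
            del self._refs[path]
        return deletable
```

The METADATA check is kept as the last condition so that local bookkeeping alone never causes a delete, which is the kind of safeguard the sub-point about ACCUMULO-598 seems to call for.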
> consider using hdfs for the walog
> ---------------------------------
>                 Key: ACCUMULO-578
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: logger, tserver
>    Affects Versions: 1.5.0-SNAPSHOT
>            Reporter: Eric Newton
>            Assignee: Eric Newton
> Using HDFS for walogs would fix:
>  * ACCUMULO-84: any node can read the replicated files
>  * ACCUMULO-558: wouldn't need to monitor loggers
>  * ACCUMULO-544: log references wouldn't include hostnames
>  * ACCUMULO-423: wouldn't need to monitor loggers
>  * ACCUMULO-258: hdfs has load balancing already
> To implement it, we would need the ability to distribute log sorts.
> Continuing to use loggers helps us avoid:
>  * hdfs pipeline strategy
>  * we don't have fine-grained insight when a single node makes dfs slow
>  * additional namenode pressure
>  * flexibility: for example, we can add fadvise() calls to the logger before HDFS supports
