Date: Tue, 23 Feb 2016 06:12:19 +0000 (UTC)
From: "Aaron Fabbri (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-12666) Support Microsoft Azure Data Lake - as a file system in Hadoop

[ https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158357#comment-15158357 ]

Aaron Fabbri commented on HADOOP-12666:
---------------------------------------

Thank you for your contributions. This is a large patch with some dense spots, which makes it hard for folks to get time to review properly. In the future you should break up the work into multiple commits and associate patches with jira subtasks. This will make your life easier as well.

Summary of issues, this round:

1. Still some parts I haven't carefully reviewed due to size of patch.
2. FileStatusCacheManager seems to have local race conditions and zero intra-node coherency.
3. Seems like abuse of volatile / lack of locking in BatchByteArrayInputStream.
4. How do Hadoop folks feel about this hadoop-tools/hadoop-azure-datalake code declaring classes in the hadoop.hdfs.web package? I feel it needs cleanup.
5. Still need to put config params in core-default.xml and make names lower case.

There are a bunch of other comments / questions inline below. Search for "AF>"

{quote}
diff --git hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/FileStatusCacheManager.java hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/FileStatusCacheManager.java new file mode 100644 index 0000000..fd6a2ff --- /dev/null +++ hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/FileStatusCacheManager.java @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * ACID properties are maintained in overloaded api in @see + * PrivateAzureDataLakeFileSystem class. + */ +public final class FileStatusCacheManager { + private static final FileStatusCacheManager FILE_STATUS_CACHE_MANAGER = new + FileStatusCacheManager(); + private Map syncMap = null; + + /** + * Constructor. + */ + private FileStatusCacheManager() { AF> This class seems to have serious issues that need addressing: 1.
Local race conditions in caller PrivateAzureDataLakeFileSystem 2. No mechanism for cache invalidation across nodes in the cluster. + LinkedHashMap map = new + LinkedHashMap() { + + private static final int MAX_ENTRIES = 5000; + + @Override + protected boolean removeEldestEntry(Map.Entry eldest) { diff --git hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/FileStatusCacheObject.java hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/FileStatusCacheObject.java new file mode 100644 index 0000000..5316443 --- /dev/null +++ hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/FileStatusCacheObject.java @@ -0,0 +1,59 @@ diff --git hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateAzureDataLake.java hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateAzureDataLake.java new file mode 100644 index 0000000..a0ca4a9 --- /dev/null +++ hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateAzureDataLake.java @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + +/** + * Create ADL filesystem delegation with Swebhdfs scheme. Intent to use by + * AdlFileSystem only. + */ AF> Update comment? This uses "adl" scheme, right? +public class PrivateAzureDataLake extends DelegateToFileSystem { + public static final int DEFAULT_PORT = 443; AF> What is this class used for? I didn't see any uses. + + PrivateAzureDataLake(URI theUri, Configuration conf) + throws IOException, URISyntaxException { + super(theUri, createFileSystem(conf), conf, + PrivateAzureDataLakeFileSystem.SCHEME, false); diff --git hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateAzureDataLakeFileSystem.java hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateAzureDataLakeFileSystem.java new file mode 100644 index 0000000..db4a83c --- /dev/null +++ hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateAzureDataLakeFileSystem.java @@ -0,0 +1,1516 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * limitations under the License. + * + */ + +package org.apache.hadoop.hdfs.web; + AF> Care to comment why this is in the ..hdfs.web package instead of fs.adl? It lives in hadoop-tools/hadoop-azure-datalake in the source tree. +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.BlockLocation; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicInteger; + +/** + * Extended @see SWebHdfsFileSystem API. This class contains Azure data lake + * specific stability, Reliability and performance improvement. + *

+ * Motivation behind PrivateAzureDataLakeFileSystem AF> ? + */ +public class PrivateAzureDataLakeFileSystem extends SWebHdfsFileSystem { + + public static final String SCHEME = "adl"; + /** + * Process wide thread pool for data lake file system. + * Threads need to be daemon so that they dont prevent the process from + * exiting + */ + private static final ExecutorService EXECUTOR = Executors + .newCachedThreadPool(new ThreadFactory() { + public Thread newThread(Runnable r) { + Thread t = Executors.defaultThreadFactory().newThread(r); + t.setDaemon(true); + return t; + } + }); + private static String hostName = null; + private static AtomicInteger metricsSourceNameCounter = new AtomicInteger(); + // Feature configuration + // Publicly Exposed + + /** + * Need to override default getHomeDirectory implementation due to + * HDFS-8542 causing MR jobs to fail in initial AF> Due to the bug or due to the fix? The fix was merged in 2.8.0, right? + * phase. Constructing home directory locally is fine as long as hadoop + * local user name and ADL user name relation + * ship is not agreed upon. AF> I'm not understanding this last sentence, can you explain? + * + * @return Hadoop user home directory. + */ + @Override + public final Path getHomeDirectory() { + try { + return makeQualified(new Path( + "/user/" + UserGroupInformation.getCurrentUser().getShortUserName())); + } catch (IOException e) { + } + + return new Path("/user/" + userName); + } + + */ + @Override + public final boolean setReplication(final Path p, final short replication) + throws IOException { + return true; + } + + /** + * Invoked parent setTimes default implementation only. + * + * Removes cached FileStatus entry to maintain latest information on the + * FileStatus instance + * + * @param p File/Folder path + * @param mtime Modification time + * @param atime Access time + * @throws IOException when system error, internal server error or user error + */ + @Override + public final void setTimes(final Path p, final long mtime, final long atime) + throws IOException { + if (featureCacheFileStatus) { + String filePath = p.isAbsoluteAndSchemeAuthorityNull() ? + getUri() + p.toString() : + p.toString(); + fileStatusCacheManager.remove(new Path(filePath)); + } + super.setTimes(p, mtime, atime); + } + + /** + * Invokes parent setPermission default implementation only. + * + * Removes cached FileStatus entry to maintain latest information on the + * FileStatus instance + * + * @param p File/Folder path + * @param permission Instance FsPermission. Octal values + * @throws IOException when system error, internal server error or user error + */ + @Override + public final void setPermission(final Path p, final FsPermission permission) + throws IOException { + if (featureCacheFileStatus) { + String filePath = p.isAbsoluteAndSchemeAuthorityNull() ? + getUri() + p.toString() : + p.toString(); + fileStatusCacheManager.remove(new Path(filePath)); + } + super.setPermission(p, permission); + } + + /** + * Avoid call to Azure data lake backend system. Look in the local cache if + * FileStatus from the previous call has + * already been cached. + * + * Cache lookup is default enable. and can be set using configuration. 
+ * + * @param f File/Folder path + * @return FileStatus instance containing metadata information of f + * @throws IOException For any system error + */ + @Override + public FileStatus getFileStatus(Path f) throws IOException { + statistics.incrementReadOps(1); + FileStatus status = null; + if (featureCacheFileStatus) { + status = fileStatusCacheManager.get(makeQualified(f)); + } + + if (status == null) { + status = super.getFileStatus(f); + } else { + ADLLogger.log("Cached Instance Found : " + status.getPath()); + } + + if (featureCacheFileStatus) { + if (fileStatusCacheManager.get(makeQualified(f)) == null) { + fileStatusCacheManager.put(status, featureCacheFileStatusDuration); + } + } AF> Is this a race condition? thread 1> getFileStatus(), cache miss super.getStatus -> s1 cache.get() -> null thread 2> delete() cache.clear() thread 1> cache.put(s1) Maybe provide an atomic putIfAbsent() for FileStatusCacheManager. You can synchronize on the underlying map object I believe (see Collections.synchronizedMap()). + + if (overrideOwner) { + FileStatus proxiedStatus = new FileStatus(status.getLen(), + status.isDirectory(), status.getReplication(), status.getBlockSize(), + status.getModificationTime(), status.getAccessTime(), + status.getPermission(), userName, "hdfs", status.getPath()); + return proxiedStatus; + } else { + return status; + } + } + + /** + * Invokes parent delete() default implementation only. + * + * Removes cached FileStatus entry to maintain latest information on the + * FileStatus instance + * + * @param f File/Folder path + * @param recursive true if the contents within folder needs to be removed + * as well + * @return true if the delete operation is successful other false. + * @throws IOException For any system exception + */ + @Override + public boolean delete(Path f, boolean recursive) throws IOException { + if (featureCacheFileStatus) { + FileStatus fs = fileStatusCacheManager.get(makeQualified(f)); + if (fs != null && fs.isFile()) { + fileStatusCacheManager.remove(makeQualified(f)); + } else { + fileStatusCacheManager.clear(); AF> Seems like there is a less-likely race condition here. (f is replaced by a directory after checking fs.isFile()) + } + } + instrumentation.fileDeleted(); + return super.delete(f, recursive); + } + + /** + * Invokes parent rename default implementation only. + * + * Removes cached FileStatus entry to maintain latest information on the + * FileStatus instance * + * + * @param src Source path + * @param dst Destination path + * @return True if the rename operation is successful otherwise false + * @throws IOException For any system error. + */ + @Override + public boolean rename(final Path src, final Path dst) throws IOException { + if (featureCacheFileStatus) { + FileStatus fsSrc = fileStatusCacheManager.get(makeQualified(src)); + FileStatus fsDst = fileStatusCacheManager.get(makeQualified(dst)); + + if ((fsSrc != null && !fsSrc.isFile()) || (fsDst != null && !fsDst + .isFile())) { + fileStatusCacheManager.clear(); + } else { + fileStatusCacheManager.remove(makeQualified(src)); + fileStatusCacheManager.remove(makeQualified(dst)); + } + } + return super.rename(src, dst); AF> Similar pattern of get/mutate non-atomically repeats here and below. + } + + /** + * Overloaded version of rename. Invokes parent rename implementation only. 
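AF> To make the putIfAbsent() suggestion above concrete, here is a rough sketch (illustrative only -- it assumes the cache is backed by a Map<String, FileStatusCacheObject> named syncMap that is wrapped with Collections.synchronizedMap() at construction time, and the FileStatusCacheObject constructor/getter shown are guesses, not the patch's actual signatures):

{code:java}
// Hypothetical atomic put-if-absent for FileStatusCacheManager: one lock
// covers both the lookup and the insert, so a concurrent clear()/remove()
// cannot interleave between them.
public FileStatus putIfAbsent(FileStatus status, long ttlInSeconds) {
  String key = status.getPath().toString();
  synchronized (syncMap) {
    FileStatusCacheObject existing = syncMap.get(key);
    if (existing != null) {
      return existing.getFileStatus();
    }
    syncMap.put(key, new FileStatusCacheObject(status, ttlInSeconds));
    return status;
  }
}
{code}

AF> getFileStatus() could then use the returned value directly instead of doing a second cache lookup before the put.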
+ * + * Removes cached FileStatus entry to maintain latest information on the + * FileStatus instance + * + * @param src Source path + * @param dst Desitnation path + * @param options Defined in webhdfs specification + * @throws IOException For system error + */ + @Override + public void rename(Path src, Path dst, Options.Rename... options) + throws IOException { + if (featureCacheFileStatus) { + FileStatus fsSrc = fileStatusCacheManager.get(makeQualified(src)); + FileStatus fsDst = fileStatusCacheManager.get(makeQualified(dst)); + + if ((fsSrc != null && !fsSrc.isFile()) || (fsDst != null && !fsDst + .isFile())) { + fileStatusCacheManager.clear(); + } else { + fileStatusCacheManager.remove(makeQualified(src)); + fileStatusCacheManager.remove(makeQualified(dst)); + } + } + super.rename(src, dst, options); + } + + /** + * Invokes parent append default implementation + * + * Removes cached FileStatus entry to maintain latest information on the + * FileStatus instance. + * + * @param f Stream path + * @return Output stream. + * @throws IOException For system error + */ + @Override + public FSDataOutputStream append(Path f) throws IOException { + String filePath = makeQualified(f).toString(); + fileStatusCacheManager.remove(new Path(filePath)); + return super.append(f); + } + + /** + * Removes cached FileStatus entry to maintain latest information on the + * FileStatus instance. + * + * @param f Existing file path + * @param bufferSize Size of the buffer + * @param progress Progress indicator + * @return FSDataOutputStream OutputStream on which application can push + * stream of bytes + * @throws IOException For any system exception + */ + @Override + public FSDataOutputStream append(Path f, int bufferSize, + Progressable progress) throws IOException { + String filePath = makeQualified(f).toString(); + fileStatusCacheManager.remove(new Path(filePath)); + return super.append(f, bufferSize, progress); + } + + /** + * Removes cached FileStatus entry to maintain latest information on the + * FileStatus instance. + * + * @param trg Target file path + * @param srcs List of sources to be concatinated. ADL concatinate in the + * same order passed as parameter. + * @throws IOException For any system exception + */ + @Override + public void concat(final Path trg, final Path[] srcs) throws IOException { + if (featureCacheFileStatus) { + String filePath = trg.isAbsoluteAndSchemeAuthorityNull() ? + getUri() + trg.toString() : + trg.toString(); + fileStatusCacheManager.remove(new Path(filePath)); + for (int i = 0; i < srcs.length; ++i) { + filePath = srcs[0].isAbsoluteAndSchemeAuthorityNull() ? + getUri() + srcs[0].toString() : + srcs[0].toString(); + fileStatusCacheManager.remove(new Path(filePath)); + } + } + super.concat(trg, srcs); + } + + * failures. + * 2. Performance boost to jobs which are slow writer, avoided network latency + * 3. ADL equally better performing with multiple of 4MB chunk as append + * calls. 
+ * + * @param f File path + * @param permission Access perfrmission for the newly created file AF> typo + * @param overwrite Remove existing file and recreate new one if true + * otherwise throw error if file exist + * @param bufferSize Buffer size, ADL backend does not honour + * @param replication Replication count, ADL backen does not hounour + * @param blockSize Block size, ADL backend does not honour + * @param progress Progress indicator + * @return FSDataOutputStream OutputStream on which application can push + * stream of bytes + * @throws IOException when system error, internal server error or user error + */ + @Override + public FSDataOutputStream create(final Path f, final FsPermission permission, + final boolean overwrite, final int bufferSize, final short replication, + final long blockSize, final Progressable progress) throws IOException { + statistics.incrementWriteOps(1); + // Increment the counter + instrumentation.fileCreated(); + + if (featureCacheFileStatus) { + fileStatusCacheManager.remove(makeQualified(f)); + } + + return new FSDataOutputStream(new BatchAppendOutputStream(f, bufferSize, + new PermissionParam(applyUMask(permission)), + new OverwriteParam(overwrite), new BufferSizeParam(bufferSize), + new ReplicationParam(replication), new BlockSizeParam(blockSize), + new ADLVersionInfo(getADLEnabledFeatureSet())), statistics) { + }; + } + + @Override + @SuppressWarnings("deprecation") + public FSDataOutputStream createNonRecursive(final Path f, + final FsPermission permission, final EnumSet flag, + final int bufferSize, final short replication, final long blockSize, + final Progressable progress) throws IOException { + statistics.incrementWriteOps(1); + // Increment the counter + instrumentation.fileCreated(); + + if (featureCacheFileStatus) { + String filePath = makeQualified(f).toString(); + fileStatusCacheManager.remove(new Path(filePath)); + } + + String leaseId = java.util.UUID.randomUUID().toString(); + return new FSDataOutputStream(new BatchAppendOutputStream(f, bufferSize, + new PermissionParam(applyUMask(permission)), new CreateFlagParam(flag), + new CreateParentParam(false), new BufferSizeParam(bufferSize), + new ReplicationParam(replication), new LeaseParam(leaseId), + new BlockSizeParam(blockSize), + new ADLVersionInfo(getADLEnabledFeatureSet())), statistics) { + }; + } + + /** + * Since defined as private in parent class, redefined to pass through + * Create api implementation. + * + * @param permission + * @return FsPermission list + */ + private FsPermission applyUMask(FsPermission permission) { + FsPermission fsPermission = permission; + if (fsPermission == null) { + fsPermission = FsPermission.getDefault(); + } + return fsPermission.applyUMask(FsPermission.getUMask(getConf())); + } + + /** + * Open call semantic is handled differently in case of ADL. Instead of + * network stream is returned to the user, + * Overridden FsInputStream is returned. + * + * 1. No dedicated connection to server. + * 2. Process level concurrent read ahead Buffering is done, This allows + * data to be available for caller quickly. + * 3. Number of byte to read ahead is configurable. + * + * Advantage of Process level concurrent read ahead Buffering semantics is + * 1. ADL backend server does not allow idle connection for longer duration + * . In case of slow reader scenario, + * observed connection timeout/Connection reset causing occasional job + * failures. 
AF> Did you guys consider handling this as transparently reconnecting, instead of doing separate connections for each op? Seems like performance would be alot better? + * 2. Performance boost to jobs which are slow reader, avoided network latency AF> I'd expect you to want a connection per-thread, instead of per-op. + * 3. Compressed format support like ORC, and large data files gains the + * most out of this implementation. + * + * Read ahead feature is configurable. + * + * @param f File path + * @param buffersize Buffer size + * @return FSDataInputStream InputStream on which application can read + * stream of bytes + * @throws IOException when system error, internal server error or user error + */ + @Override + public FSDataInputStream open(final Path f, final int buffersize) + throws IOException { + long statContructionTime = System.currentTimeMillis(); + statistics.incrementReadOps(1); + + ADLLogger.log("statistics report Time " + (System.currentTimeMillis() + - statContructionTime)); + + final HttpOpParam.Op op = GetOpParam.Op.OPEN; + // use a runner so the open can recover from an invalid token + FsPathConnectionRunner runner = null; + + if (featureConcurrentReadWithReadAhead) { + long urlContructionTime = System.currentTimeMillis(); + URL url = this.toUrl(op, f, new BufferSizeParam(buffersize), + new ReadADLNoRedirectParam(true), + new ADLVersionInfo(getADLEnabledFeatureSet())); + ADLLogger.log("URL Construction Time " + (System.currentTimeMillis() + - urlContructionTime)); + + long bbContructionTime = System.currentTimeMillis(); + BatchByteArrayInputStream bb = new BatchByteArrayInputStream(url, f, + maxBufferSize, maxConcurrentConnection); + ADLLogger.log("BatchByteArrayInputStream Construction Time " + ( + System.currentTimeMillis() - bbContructionTime)); + + long finContructionTime = System.currentTimeMillis(); + FSDataInputStream fin = new FSDataInputStream(bb); + ADLLogger.log( + "FSDataInputStream Construction Time " + (System.currentTimeMillis() + - finContructionTime)); AF> This case could use some perf optimization. e.g. Three calls to get system time. + return fin; + } else { + if (featureRedirectOff) { + long urlContructionTime = System.currentTimeMillis(); + runner = new FsPathConnectionRunner(ADLGetOpParam.Op.OPEN, f, + new BufferSizeParam(buffersize), new ReadADLNoRedirectParam(true), + new ADLVersionInfo(getADLEnabledFeatureSet())); + ADLLogger.log("Runner Construction Time " + (System.currentTimeMillis() + - urlContructionTime)); AF> How about adding ADLLogger.logWithTimestamp(). That way, if the logger is disabled, you don't keep getting system time. + } else { + runner = new FsPathConnectionRunner(op, f, + new BufferSizeParam(buffersize)); + } + + return new FSDataInputStream( + new OffsetUrlInputStream(new UnresolvedUrlOpener(runner), + new OffsetUrlOpener(null))); + } + } + + /** + * On successful response from the server, @see FileStatusCacheManger is + * updated with FileStatus objects. 
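AF> Sketch of the ADLLogger.logWithTimestamp() idea mentioned above (a variation with explicit start/stop helpers; the method names here are made up, and I'm assuming ADLLogger already has the isLogEnabled() and log() methods used elsewhere in the patch):

{code:java}
// Hypothetical ADLLogger helpers: only touch the clock when logging is
// actually enabled, so the common disabled-logging path skips the
// System.currentTimeMillis() calls entirely.
public static long startTimer() {
  return isLogEnabled() ? System.currentTimeMillis() : 0L;
}

public static void logElapsed(String label, long startTimeMillis) {
  if (isLogEnabled()) {
    log(label + " Time Taken : " + (System.currentTimeMillis() - startTimeMillis));
  }
}
{code}

AF> open() would then do one startTimer()/logElapsed() pair per stage instead of pairing raw currentTimeMillis() calls around every step.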
+ * + * @param f File/Folder path + * @return FileStatus array list + * @throws IOException For system error + */ + @Override + public FileStatus[] listStatus(final Path f) throws IOException { + FileStatus[] fileStatuses = super.listStatus(f); + for (int i = 0; i < fileStatuses.length; i++) { + if (featureCacheFileStatus) { + fileStatusCacheManager + .put(fileStatuses[i], featureCacheFileStatusDuration); + } + + if (overrideOwner) { + fileStatuses[i] = new FileStatus(fileStatuses[i].getLen(), + fileStatuses[i].isDirectory(), fileStatuses[i].getReplication(), + fileStatuses[i].getBlockSize(), + fileStatuses[i].getModificationTime(), + fileStatuses[i].getAccessTime(), fileStatuses[i].getPermission(), + userName, "hdfs", fileStatuses[i].getPath()); + } + } + return fileStatuses; + } + + @Override + public BlockLocation[] getFileBlockLocations(final FileStatus status, + final long offset, final long length) throws IOException { + if (status == null) { + return null; + } + + if (featureGetBlockLocationLocallyBundled) { + if ((offset < 0) || (length < 0)) { + throw new IllegalArgumentException("Invalid start or len parameter"); + } + + if (status.getLen() < offset) { + if (ADLLogger.isLogEnabled()) { AF> Redundant check of isLogEnabled() + ADLLogger.log("getFileBlockLocations : Returning 1 block"); + } + return new BlockLocation[0]; + } + + final String[] name = {"localhost"}; + final String[] host = {"localhost"}; AF> Just use "name" twice instead of defining host? + long blockSize = ADLConfKeys.DEFAULT_EXTENT_SIZE; + if (blockSize <= 0) { AF> Why the runtime check of a compile-time constant? How about just add a comment near the definition "must be non-zero" + throw new IllegalArgumentException( + "The block size for the given file is not a positive number: " + + blockSize); + } + int numberOfLocations = + (int) (length / blockSize) + ((length % blockSize == 0) ? 0 : 1); + BlockLocation[] locations = new BlockLocation[numberOfLocations]; + for (int i = 0; i < locations.length; i++) { + long currentOffset = offset + (i * blockSize); + long currentLength = Math + .min(blockSize, offset + length - currentOffset); + locations[i] = new BlockLocation(name, host, currentOffset, + currentLength); + } + + if (ADLLogger.isLogEnabled()) { AF> Redundant check of isLogEnabled() + ADLLogger.log("getFileBlockLocations : Returning " + locations.length + + " Blocks"); + } + + return locations; + + } else { + return getFileBlockLocations(status.getPath(), offset, length); + } + } + + @Override + public BlockLocation[] getFileBlockLocations(final Path p, final long offset, + final long length) throws IOException { + statistics.incrementReadOps(1); + + if (featureGetBlockLocationLocallyBundled) { + FileStatus fileStatus = getFileStatus(p); + return getFileBlockLocations(fileStatus, offset, length); + } else { + return super.getFileBlockLocations(p, offset, length); + } + } + + @Override + public synchronized void close() throws IOException { + super.close(); + AdlFileSystemMetricsSystem.unregisterSource(metricsSourceName); + AdlFileSystemMetricsSystem.fileSystemClosed(); + } + + private String getADLEnabledFeatureSet() { + // TODO : Implement current feature set enabed for the instance. + // example cache file status, reah ahead .. 
+ return ADLConfKeys.LOG_VERSION; + } + + enum StreamState { + Initial, + DataCachedInLocalBuffer, + StreamEnd + } + + class BatchAppendOutputStream extends OutputStream { + private Path fsPath; + private Param[] parameters; + private byte[] data = null; + private int offset = 0; + private long length = 0; + private boolean eof = false; + private boolean hadError = false; + private int bufferIndex = 0; + private byte[][] dataBuffers = new byte[2][]; + private int bufSize = 0; + private Future flushTask = null; + + public BatchAppendOutputStream(Path path, int bufferSize, + Param... param) throws IOException { + if (bufferSize < (ADLConfKeys.DEFAULT_BLOCK_SIZE)) { + bufSize = ADLConfKeys.DEFAULT_BLOCK_SIZE; + } else { + bufSize = bufferSize; + } + + this.fsPath = path; + this.parameters = param; + this.data = getBuffer(); + FSDataOutputStream createStream = null; + try { + if (featureRedirectOff) { + CreateADLNoRedirectParam skipRedirect = new CreateADLNoRedirectParam( + true); + Param[] tmpParam = featureFlushWhenEOF ? + new Param[param.length + 2] : + new Param[param.length + 1]; + System.arraycopy(param, 0, tmpParam, 0, param.length); + tmpParam[param.length] = skipRedirect; + if (featureFlushWhenEOF) { + tmpParam[param.length + 1] = new ADLFlush(false); + } + createStream = new FsPathOutputStreamRunner(ADLPutOpParam.Op.CREATE, + fsPath, 1, tmpParam).run(); + } else { + createStream = new FsPathOutputStreamRunner(PutOpParam.Op.CREATE, + fsPath, 1, param).run(); + } + } finally { + if (createStream != null) { + createStream.close(); + } + } + } + + @Override + public final synchronized void write(int b) throws IOException { + if (offset == (data.length)) { + flush(); + } + + data[offset] = (byte) b; + offset++; + + // Statistics will get incremented again as part of the batch updates, + // decrement here to avoid double value + if (statistics != null) { + statistics.incrementBytesWritten(-1); + } + } + + @Override + public final synchronized void write(byte[] buf, int off, int len) + throws IOException { + int bytesToWrite = len; + int localOff = off; + int localLen = len; + if (localLen >= data.length) { + // Flush data that is already in our internal buffer + flush(); + + // Keep committing data until we have less than our internal buffers + // length left + do { + try { + commit(buf, localOff, data.length, eof); + } catch (IOException e) { + hadError = true; + throw e; + } + localOff += data.length; + localLen -= data.length; + } while (localLen >= data.length); + } + + // At this point, we have less than data.length left to copy from users + // buffer + if (offset + localLen >= data.length) { + // Users buffer has enough data left to fill our internal buffer + int bytesToCopy = data.length - offset; + System.arraycopy(buf, localOff, data, offset, bytesToCopy); + offset += bytesToCopy; + + // Flush our internal buffer asynchronously + flushAsync(); + localOff += bytesToCopy; + localLen -= bytesToCopy; + } + + if (localLen > 0) { + // Simply copy the remainder from the users buffer into our internal + // buffer + System.arraycopy(buf, localOff, data, offset, localLen); + offset += localLen; + } + + // Statistics will get incremented again as part of the batch updates, + // decrement here to avoid double value + if (statistics != null) { + statistics.incrementBytesWritten(-bytesToWrite); + } + instrumentation.rawBytesUploaded(bytesToWrite); + } + + @Override + public final synchronized void flush() throws IOException { + waitForOutstandingFlush(); + if (offset > 0) { + try { + 
commit(data, 0, offset, eof); + } catch (IOException e) { + hadError = true; + throw e; + } + } + + offset = 0; + } + + @Override + public final synchronized void close() throws IOException { + //TODO : 2ns call should not cause any error and no network calls. + if (featureRedirectOff) { + eof = true; + } + + boolean flushedSomething = false; + if (hadError) { + // No point proceeding further since the error has occurered and + // stream would be required to upload again. + return; + } else { + flushedSomething = offset > 0; + flush(); + } + + if (featureRedirectOff) { + // If we didn't flush anything from our internal buffer, we have to + // call the service again + // with an empty payload and flush=true in the url + if (!flushedSomething) { + commit(null, 0, ADLConfKeys.KB, true); + } + } + + ADLLogger.log(" Total bytes Written : " + (length) + " [" + fsPath + "]"); + } + + private void commit(byte[] buffer, int off, int len, boolean endOfFile) + throws IOException { + OutputStream out = null; + try { + if (featureRedirectOff) { + AppendADLNoRedirectParam skipRedirect = new AppendADLNoRedirectParam( + true); + Param[] tmpParam = featureFlushWhenEOF ? + new Param[parameters.length + 3] : + new Param[parameters.length + 1]; + System.arraycopy(parameters, 0, tmpParam, 0, parameters.length); + tmpParam[parameters.length] = skipRedirect; + if (featureFlushWhenEOF) { + tmpParam[parameters.length + 1] = new ADLFlush(endOfFile); + tmpParam[parameters.length + 2] = new OffsetParam(length); + } + + out = new FsPathOutputStreamRunner(ADLPostOpParam.Op.APPEND, fsPath, + len, tmpParam).run(); + } else { + out = new FsPathOutputStreamRunner(ADLPostOpParam.Op.APPEND, fsPath, + len, parameters).run(); + } + + if (buffer != null) { + fileStatusCacheManager.remove(fsPath); + out.write(buffer, off, len); + length += len; + } + } finally { + if (out != null) { + out.close(); + } + } + } + + private void flushAsync() throws IOException { + if (offset > 0) { + waitForOutstandingFlush(); + + // Submit the new flush task to the executor + flushTask = EXECUTOR.submit(new CommitTask(data, offset, eof)); + + // Get a new internal buffer for the user to write + data = getBuffer(); + offset = 0; + } + } + + private void waitForOutstandingFlush() throws IOException { + if (flushTask != null) { + try { + flushTask.get(); + } catch (InterruptedException ex) { + throw new IOException(ex); + } catch (ExecutionException ex) { + // Wrap the ExecutionException in an IOException for callers can + // only handle IOException + throw new IOException(ex); + } finally { + flushTask = null; + } + } + } + + private byte[] getBuffer() { + // Switch between the first and second buffer + if (bufferIndex == 0) { + if (dataBuffers[0] == null) { + dataBuffers[0] = new byte[bufSize]; + } + + bufferIndex = 1; + return dataBuffers[0]; + } else { + if (dataBuffers[1] == null) { + dataBuffers[1] = new byte[bufSize]; + } + + bufferIndex = 0; + return dataBuffers[1]; + } + } + + public class CommitTask implements Callable { + private byte[] buff; + private int len; + private boolean eof; + + public CommitTask(byte[] buffer, int size, boolean isEnd) { + buff = buffer; + len = size; + eof = isEnd; + } + + public final Object call() throws IOException { + commit(buff, 0, len, eof); + return null; + } + } + } + + @SuppressWarnings("checkstyle:javadocmethod") + /** + * Read data from backend in chunks instead of persistent connection. This + * is to avoid slow reader causing socket + * timeout. 
+ */ protected class BatchByteArrayInputStream extends FSInputStream { AF> Formatting. Missing newline above. + + private static final int SIZE4MB = 4 * 1024 * 1024; + private final URL runner; + private volatile byte[] data = null; + private volatile long validDataHoldingSize = 0; + private volatile int bufferOffset = 0; AF> Why volatile here? Needs comments. I have a feeling this is wrong and you need some synchronized blocks below instead. + private volatile long currentFileOffset = 0; + private volatile long nextFileOffset = 0; + private long fileSize = 0; + private String guid; + private StreamState state = StreamState.Initial; + private int maxBufferSize; + private int maxConcurrentConnection; + private Path fsPath; + private boolean streamIsClosed; + private Future[] subtasks = null; + + BatchByteArrayInputStream(URL url, Path p, int bufferSize, + int concurrentConnection) throws IOException { + this.runner = url; + fsPath = p; + FileStatus fStatus = getFileStatus(fsPath); + if (!fStatus.isFile()) { + throw new IOException("Cannot open the directory " + p + " for " + + "reading"); + } + fileSize = fStatus.getLen(); + guid = getMachineName() + System.currentTimeMillis(); AF> Probably unique. What is impact if this collides? Why not use a random java UUID? + this.maxBufferSize = bufferSize; + this.maxConcurrentConnection = concurrentConnection; + this.streamIsClosed = false; + } + + @Override + public final int read(long position, byte[] buffer, int offset, int length) + throws IOException { + if (streamIsClosed) { + throw new IOException("Stream already closed"); + } + long oldPos = this.getPos(); + + int nread1; + try { + this.seek(position); + nread1 = this.read(buffer, offset, length); + } finally { + this.seek(oldPos); + } + + return nread1; + } + + @Override + public final int read() throws IOException { + if (streamIsClosed) { + throw new IOException("Stream already closed"); + } + int status = doBufferAvailabilityCheck(); + if (status == -1) { + return status; + } + int ch = data[bufferOffset++] & (0xff); AF> this is not thread safe, but looking at FSInputStream, it appears read() is supposed to be. Seems this class is racy in general? 
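AF> For illustration, the locking shape I would expect on the read path looks roughly like this (a sketch only -- every method that touches bufferOffset / currentFileOffset / validDataHoldingSize would need the same treatment, at which point the volatile qualifiers can probably go away):

{code:java}
// Sketch: protect the buffer bookkeeping with the stream's monitor instead of
// relying on volatile fields; the read-modify-write on bufferOffset below is
// not atomic otherwise.
@Override
public final synchronized int read() throws IOException {
  if (streamIsClosed) {
    throw new IOException("Stream already closed");
  }
  if (doBufferAvailabilityCheck() == -1) {
    return -1;
  }
  int ch = data[bufferOffset++] & 0xff;  // now updated under the lock
  if (statistics != null) {
    statistics.incrementBytesRead(1);
  }
  return ch;
}
{code}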
+ if (statistics != null) { + statistics.incrementBytesRead(1); + } + return ch; + } + + @Override + public final void readFully(long position, byte[] buffer, int offset, + int length) throws IOException { + if (streamIsClosed) { + throw new IOException("Stream already closed"); + } + long startTime = System.currentTimeMillis(); + super.readFully(position, buffer, offset, length); + ADLLogger.log("ReadFully1 Time Taken : " + (System.currentTimeMillis() + - startTime)); + if (statistics != null) { + statistics.incrementBytesRead(length); + } + instrumentation.rawBytesDownloaded(length); + } + + @Override + public final int read(byte[] b, int off, int len) throws IOException { + if (streamIsClosed) { + throw new IOException("Stream already closed"); + } + int status = doBufferAvailabilityCheck(); + if (status == -1) { + return status; + } + + long startTime = System.currentTimeMillis(); + int byteRead = 0; + long availableBytes = validDataHoldingSize - off; + long requestedBytes = bufferOffset + len - off; + if (requestedBytes <= availableBytes) { + if (b == null) { + throw new NullPointerException(); + } else if (off < 0 || len < 0 || len > b.length - off) { + throw new IndexOutOfBoundsException(); + } else if (len == 0) { + return 0; + } + + ADLLogger.log("AC - [BufferOffset : " + bufferOffset + " " + + "CurrentFileSite : " + currentFileOffset + "] Offset : " + off + + " Length : " + len + " Read Time Taken : " + ( + System.currentTimeMillis() - startTime)); + try { + System.arraycopy(data, bufferOffset, b, off, len); + } catch (ArrayIndexOutOfBoundsException e) { + ADLLogger.log("ArrayIndexOutOfBoundsException AC - [BufferOffset : " + + "" + bufferOffset + " CurrentFileSite : " + currentFileOffset + + "] Offset : " + off + " Length : " + len + " Read Time Taken : " + + (System.currentTimeMillis() - startTime)); + throw e; + } + bufferOffset += len; + byteRead = len; + } else { + ADLLogger.log("HC - [BufferOffset : " + bufferOffset + " " + + "CurrentFileSite : " + currentFileOffset + "] Offset : " + off + + " Length : " + len + " Read Time Taken : " + ( + System.currentTimeMillis() - startTime)); + byteRead = super.read(b, off, len); + } + if (statistics != null) { + statistics.incrementBytesRead(byteRead); + } + instrumentation.rawBytesDownloaded(byteRead); + return byteRead; + } + + private int doBufferAvailabilityCheck() throws IOException { + if (state == StreamState.Initial) { + ADLLogger.log("Initial Fill"); + validDataHoldingSize = fill(nextFileOffset); + } + + long dataReloadSize = 0; + switch ((int) validDataHoldingSize) { + case -1: + state = StreamState.StreamEnd; + return -1; + case 0: + dataReloadSize = fill(nextFileOffset); + if (dataReloadSize <= 0) { + state = StreamState.StreamEnd; + return (int) dataReloadSize; + } else { + validDataHoldingSize = dataReloadSize; + } + break; + default: + break; + } + + if (bufferOffset >= validDataHoldingSize) { + dataReloadSize = fill(nextFileOffset); + } + + if (bufferOffset >= ((dataReloadSize == 0) ? + validDataHoldingSize : + dataReloadSize)) { + state = StreamState.StreamEnd; + return -1; + } + + validDataHoldingSize = ((dataReloadSize == 0) ? + validDataHoldingSize : + dataReloadSize); + state = StreamState.DataCachedInLocalBuffer; + return 0; AF> I didn't have time to review this part thoroughly.. Can take better look next round. 
+ } + + public final long fill(final long off) throws IOException { + ADLLogger.log("Fill from " + off); + long startTime = System.currentTimeMillis(); + if (state == StreamState.StreamEnd) { + return -1; + } + + if (fileSize <= off) { + state = StreamState.StreamEnd; + return -1; + } + int len = maxBufferSize; + long fileOffset = 0; + boolean isEntireFileCached = true; + if ((fileSize < maxBufferSize)) { + len = (int) fileSize; + currentFileOffset = 0; + nextFileOffset = 0; + } else { + if (len > (fileSize - off)) { + len = (int) (fileSize - off); + } + + if (BufferManager.getInstance() + .hasValidDataForOffset(fsPath.toString(), off)) { + len = (int) ( + BufferManager.getInstance().getBufferOffset() + BufferManager + .getInstance().getBufferSize() - (int) off); + } + + if (len <= 0) { + len = maxBufferSize; + } + fileOffset = off; + isEntireFileCached = false; + } + + data = null; + BufferManager bm = BufferManager.getInstance(); + data = bm.getEmpty(len); + boolean fetchDataOverNetwork = false; + if (bm.hasData(fsPath.toString(), fileOffset, len)) { + try { + bm.get(data, fileOffset); + validDataHoldingSize = data.length; + currentFileOffset = fileOffset; + } catch (ArrayIndexOutOfBoundsException e) { + fetchDataOverNetwork = true; + } + } else { + fetchDataOverNetwork = true; + } + + if (fetchDataOverNetwork) { + int splitSize = getSplitSize(len); + try { + validDataHoldingSize = fillDataConcurrently(data, len, fileOffset, + splitSize); + } catch (InterruptedException e) { + throw new IOException(e.getMessage()); + } + bm.add(data, fsPath.toString(), fileOffset); + currentFileOffset = nextFileOffset; + } + + nextFileOffset += validDataHoldingSize; + state = StreamState.DataCachedInLocalBuffer; + bufferOffset = isEntireFileCached ? (int) off : 0; + ADLLogger.log("Buffer Refill Time Taken : " + (System.currentTimeMillis() + - startTime)); + return validDataHoldingSize; + } + + int getSplitSize(int size) { + if (size <= SIZE4MB) { + return 1; + } + + // Not practical + if (size > maxBufferSize) { + size = maxBufferSize; + } + + int equalBufferSplit = size / SIZE4MB; + int splitSize = Math.min(equalBufferSplit, maxConcurrentConnection); + return splitSize; + } + + @Override + public final void seek(long pos) throws IOException { + if (pos == -1) { + throw new IOException("Bad offset, cannot seek to " + pos); + } + + BufferManager bm = BufferManager.getInstance(); + if (bm.hasValidDataForOffset(fsPath.toString(), pos)) { + state = StreamState.DataCachedInLocalBuffer; + } else if (pos >= 0) { + state = StreamState.Initial; + } + + long availableBytes = (currentFileOffset + validDataHoldingSize); + ADLLogger.log("SEEK : " + pos + " Available " + currentFileOffset + " " + + "Count " + availableBytes); + + // Check if this position falls under buffered data + if (pos < currentFileOffset || availableBytes <= 0) { + validDataHoldingSize = 0; + currentFileOffset = pos; + nextFileOffset = pos; + bufferOffset = 0; + return; + } + + if (pos < availableBytes && pos >= currentFileOffset) { + state = StreamState.DataCachedInLocalBuffer; + bufferOffset = (int) (pos - currentFileOffset); + } else { + validDataHoldingSize = 0; + currentFileOffset = pos; + nextFileOffset = pos; + bufferOffset = 0; + } + } + + @Override + public final long getPos() throws IOException { + if (streamIsClosed) { + throw new IOException("Stream already closed"); + } + return bufferOffset + currentFileOffset; + } + + @Override + public final int available() throws IOException { + if (streamIsClosed) { + throw new 
IOException("Stream already closed"); + } + return Integer.MAX_VALUE; + } + + @Override + public final boolean seekToNewSource(long targetPos) throws IOException { + return false; + } + + @SuppressWarnings("unchecked") + public final int fillDataConcurrently(byte[] byteArray, int length, + long globalOffset, int splitSize) + throws IOException, InterruptedException { + ADLLogger.log("Fill up Data from " + globalOffset + " len : " + length); + ExecutorService executor = Executors.newFixedThreadPool(splitSize); + subtasks = new Future[splitSize]; + for (int i = 0; i < splitSize; i++) { + int offset = i * (length / splitSize); + int splitLength = (splitSize == (i + 1)) ? + (length / splitSize) + (length % splitSize) : + (length / splitSize); + subtasks[i] = executor.submit( + new BackgroundReadThread(byteArray, offset, splitLength, + globalOffset + offset)); + } + + executor.shutdown(); AF> Is this shutdown() needed? + // wait until all tasks are finished + try { + executor.awaitTermination(ADLConfKeys.DEFAULT_TIMEOUT_IN_SECONDS, + TimeUnit.SECONDS); AF> Curious, why don't you care about timeout expiring? Can't you just remove this whole try-catch block? The futures are what are ensuring tasks finish below, no? + } catch (InterruptedException e) { + ADLLogger.log("Interupted : " + e.getMessage()); + throw e; + } + + int totalBytePainted = 0; + for (int i = 0; i < splitSize; ++i) { + try { + totalBytePainted += (Integer) subtasks[i].get(); + } catch (InterruptedException e) { + throw new IOException(e.getCause()); + } catch (ExecutionException e) { + throw new IOException(e.getCause()); + } + } + + if (totalBytePainted != length) { + throw new IOException("Expected " + length + " bytes, Got " + + totalBytePainted + " bytes"); + } + + return totalBytePainted; + } + + @Override + public final void close() throws IOException { + BufferManager.getInstance().clear(); + + //need to cleanup the above code the stream and connection close doesnt + // happen here + //flag set to mark close happened, cannot use the stream once closed + streamIsClosed = true; + } + AF> Add function comment w/ params and @return explanation? + private int fillUpData(byte[] buffer, int offset, int length, + long globalOffset) throws IOException { + int totalBytesRead = 0; + final URL offsetUrl = new URL( + runner + "&" + new OffsetParam(String.valueOf(globalOffset)) + "&" + + new LengthParam(String.valueOf(length)) + "&openid=" + guid); AF> Can you explain what this openid does? + HttpURLConnection conn = new URLRunner(GetOpParam.Op.OPEN, offsetUrl, + true).run(); + InputStream in = conn.getInputStream(); + try { + int bytesRead = 0; + while ((bytesRead = in.read(buffer, (int) offset + totalBytesRead, + (int) (length - totalBytesRead))) > 0) { + totalBytesRead += bytesRead; + } + + // InputStream must be fully consumed to enable http keep-alive + if (bytesRead == 0) { + // Looking for EOF marker byte needs to be read. + if (in.read() != -1) { + throw new SocketException( + "Server returned more than requested " + "data."); AF> humm.. we expect in.read() to return 0, and then return -1? Also, you can remove the string concatenation in the exception message. 
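AF> To make the executor comment above concrete: if you rely on the Futures alone, fillDataConcurrently() can drop the awaitTermination() block entirely -- something like the following (same behaviour assumed, just restructured; not tested):

{code:java}
// Sketch: Future.get() already blocks until each subtask completes, so the
// explicit awaitTermination() wait and its timeout handling are unnecessary.
public final int fillDataConcurrently(byte[] byteArray, int length,
    long globalOffset, int splitSize) throws IOException {
  ExecutorService executor = Executors.newFixedThreadPool(splitSize);
  Future<?>[] tasks = new Future<?>[splitSize];
  for (int i = 0; i < splitSize; i++) {
    int offset = i * (length / splitSize);
    int splitLength = (splitSize == (i + 1))
        ? (length / splitSize) + (length % splitSize)
        : (length / splitSize);
    tasks[i] = executor.submit(
        new BackgroundReadThread(byteArray, offset, splitLength,
            globalOffset + offset));
  }
  executor.shutdown();  // stop accepting new work; submitted tasks keep running

  int totalBytesFilled = 0;
  for (Future<?> task : tasks) {
    try {
      totalBytesFilled += (Integer) task.get();  // blocks until the subtask finishes
    } catch (InterruptedException | ExecutionException e) {
      throw new IOException(e.getCause());
    }
  }
  if (totalBytesFilled != length) {
    throw new IOException("Expected " + length + " bytes, got "
        + totalBytesFilled + " bytes");
  }
  return totalBytesFilled;
}
{code}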
+ } + } + } finally { + in.close(); + conn.disconnect(); + } + + return totalBytesRead; + } + + private class BackgroundReadThread implements Callable { + + private final byte[] data; + private int offset; + private int length; + private long globalOffset; + + BackgroundReadThread(byte[] buffer, int off, int size, long position) { + this.data = buffer; + this.offset = off; + this.length = size; + this.globalOffset = position; + } + + public Object call() throws IOException { + return fillUpData(data, offset, length, globalOffset); + } + } + } +} \ No newline at end of file diff --git hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateDebugAzureDataLake.java hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateDebugAzureDataLake.java new file mode 100644 index 0000000..a3af0f3 --- /dev/null +++ hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/PrivateDebugAzureDataLake.java @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * limitations under the License. + * + */ + +package org.apache.hadoop.hdfs.web; AF> As noted before, using filesystem types to acheive logging tags is awkward. Also question why this is not in org.apache.hadoop.fs.adl + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.DelegateToFileSystem; + +import java.io.IOException; +import java.net.URI; +import java.net.URISyntaxException; + +/** + * Use this class implementation to log debug information as part of the + * debug logs. Used during development. + */ +public class PrivateDebugAzureDataLake extends DelegateToFileSystem { iff --git hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/resources/ADLFlush.java hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/resources/ADLFlush.java new file mode 100644 index 0000000..c3e14f6 --- /dev/null +++ hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/hdfs/web/resources/ADLFlush.java @@ -0,0 +1,63 @@ +/* +/** + * Query parameter to notify backend server that the all the data has been + * pushed to over the stream. + * + * Used in operation code Create and Append. + */ +public class ADLFlush extends BooleanParam { + /** AF> I did not review all the HTTP Op Param stuff... someone should. diff --git hadoop-tools/hadoop-azure-datalake/src/site/markdown/index.md hadoop-tools/hadoop-azure-datalake/src/site/markdown/index.md new file mode 100644 index 0000000..4dbc27f --- /dev/null +++ hadoop-tools/hadoop-azure-datalake/src/site/markdown/index.md @@ -0,0 +1,246 @@ +## Introduction + +The hadoop-azure-datalake module provides support for integration with +[Azure Data Lake Store](https://azure.microsoft.com/en-in/services/data-lake-store/). +The jar file is named azure-datalake-store.jar. + AF> Thank you for the documentation. +#### FileStatus Cache Management +Performance is one of the key features of Azure Data Lake Storage service. In order to gain a performance boost, hadoop-azure-datalake module provides basic FileStatus cache management on the client. This reduces the number of REST calls to the backend service. + +FileStatus cache scope is per process and shared between multiple `AdlFileSystem` instances within the process. FileStatus cache is built using ListStatus and GetFileStatus calls to Azure Data Lake Storage. The life of each FileStatus cached object is limited and default is 5 seconds. Time to live FileStatus cached object is configurable through core-site.xml. 
+ +**This is an expermental feature and should be turned off for an unexpected behaviour observed during ListStatus and GetFileStatus operation.** + AF> Seems more than experimental. +To Enable/Disable FileStatus cache management + + + adl.feature.override.cache.filestatus + true + + dfs.webhdfs.oauth2.refresh.token + + + +For ADL FileSystem to take effect. Update core-site.xml with + + + fs.adl.impl + org.apache.hadoop.fs.adl.AdlFileSystem + + + + fs.AbstractFileSystem.adl.impl + org.apache.hadoop.fs.adl.Adl + AF> Again, need to put these in core-default.xml with documentation and defaults, and always use lowercase in the property names. + + +### Accessing adl URLs + +After credentials are configured in core-site.xml, any Hadoop component may diff --git hadoop-tools/hadoop-azure-datalake/src/test/java/org/apache/hadoop/fs/adl/TestADLResponseData.java hadoop-tools/hadoop-azure-datalake/src/test/java/org/apache/hadoop/fs/adl/TestADLResponseData.java new file mode 100644 index 0000000..1aec96c --- /dev/null +++ hadoop-tools/hadoop-azure-datalake/src/test/java/org/apache/hadoop/fs/adl/TestADLResponseData.java AF> Thank you for including tests. I did not have time to review the test code this round. {quote} > Support Microsoft Azure Data Lake - as a file system in Hadoop > -------------------------------------------------------------- > > Key: HADOOP-12666 > URL: https://issues.apache.org/jira/browse/HADOOP-12666 > Project: Hadoop Common > Issue Type: New Feature > Components: fs, fs/azure, tools > Reporter: Vishwajeet Dusane > Assignee: Vishwajeet Dusane > Attachments: HADOOP-12666-002.patch, HADOOP-12666-003.patch, HADOOP-12666-004.patch, HADOOP-12666-005.patch, HADOOP-12666-006.patch, HADOOP-12666-1.patch > > Original Estimate: 336h > Time Spent: 336h > Remaining Estimate: 0h > > h2. Description > This JIRA describes a new file system implementation for accessing Microsoft Azure Data Lake Store (ADL) from within Hadoop. This would enable existing Hadoop applications such has MR, HIVE, Hbase etc.., to use ADL store as input or output. > > ADL is ultra-high capacity, Optimized for massive throughput with rich management and security features. More details available at https://azure.microsoft.com/en-us/services/data-lake-store/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)