accumulo-dev mailing list archives

From Dylan Hutchison <dhutc...@cs.washington.edu>
Subject Re: Running Accumulo on a standard file system, without Hadoop
Date Mon, 16 Jan 2017 22:39:12 GMT
On Mon, Jan 16, 2017 at 1:56 PM, Josh Elser <josh.elser@gmail.com> wrote:

> That's true, but HDFS supports multiple "implementations" based on the
> scheme of the URI being used.
>
> e.g. hdfs:// is mapped to DistributedFileSystem
>
> You can configure HDFS to use the RawLocalFileSystem class for file://
> URIs which is what is done for a majority of the integration tests. Beware
> that you configure the RawLocalFileSystem as the ChecksumFileSystem
> (default for file://) will fail miserably around WAL recovery.
>
> https://github.com/apache/accumulo/blob/master/test/src/main/java/org/apache/accumulo/test/BulkImportVolumeIT.java#L61
>
>
Hi Josh, are you saying that the ChecksumFileSystem is required or
forbidden for WAL recovery?  Looking at the Hadoop code it seems that
LocalFileSystem wraps around a RawLocalFileSystem to provide checksum
capabilities.  Is that right?
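
For concreteness, here is a minimal sketch of the scheme-to-class mapping
being discussed. It assumes the mapping is set through Hadoop's
fs.<scheme>.impl convention in core-site.xml; the file it belongs in for a
given deployment is an assumption, and this is not a tested Accumulo setup.

```xml
<!-- Sketch only: map file:// URIs to RawLocalFileSystem instead of the
     default checksumming LocalFileSystem. Placement in core-site.xml is
     an assumption. -->
<property>
  <name>fs.file.impl</name>
  <value>org.apache.hadoop.fs.RawLocalFileSystem</value>
</property>
```

Without an override like this, file:// URIs resolve to LocalFileSystem,
which is a ChecksumFileSystem.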


>
> Dave Marion wrote:
>
>> IIRC, Accumulo *only* uses the HDFS client, so it needs something on the
>> other side that can respond to that protocol. MiniAccumulo starts up
>> MiniHDFS for this. You could run some other type of service locally that is
>> HDFS client compatible (something like Quantcast QFS[1], setting up client
>> [2]). If Accumulo is using something in Hadoop outside of the public client
>> API, this may not work.
>>
>> [1] https://github.com/quantcast/qfs
>> [2] https://github.com/quantcast/qfs/wiki/Migration-Guide
>>
>>
>> -----Original Message-----
>>> From: Dylan Hutchison [mailto:dhutchis@cs.washington.edu]
>>> Sent: Monday, January 16, 2017 3:17 PM
>>> To: dev@accumulo.apache.org
>>> Subject: Running Accumulo on a standard file system, without Hadoop
>>>
>>> Hi folks,
>>>
>>> A friend of mine asked about running Accumulo on a normal file system in
>>> place of Hadoop, similar to the way MiniAccumulo runs.  How possible is
>>> this, or how much work would it take to do so?
>>>
>>> I think my friend is just interested in running on a single node, but I
>>> am curious about both the single-node and distributed (via parallel file
>>> system like Lustre) cases.
>>>
>>> Thanks, Dylan
>>>
>>
>>
