accumulo-user mailing list archives

From: David Medinets <david.medin...@gmail.com>
Subject: Re: Anybody ever used the HDFS NFS Gateway?
Date: Wed, 07 Oct 2015 02:30:27 GMT

One aspect of creating RFiles for import into Accumulo that I don't
recall being mentioned before is the ability to archive them for future use.

On Tue, Oct 6, 2015 at 10:25 PM, Russ Weeks <rweeks@newbrightidea.com>
wrote:

> Hi, Dylan,
>
> Yeah, writing RFiles instead of using BatchWriters
> (AccumuloFileOutputFormat vs. AccumuloOutputFormat) for efficiency and
> atomicity of ingest ("improved" atomicity if that even makes sense).
>
> I'm thinking about the NFS gateway just because the system that's
> producing the CSV is kind of a black box to me. It doesn't speak Hadoop, as
> Christopher alluded to, and I can't control its output format, but I can
> direct its output to a filesystem that it perceives to be local.
>
> My options are either an NFS write direct to HDFS via the gateway, or an
> NFS write to a conventional filesystem that I control, followed by some
> sort of inotify-driven migration from that server to HDFS.
>
> -Russ
>
> On Tue, Oct 6, 2015 at 6:12 PM Dylan Hutchison <dhutchis@uw.edu> wrote:
>
>> Hi Russ,
>>   I'm curious what you have in mind.  Are you looking for a solution more
>> efficient than running clients that read the CSV files and open
>> BatchWriters?
>>
>> Regards, Dylan
>>
>> On Tue, Oct 6, 2015 at 4:56 PM, Christopher <ctubbsii@apache.org> wrote:
>>
>>> I haven't tried it, but it sounds like a cool use case. Might be a good
>>> alternative to distcp, more interoperable with tools which don't speak
>>> hadoop.
>>>
>>> On Tue, Oct 6, 2015, 18:41 Russ Weeks <rweeks@newbrightidea.com> wrote:
>>>
>>>> I hope this isn't too off-topic. Any opinions re. its
>>>> completeness/quality/reliability?
>>>>
>>>> (The use case is, CSV files -> NFS -> HDFS -> Spark -> RFiles ->
>>>> Accumulo. Relevance established!)
>>>>
>>>> Thanks,
>>>> -Russ
>>>>
>>>
>>
