distributedlog-user mailing list archives

From Gerrit Sundaram <gerritsunda...@gmail.com>
Subject Re: FileSystem API over distributedlog logs
Date Sat, 12 Nov 2016 10:24:39 GMT
It would be great if you guys can push it.

- Gerrit

On Fri, Nov 11, 2016 at 12:21 PM, Leigh Stewart <
lstewart@twitter.com.invalid> wrote:

> Sure, we could do it. We skipped it last time because DL was not OSS.
>
> Need to find some time though - let's discuss quickly next week.
>
> On Fri, Nov 11, 2016 at 12:10 PM, Sijie Guo <sijie@apache.org> wrote:
>
> > /cc Leigh
> >
> > I don't think we pushed the DL-related code to kestrel, as I think kestrel
> > has been on the deprecation path internally at Twitter. But it might be
> > worth pushing the code change just for reference. Leigh, what's your
> > opinion?
> >
> > - Sijie
> >
> > On Wed, Nov 9, 2016 at 2:48 AM, Gerrit Sundaram <gerritsundaram@gmail.com>
> > wrote:
> >
> >> Sijie, thank you for your comments and suggestions. I will start a
> >> separate thread for discussing the metadata operation primitives.
> >>
> >> BTW, I didn't find any code in kestrel that is related to distributedlog
> >> :( Could you kindly point me to the files?
> >>
> >> - Gerrit
> >>
> >>
> >> On Wed, Nov 2, 2016 at 10:35 AM, Sijie Guo <sijieg@twitter.com> wrote:
> >>
> >>>
> >>>
> >>> On Wed, Nov 2, 2016 at 3:14 AM, Gerrit Sundaram <
> >>> gerritsundaram@gmail.com> wrote:
> >>>
> >>>> FYI - I tried to use the AppendOnlyStreamWriter and
> >>>> AppendOnlyStreamReader to demonstrate the idea:
> >>>> https://github.com/apache/incubator-distributedlog/pulls/43 Let me
> >>>> know if this is a good direction to pursue.
> >>>>
> >>>> - Gerrit
> >>>>
> >>>> On Wed, Nov 2, 2016 at 2:21 AM, Gerrit Sundaram <
> >>>> gerritsundaram@gmail.com> wrote:
> >>>>
> >>>>> Hi distributedlog folks,
> >>>>>
> >>>>> I am new to this community. I am wondering whether anyone has tried to
> >>>>> build a filesystem over replicated logs. There are a lot of similarities
> >>>>> between a file in a filesystem and a replicated log: you can use files to
> >>>>> build a replicated log, or use replicated logs to build a filesystem.
> >>>>>
> >>>>> I took a look at the code repo and found two files,
> >>>>> 'AppendOnlyStreamReader' and 'AppendOnlyStreamWriter'. They seem to
> >>>>> implement a file-I/O-related API. Did you guys attempt to provide a
> >>>>> filesystem API over distributedlog?
> >>>>>
> >>>>
> >>> Ah, those two classes were designed for filesystem-like I/O operations.
> >>> We used them to replace the local-file-based journal in kestrel
> >>> <https://github.com/twitter-archive/kestrel>.
> >>>
> >>
> >>>
> >>>>
> >>>>> I am wondering if it is possible to build a filesystem over
> >>>>> distributedlog. Would this be an interesting topic for this project and
> >>>>> the community? I have two reasons for asking:
> >>>>> - I can leverage good features like parallel replication and low
> >>>>> latency for better performance.
> >>>>> - DL uses ZooKeeper for metadata storage. ZooKeeper has a pretty nice
> >>>>> filesystem-like interface, so it would be a nice fit.
> >>>>>
> >>>>
> >>> This sounds interesting. I don't think there are any major blockers for
> >>> DL exposing a filesystem-like API, as indeed we already did that for
> >>> kestrel. You might need to spend time on refining the metadata operations,
> >>> such as listing files, getting file status and the like.
> >>>
> >>> Re "better performance" - for data I/O, it should be just fine for
> >>> workloads like writes, tailing reads and caught-up reads (scans). I am not
> >>> sure about random reads, as we didn't really pay attention to them at
> >>> Twitter (although Salesforce used bookkeeper as storage that also
> >>> serves random reads, so it should probably work just fine). I am not
> >>> certain about metadata operations - we did create/open/delete log streams
> >>> frequently for some of our use cases, but that might still be less
> >>> frequent than in a filesystem. We have a plan to make the stream primitive
> >>> very lightweight, so we can support a huge number of streams. We can
> >>> probably work together on improving the metadata part.
> >>>
> >>> I took a look at your pull request. I liked your layout - putting it in
> >>> a contrib module to incubate this idea. We definitely welcome any
> >>> contributions that make DL easy to use. Feel free to start a proposal
> >>> discussion
> >>> <https://cwiki.apache.org/confluence/display/DL/Project+Proposals>. I
> >>> believe there will be a lot of corner cases to discuss.
> >>>
> >>
> >>>
> >>>
> >>>>
> >>>>> - Gerrit
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
