distributedlog-dev mailing list archives

From Gerrit Sundaram <gerritsunda...@gmail.com>
Subject Re: [DISCUSS] DL Stream Operation Primitives
Date Sat, 12 Nov 2016 10:30:20 GMT
On Fri, Nov 11, 2016 at 1:09 PM, Sijie Guo <sijie@apache.org> wrote:

> I liked this topic. A better name might be 'stream storage primitives', as
> we treat DL as a stream storage. Comments inline.
>
> On Wed, Nov 9, 2016 at 3:09 AM, Gerrit Sundaram <gerritsundaram@gmail.com>
> wrote:
>
> > As Sijie suggested in the other email thread, I am starting this email
> > thread to discuss the stream operation primitives.
> >
> > The stream operations that I am aware of that DL supports are
> >
> > * Open a distributedlog stream
> > * Delete a distributedlog stream
> > * List all the distributedlog streams under a namespace
> >
>
> Are you also looking to list streams under a 'sub-namespace' (or
> streams that share a common prefix)? (Based on my understanding of your
> proposal, you might need this for a filesystem-like API?)
>

Yes. However, DL seems to be designed around a flat namespace that contains
only streams; there is no concept of a 'sub-namespace'. I can probably work
around that by naming the streams in a filesystem path-like way.
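
To make the workaround concrete, here is a rough sketch of what I have in
mind. It is plain Python over a list of names, not DL code: list_children
and the example stream names are made up for illustration, and it assumes
the flat list of stream names has already been fetched from the namespace.

```python
def list_children(stream_names, prefix):
    """Emulate a directory listing on top of a flat set of stream names.

    Returns the immediate children of `prefix`, treating '/' in a stream
    name as a path separator. Hypothetical helper, not a DL API.
    """
    if not prefix.endswith("/"):
        prefix += "/"
    children = set()
    for name in stream_names:
        if name.startswith(prefix):
            rest = name[len(prefix):]
            if rest:
                # Keep only the first path component below the prefix.
                children.add(rest.split("/", 1)[0])
    return sorted(children)


# Example stream names, invented for this sketch.
streams = ["/logs/app/a", "/logs/app/b", "/logs/db/c", "/metrics/x"]
print(list_children(streams, "/logs"))
```

The obvious drawback is that every listing scans all the stream names in
the namespace, which is part of why a real hierarchy would be nicer.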

However, I am still curious whether you want to introduce any sort of
hierarchy in the naming within a namespace. For example, could there be a
'StreamSet', which is a set of streams (like a directory in a filesystem,
which has a list of children)? If you have a similar hierarchy, it will
definitely simplify my work.


>
>
> > * Seal a distributedlog stream
> > * Truncate a distributedlog stream
> >
>
> Just to clarify, 'truncate' in DL trims the head of the stream, not the
> tail.
> 'truncate' in the filesystem world sets the file to a size of precisely
> *length* bytes, so it truncates the tail.
>
> Let's make sure we are clear on this and on the same page.
>

Yes, we are on the same page.
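
To spell out the difference as I understand it, here is a toy model in
Python. Both functions are illustrative only: dl_truncate uses a list index
as a stand-in for whatever position DL actually truncates at, and
fs_truncate mimics the POSIX truncate behaviour on an in-memory byte string.

```python
def dl_truncate(records, first_kept):
    """DL-style truncate: drop records from the head of the stream.

    `first_kept` is a hypothetical index of the first record to keep.
    """
    return records[first_kept:]


def fs_truncate(data, length):
    """Filesystem-style truncate: force the size to exactly `length` bytes.

    Shrinking drops bytes from the tail; growing pads with zero bytes.
    """
    return data[:length] + b"\0" * max(0, length - len(data))


print(dl_truncate(["r0", "r1", "r2", "r3"], 2))  # head is trimmed
print(fs_truncate(b"abcdef", 3))                 # tail is trimmed
```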


>
>
> >
> > I am looking for a more filesystem-like API. For example,
> >
> > * Get the status/attributes of a stream (like stat in filesystem)
> >
>
> +1 for stream status/attributes. I think we might actually already have
> this in DL, since in kestrel we use it for storing customized metadata.
> It might make sense to formalize it into 'stream status'.
>

Gotcha.


>
>
> > * Rename a stream
> >
>
> We've talked about this for a while. +1.
>
>
> > * Symlink a stream
>
>
> Symlinking a stream is probably easy to do. +1. We've thought about it as
> a way to have the flexibility to move streams between different storage
> backends. Symlinks would help with this.
>
> But a more fundamental thought here is symlinks for log segments: when a
> symlinked stream is deleted, the underlying log segments might not be
> deleted until their link count drops to zero.
>
>
>
> >
> > Other operations that I think might be useful:
> >
> > * Split/Fork a stream (it can be useful for dynamic data partitioning)
> >
>
>
>
> Splitting and forking a stream sounds interesting, but it sounds more like
> a high-level feature than a storage primitive. Actually, it might be a
> good topic for a separate discussion.
>
>
>
>
> > * Merge/Concat streams
> >
>
>
> I think there is already one outstanding jira for concatenating two DL
> streams. Jia and Arvind are working on that.
>
> https://issues.apache.org/jira/browse/DL-46


I will watch that JIRA.


>
>
>
>
> >
> > The above operations are based on my knowledge about DL. Feel free to add
> > more.
>
>
> >
> > - Gerrit
> >
>
