hadoop-hdfs-dev mailing list archives

From Andrew Wang <andrew.w...@cloudera.com>
Subject Re: fsck output compatibility question with regard to HDFS-7281
Date Tue, 28 Apr 2015 18:54:17 GMT
On Tue, Apr 28, 2015 at 11:25 AM, Allen Wittenauer <aw@altiscale.com> wrote:

> On Apr 28, 2015, at 10:59 AM, Andrew Wang <andrew.wang@cloudera.com>
> wrote:
> >
> > This is also not something typically upheld by unix-y commands. BSD vs. GNU
> > already leads to incompatible flags and output. Most of these commands
> > haven't been changed in 20 years, but that doesn't constitute a compat
> > guarantee.
>         One of the reasons why Solaris doesn’t officially support user
> names greater than 8 characters is because of the breakage to ls and what
> that would do with how one parses them.  So, yes, it is upheld in cases
> where it would be too big of a burden on backward compatibility.  (That’s
> the easy example, I could give a lot more from my days at Sun if you’d
> like.)

Yup, agree. I know HFS and NTFS support case-insensitivity for similar
reasons. Not sure changing fsck or dfsadmin is quite the same level, though.

> > This is something I'd like to follow for our own commands. We provide
> > different APIs for machine consumption vs. human consumption, and make this
> > clear in the compat guide. Of course, we should still be judicious when
> > changing the human output, but I just don't see a good way forward without
> > relaxing our current compat guidelines.
>         I think that’s a great suggestion.

Allen, do you have a "top 3" for shell commands that need the "plumbing"
treatment? That'd be a good place to start. Yongjun expressed some interest
to me in working on this, and I think it'd be a great place for new
contributors too. We can probably crib ideas from what git did. Once that's
in place, we can think about changing this part of the compat guidelines.

> > The other thing to consider is providing supported Java APIs for the
> > commonly-parsed shell commands. This is something we have much more
> > experience with.
>         I think people forget who the customers of some of these
> interfaces actually are.  I can probably count the number of ops people I
> know who speak Java frequently enough to be comfortable with it for
> everyday use on two hands. In this particular case, fsck is, by far, an ops
> tool.  Give us perl and/or python and/or ruby bindings.  That was the
> promise of protobuf, right?  But Java? Yeah, no thanks, I’ll continue
> processing it from stdin with a couple lines of perl rather than deal with
> the mountains of Java cruft.

My idea behind providing a Java API is for monitoring tools (e.g. CM,
Ambari). I suspect some of the info available in shell commands is not
available through any other API, which forces tools that are okay with Java
to parse shell output instead. We still don't have a Python (or other
non-Java) client that doesn't wrap a JVM, so alternate language bindings are
tough right now.

In terms of the ops experience, I'm hoping that "plumbing" (which will be
more difficult to use) will meet the needs of long-lived scripts, while
"porcelain" will be okay for ad hoc one-off usage. That makes porcelain
safer to break, since I rewrite my grep/cut/awk pipelines each time anyway.
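
To make the plumbing/porcelain split concrete, here is a rough Python sketch of
what the two kinds of output might look like for one summary. Everything in it
is hypothetical: the FsckSummary fields, the key names, and the tab-separated
layout are assumptions for illustration, not actual fsck output or any
existing HDFS API.

```python
# Hypothetical sketch: none of these names or formats come from HDFS itself.
from dataclasses import dataclass


@dataclass
class FsckSummary:
    """Illustrative stand-in for a filesystem-check summary."""
    total_blocks: int
    corrupt_blocks: int
    missing_replicas: int


def render_porcelain(s: FsckSummary) -> str:
    """Human-oriented text; wording and layout are free to change."""
    status = "HEALTHY" if s.corrupt_blocks == 0 else "CORRUPT"
    return (
        f"Status: {status}\n"
        f" Total blocks:     {s.total_blocks}\n"
        f" Corrupt blocks:   {s.corrupt_blocks}\n"
        f" Missing replicas: {s.missing_replicas}\n"
    )


def render_plumbing(s: FsckSummary) -> str:
    """Machine-oriented text: a version line, then key<TAB>value per line."""
    rows = [
        ("format_version", 1),
        ("total_blocks", s.total_blocks),
        ("corrupt_blocks", s.corrupt_blocks),
        ("missing_replicas", s.missing_replicas),
    ]
    return "".join(f"{key}\t{value}\n" for key, value in rows)


def parse_plumbing(text: str) -> dict:
    """Parse plumbing output; unknown keys are kept rather than rejected,
    so fields added later do not break older scripts."""
    fields = {}
    for line in text.splitlines():
        if not line:
            continue  # tolerate blank lines
        key, _, value = line.partition("\t")
        fields[key] = int(value)
    if fields.get("format_version") != 1:
        raise ValueError("unsupported plumbing format version")
    return fields


summary = FsckSummary(total_blocks=1000, corrupt_blocks=2, missing_replicas=5)
parsed = parse_plumbing(render_plumbing(summary))
```

A long-lived script would depend only on parse_plumbing and the version line,
so the porcelain text could be reworded or realigned without breaking it.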

