mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bernd Mathiske <be...@mesosphere.io>
Subject Re: Why rely on url scheme for fetching?
Date Tue, 04 Nov 2014 11:13:34 GMT
Hi Ankur, 

I am Bernd, not Ben, but I’ll try to do my best :-)

Your plan looks good to me and your patch for MESOS-1711 seems uncomplicated enough to not
cause me major problems no matter when it lands. 

Bernd

> On Nov 4, 2014, at 11:44 AM, Ankur Chauhan <ankur@malloc64.com> wrote:
> 
> Hi ben,
> 
> Thanks for the follow up. I think thats a perfectly fine game plan. I think I can break
down things into smaller, more isolated chunks. But from the looks of MESOS-1316 the invocation
code and the testing code seems to change a lot (in a good way), so to avoid a bunch of wasted
cycles, I was going to hold off a little till it is committed and then go on to refactoring
the transport parts. The json config change (MESOS-1248) is completely disconnected to this
so I am not too worried about that one.
> The only place where some coordination may be required is the launcher/fetcher.cpp. It
seems there are common changes happening between my  change https://reviews.apache.org/r/27483/
<https://reviews.apache.org/r/27483/><https://reviews.apache.org/r/27483/ <https://reviews.apache.org/r/27483/>>
and your https://reviews.apache.org/r/21316 <https://reviews.apache.org/r/21316><https://reviews.apache.org/r/21316
<https://reviews.apache.org/r/21316>> which would mean a little rework but it's totally
within reasonable limits.
> I think if you get the MESOS-1316 committed and I get MESOS-1711 ( https://reviews.apache.org/r/27483
<https://reviews.apache.org/r/27483> <https://reviews.apache.org/r/27483 <https://reviews.apache.org/r/27483>>
) in, both of us would get unblocked in a way that we can get a way that lets us concurrently
work on MESOS-336 and refactoring fetcher into more testable parts.
> 
> Does this make sense to you or did I misunderstand some part?
> 
> -- Ankur
> 
>> On 4 Nov 2014, at 02:05, Bernd Mathiske <bernd@mesosphere.io <mailto:bernd@mesosphere.io>>
wrote:
>> 
>> Typo: not -> note
>> 
>>> On Nov 4, 2014, at 10:59 AM, Bernd Mathiske <bernd@mesosphere.io <mailto:bernd@mesosphere.io>>
wrote:
>>> 
>>> Hi Ankur,
>>> 
>>> I like it, too. However, I cannot refrain from relaying (not my choice of word
here) to you the advice to break your relatively large patch down into smaller parts. My patch
for MESOS-336 certainly was, as I know now. My plan is to get MESOS-1316 (which is now under
review) and MESOS-1248 (which should be quick) committed and then rebase MESOS-336 in smaller
pieces. If you have small pieces, too, I do think we can do this somewhat concurrently, as
your refactoring seems to affect mostly how the transport part works. I am on the other hand
mostly interested in the bookkeeping of what is in progress, what succeeded, what failed,
what has been put where, and so on, which can be separated, if we are careful about it.
>>> 
>>> (BTW, I think we need both unit tests and integration tests. And we don’t have
nearly enough of either.)
>>> 
>>> Bernd
>>> 
>>>> On Nov 3, 2014, at 6:18 PM, Adam Bordelon <adam@mesosphere.io <mailto:adam@mesosphere.io><mailto:adam@mesosphere.io
<mailto:adam@mesosphere.io>>> wrote:
>>>> 
>>>> + Bernd, who has done some fetcher work, including additional testing, for
MESOS-1316, MESOS-1945, and MESOS-336
>>>> 
>>>> On Mon, Nov 3, 2014 at 9:04 AM, Dominic Hamon <dhamon@twopensource.com
<mailto:dhamon@twopensource.com><mailto:dhamon@twopensource.com <mailto:dhamon@twopensource.com>>>
wrote:
>>>> Hi Ankur
>>>> 
>>>> I think this is a great approach. It makes the code much simpler, extensible,
and more testable. Anyone that's heard me rant knows I am a big fan of unit tests over integration
tests, so this shouldn't surprise anyone :)
>>>> 
>>>> If you haven't already, please read the documentation on contributing to
Mesos and the style guide to ensure all the naming is as expected, then you can push the patch
to reviewboard to get it reviewed and committed.
>>>> 
>>>> On Mon, Nov 3, 2014 at 12:49 AM, Ankur Chauhan <ankur@malloc64.com <mailto:ankur@malloc64.com><mailto:ankur@malloc64.com
<mailto:ankur@malloc64.com>>> wrote:
>>>> Hi,
>>>> 
>>>> I did some learning today! This is pretty much a very rough draft of the
tests/refactor of mesos-fetcher that I have come up with. Again, If there are some obvious
mistakes, please let me know. (this is my first pass after all).
>>>> https://github.com/ankurcha/mesos/compare/prefer_2 <https://github.com/ankurcha/mesos/compare/prefer_2><https://github.com/ankurcha/mesos/compare/prefer_2
<https://github.com/ankurcha/mesos/compare/prefer_2>>
>>>> 
>>>> My main intention is to break the logic of the fetcher info some very discrete
components that i can write tests against. I am still re-learning cpp/mesos code styles etc
so I may be a little slow to catch up but I would really appreciate any comments and/or suggestions.
>>>> 
>>>> -- Ankur
>>>> @ankurcha
>>>> 
>>>>> On 2 Nov 2014, at 18:17, Ankur Chauhan <ankur@malloc64.com <mailto:ankur@malloc64.com><mailto:ankur@malloc64.com
<mailto:ankur@malloc64.com>>> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I noticed that the current set of tests in `src/tests/fetcher_tests.cpp`
is pretty coarse grained and are more on the lines of a functional test. I was going to add
some tests but it seems like if I am to do that I would need to add a test dependency on hadoop.

>>>>> 
>>>>> As an alternative, I propose adding a good set of unit tests around the
methods used by `src/launcher/fetcher.cpp` and `src/hdfs/hdfs.cpp`. This should be able to
catch a good portion of cases at the same time keeping the dependencies and runtime of tests
low. What do you guys thing about this?
>>>>> 
>>>>> PS: I am pretty green in terms of gtest and the overall c++ testing methodology.
Can someone give me pointers to good examples of tests in the codebase.
>>>>> 
>>>>> -- Ankur
>>>>> 
>>>>>> On 1 Nov 2014, at 22:54, Adam Bordelon <adam@mesosphere.io <mailto:adam@mesosphere.io><mailto:adam@mesosphere.io
<mailto:adam@mesosphere.io>>> wrote:
>>>>>> 
>>>>>> Thank you Ankur. At first glance, it looks great. We'll do a more
thorough review of it very soon.
>>>>>> I know Tim St. Clair had some ideas for fixing MESOS-1711 <https://issues.apache.org/jira/browse/MESOS-1711
<https://issues.apache.org/jira/browse/MESOS-1711>>; he may want to review too.
>>>>>> 
>>>>>> On Sat, Nov 1, 2014 at 8:49 PM, Ankur Chauhan <ankur@malloc64.com
<mailto:ankur@malloc64.com><mailto:ankur@malloc64.com <mailto:ankur@malloc64.com>>>
wrote:
>>>>>> Hi Tim,
>>>>>> 
>>>>>> I just created a review https://reviews.apache.org/r/27483/ <https://reviews.apache.org/r/27483/><https://reviews.apache.org/r/27483/
<https://reviews.apache.org/r/27483/>> It's my first stab at it and I will try to
add more tests as I figure out how to do the hadoop mocking and stuff. Have a look and let
me know what you think about it so far.
>>>>>> 
>>>>>> -- Ankur
>>>>>> 
>>>>>>> On 1 Nov 2014, at 20:05, Ankur Chauhan <ankur@malloc64.com
<mailto:ankur@malloc64.com><mailto:ankur@malloc64.com <mailto:ankur@malloc64.com>>>
wrote:
>>>>>>> 
>>>>>>> Yea, i saw that the minute i pressed send. I'll start the review
board so that people can have a look at the change.
>>>>>>> 
>>>>>>> -- Ankur
>>>>>>> 
>>>>>>>> On 1 Nov 2014, at 20:01, Tim Chen <tim@mesosphere.io <mailto:tim@mesosphere.io><mailto:tim@mesosphere.io
<mailto:tim@mesosphere.io>>> wrote:
>>>>>>>> 
>>>>>>>> Hi Ankur,
>>>>>>>> 
>>>>>>>> There is a fetcher_tests.cpp in src/tests.
>>>>>>>> 
>>>>>>>> Tim
>>>>>>>> 
>>>>>>>> On Sat, Nov 1, 2014 at 7:27 PM, Ankur Chauhan <ankur@malloc64.com
<mailto:ankur@malloc64.com><mailto:ankur@malloc64.com <mailto:ankur@malloc64.com>>>
wrote:
>>>>>>>> Hi Tim,
>>>>>>>> 
>>>>>>>> I am trying to find/write some test cases. I couldn't find
a fetcher_tests.{cpp|hpp} so once I have something, I'll post on review board. I am new to
gmock/gtest so bear with me while i get up to speed.
>>>>>>>> 
>>>>>>>> -- Ankur
>>>>>>>> 
>>>>>>>>> On 1 Nov 2014, at 19:23, Timothy Chen <tim@mesosphere.io
<mailto:tim@mesosphere.io><mailto:tim@mesosphere.io <mailto:tim@mesosphere.io>>>
wrote:
>>>>>>>>> 
>>>>>>>>> Hi Ankur,
>>>>>>>>> 
>>>>>>>>> Can you post on reviewboard? We can discuss more about
the code there.
>>>>>>>>> 
>>>>>>>>> Tim
>>>>>>>>> 
>>>>>>>>> Sent from my iPhone
>>>>>>>>> 
>>>>>>>>> On Nov 1, 2014, at 6:29 PM, Ankur Chauhan <ankur@malloc64.com
<mailto:ankur@malloc64.com><mailto:ankur@malloc64.com <mailto:ankur@malloc64.com>>>
wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Tim,
>>>>>>>>>> 
>>>>>>>>>> I don't think there is an issue which is directly
in line with what i wanted but the closest one that I could find in JIRA is https://issues.apache.org/jira/browse/MESOS-1711
<https://issues.apache.org/jira/browse/MESOS-1711><https://issues.apache.org/jira/browse/MESOS-1711
<https://issues.apache.org/jira/browse/MESOS-1711>>
>>>>>>>>>> 
>>>>>>>>>> I have a branch ( https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher
<https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher><https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher
<https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher>> ) that has a change
that would enable users to specify whatever hdfs compatible uris to the mesos-fetcher but
maybe you can weight in on it. Do you think this is the right track? if so, i would like to
pick this issue and submit a patch for review.
>>>>>>>>>> 
>>>>>>>>>> -- Ankur
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On 1 Nov 2014, at 04:32, Tom Arnfeld <tom@duedil.com
<mailto:tom@duedil.com><mailto:tom@duedil.com <mailto:tom@duedil.com>>>
wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Completely +1 to this. There are now quite a
lot of hadoop compatible filesystem wrappers out in the wild and this would certainly be very
useful.
>>>>>>>>>>> 
>>>>>>>>>>> I'm happy to contribute a patch. Here's a few
related issues that might be of interest;
>>>>>>>>>>> 
>>>>>>>>>>> - https://issues.apache.org/jira/browse/MESOS-1887
<https://issues.apache.org/jira/browse/MESOS-1887><https://issues.apache.org/jira/browse/MESOS-1887
<https://issues.apache.org/jira/browse/MESOS-1887>>
>>>>>>>>>>> - https://issues.apache.org/jira/browse/MESOS-1316
<https://issues.apache.org/jira/browse/MESOS-1316><https://issues.apache.org/jira/browse/MESOS-1316
<https://issues.apache.org/jira/browse/MESOS-1316>>
>>>>>>>>>>> - https://issues.apache.org/jira/browse/MESOS-336
<https://issues.apache.org/jira/browse/MESOS-336><https://issues.apache.org/jira/browse/MESOS-336
<https://issues.apache.org/jira/browse/MESOS-336>>
>>>>>>>>>>> - https://issues.apache.org/jira/browse/MESOS-1248
<https://issues.apache.org/jira/browse/MESOS-1248><https://issues.apache.org/jira/browse/MESOS-1248
<https://issues.apache.org/jira/browse/MESOS-1248>>
>>>>>>>>>>> 
>>>>>>>>>>> On 31 October 2014 22:39, Tim Chen <tim@mesosphere.io
<mailto:tim@mesosphere.io><mailto:tim@mesosphere.io <mailto:tim@mesosphere.io>>>
wrote:
>>>>>>>>>>> I believe there is already a JIRA ticket for
this, if you search for fetcher in Mesos JIRA I think you can find it.
>>>>>>>>>>> 
>>>>>>>>>>> Tim
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, Oct 31, 2014 at 3:27 PM, Ankur Chauhan
<ankur@malloc64.com <mailto:ankur@malloc64.com><mailto:ankur@malloc64.com <mailto:ankur@malloc64.com>>>
wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I have been looking at some of the stuff around
the fetcher and saw something interesting. The code for fetcher::fetch method is dependent
on a hard coded list of url schemes. No doubt that this works but is very restrictive.
>>>>>>>>>>> Hadoop/HDFS in general is pretty flexible when
it comes to being able to fetch stuff from urls and has the ability to fetch a large number
of types of urls and can be extended by adding configuration into the conf/hdfs-site.xml and
core-site.xml
>>>>>>>>>>> 
>>>>>>>>>>> What I am proposing is that we refactor the fetcher.cpp
to prefer to use the hdfs (using hdfs/hdfs.hpp) to do all the fetching if HADOOP_HOME is set
and $HADOOP_HOME/bin/hadoop is available. This logic already exists and we can just use it.
The fallback logic for using net::download or local file copy is may be left in place for
installations that do not have hadoop configured. This means that if hadoop is present we
can directly fetch urls such as tachyon://... snackfs:// ... cfs:// .... ftp://... s3://...
http:// ... file:// with no extra effort. This makes up for a much better experience when
it comes to debugging and extensibility.
>>>>>>>>>>> 
>>>>>>>>>>> What do others think about this?
>>>>>>>>>>> 
>>>>>>>>>>> - Ankur
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Dominic Hamon | @mrdo | Twitter
>>>> There are no bad ideas; only good ideas that go horribly wrong.


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message