mahout-dev mailing list archives

From Trevor Grant <trevor.d.gr...@gmail.com>
Subject Re: Location of JARs
Date Thu, 02 Jun 2016 18:00:32 GMT
I agree and have been thinking so more and more over the last couple of
days.

I'm going to start tinkering with that idea this afternoon and over the
remainder of the week.



Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Thu, Jun 2, 2016 at 12:23 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> I already looked. My main concern is that it meddles with the Spark
> interpreter code too much, which may create friction with the Spark
> interpreter in the future. It may be hard to keep two products' integration
> code coherent in one component (in this case, the same interpreter
> class/file). I don't want to put this comment in the Zeppelin discussion,
> but internally I think it should be a concern for us.
>
> Is it possible to have a standalone mahout-spark interpreter that uses the
> same Spark configuration as is configured for the Spark interpreter? If so,
> I would very much prefer not to have Spark-alone and Spark+Mahout code
> intermingled in the same interpreter class.
>
> Visually, it would probably also be preferable to have a block that
> requires boilerplate along the lines of
>
> %spark.mahout
>
> ... blah ....
>
> On Thu, Jun 2, 2016 at 8:24 AM, Trevor Grant <trevor.d.grant@gmail.com>
> wrote:
>
> > Would you mind having a look at
> > https://github.com/apache/incubator-zeppelin/pull/928/files
> > to see if I'm missing anything critical.
> >
> > The idea is that the user specifies a directory containing the necessary
> > jars (to be covered in the setup documentation), and the jars are loaded
> > from there. It also adds some configuration settings (mainly Kryo) when
> > 'spark.mahout' is true. Finally, it imports Mahout and sets up the sdc
> > from the already declared sc.
> >
> > Based on my testing, that works in local and cluster mode.
> >
> > Thanks,
> > tg
> >
> >
> >
> > On Wed, Jun 1, 2016 at 12:48 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> > wrote:
> >
> > > On Wed, Jun 1, 2016 at 10:46 AM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> > > wrote:
> > >
> > > >
> > > >
> > > > On Wed, Jun 1, 2016 at 7:47 AM, Trevor Grant <trevor.d.grant@gmail.com>
> > > > wrote:
> > > >
> > > >>
> > > >> Other approaches?
> > > >>
> > > >> For background, Zeppelin starts a Spark shell, and we need to make
> > > >> sure all of the required Mahout jars get loaded onto the classpath
> > > >> when Spark starts. The question is where, relatively, all of these
> > > >> JARs live.
> > > >>
> > > >
> > > > How does Zeppelin cope with extra dependencies for other interpreters
> > > > (even Spark itself)? I guess we should follow the same practice there.
> > > >
> > > > Release independence of the location algorithm largely depends on the
> > > > jar filters (again, see the filters in the Spark bindings package). It
> > > > is possible that the required artifacts may change, but it is not very
> > > > likely (I don't think they have changed since 0.10). So it should be
> > > > possible to build (Mahout) release-independent logic to locate, filter
> > > > and assert the necessary jars.
> > > >
> > >
> > > PS: this may change soon, though. If/when custom javacpp code is built,
> > > we will probably want to keep all native things as separate release
> > > artifacts, as they are basically treated as optionally available
> > > accelerators and may or may not be properly loaded in all situations.
> > > Hence they may warrant a separate jar vehicle.
> > >
> > > >
> > > >
> > > >>
> > > >> Thanks for any feedback,
> > > >> tg
> > > >>
> > > >
> > > >
> > >
> >
>
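
For reference, the "locate, filter and assert" logic mentioned in the quoted thread could look roughly like the sketch below. This is only an illustration, not the code in the Zeppelin PR or in the Spark bindings: the class name MahoutJarLocator, the list of required artifacts, and the file-name pattern are all assumptions made for the example. The point is just that matching on artifact names while ignoring the Scala suffix and version number keeps the lookup independent of the Mahout release.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: finds the jars a Mahout-enabled interpreter would need
// in a user-specified directory, without hard-coding any release version.
final class MahoutJarLocator {

    // Assumed artifact names; the authoritative list would come from the
    // jar filters in the Spark bindings package.
    static final List<String> REQUIRED =
            Arrays.asList("mahout-math", "mahout-math-scala", "mahout-spark");

    // Matches e.g. "mahout-spark_2.10-0.12.2.jar": artifact name, an optional
    // Scala suffix such as "_2.10", then "-<version>...jar".
    static Pattern patternFor(String artifact) {
        return Pattern.compile("^" + Pattern.quote(artifact) + "(_\\d[\\d.]*)?-\\d.*\\.jar$");
    }

    // Locate: list the directory. Filter: keep one jar per required artifact.
    // Assert: fail loudly if any required artifact is missing.
    static List<Path> locate(Path dir) throws IOException {
        List<Path> jars;
        try (Stream<Path> entries = Files.list(dir)) {
            jars = entries
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<Path> found = new ArrayList<>();
        for (String artifact : REQUIRED) {
            Pattern pattern = patternFor(artifact);
            Path match = jars.stream()
                    .filter(j -> pattern.matcher(j.getFileName().toString()).matches())
                    .findFirst()
                    .orElseThrow(() -> new IllegalStateException(
                            "Missing required jar: " + artifact + " in " + dir));
            found.add(match);
        }
        return found;
    }
}
```

Calling MahoutJarLocator.locate(dir) returns one path per required artifact, in the order of REQUIRED, or throws IllegalStateException naming the first artifact it could not find.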

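Similarly, the conditional configuration described in the quoted thread (extra settings, mainly Kryo, applied only when 'spark.mahout' is true) might be sketched as follows. The class MahoutSparkSettings is hypothetical, and the thread does not say exactly which properties the PR sets; the two shown here are the Kryo serializer and Mahout Kryo registrator settings that Mahout's Spark bindings normally rely on, so treat the list as an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: adds Mahout-related Spark properties to an interpreter's
// property map only when the 'spark.mahout' flag is set to true.
final class MahoutSparkSettings {

    static Map<String, String> apply(Map<String, String> base) {
        // Copy so the caller's map is left untouched.
        Map<String, String> out = new LinkedHashMap<>(base);
        if (Boolean.parseBoolean(base.getOrDefault("spark.mahout", "false"))) {
            // Assumed property set; the actual PR may configure more than this.
            out.put("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
            out.put("spark.kryo.registrator",
                    "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator");
        }
        return out;
    }
}
```

Keeping this in a separate helper rather than inline in the Spark interpreter is in the spirit of the concern raised above about not intermingling Spark-only and Spark+Mahout code.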