marvin-dev mailing list archives

From Lucas Cardoso Silva <cardosolucas61....@gmail.com>
Subject Re: Marvin’s mission discussion
Date Fri, 14 Aug 2020 14:37:28 GMT
Hi guys,

Here comes the summarized Marvin mission:

The Apache Marvin-AI platform aims to offer a practical and standardized
solution that helps its users perform data exploration, model development
and application lifecycle management, with a focus on scalability,
language agnosticism and a standardized pipeline.

Thanks for the help,
Lucas Cardoso

On Wed, Jul 29, 2020 at 17:05, Lucas Cardoso Silva <
cardosolucas61.lcs@gmail.com> wrote:

> Hi guys!
> Great Lucas, I will wait a couple of days to see if anyone has other
> things to add, and then we can close this phase!
>
> Wei, we can discuss how to make the data pipelines easier for the users in
> another step of the evaluation. With the experience of the users and
> developers on this topic, we can track their needs better and build
> use-case scenarios. I agree with you that data preparation is messy and can
> take a lot of time, and it would be great if Marvin could help with that.
>
> Best regards,
> Lucas
>
>
> On Wed, Jul 29, 2020 at 11:59, Wei Chen <weichen@apache.org>
> wrote:
>
>> Hello Lucas,
>>
>> I am thinking of processing JSON or XML files with a dynamic hierarchical
>> structure.
>> Or building a pipeline to crop images using object detection metadata.
>> Data preparation can be very messy,
>> I wonder if we can have a stage to handle both batch and streaming
>> processing well.
>>
>> I simply think we don't need to focus on this part since we can utilize a
>> wide variety of tools for our specific needs.
>>
>> Best Regards,
>> Wei
>>
>>
>>
>> On Wed, Jul 29, 2020 at 8:48 PM Lucas Bonatto Miguel <
>> lucasbm88@apache.org>
>> wrote:
>>
>> > Hi folks,
>> >
>> > In regards to the mission, you're correct. If I could summarize it, it
>> > would be like: *to help its users to perform data exploration, model
>> > development and application lifecycle management*.
>> >
>> > I'm all in for having a better integration with Kubernetes. I think that
>> > the first step is to create a new thread in order to design something
>> > following their operator pattern:
>> > https://kubernetes.io/docs/concepts/extend-kubernetes/operator/
>> >
>> > Wei, one can already perform merges and joins in the
>> > transformation step. Could you comment a bit more on what you think we
>> > could improve there? Maybe something for a new thread as well?
>> >
>> > Best!
>> > Lucas
>> >
>> > On Wed, Jul 29, 2020 at 1:24 AM Wei Chen <weichen@apache.org> wrote:
>> >
>> > > I think deploying to K8S does expand our capabilities for inference
>> > > scaling and management.
>> > > I am not familiar with Luigi, but it makes sense since we are going to
>> > > set up data pipelines.
>> > >
>> > > Best Regards,
>> > > Wei
>> > >
>> > > On Wed, Jul 29, 2020 at 5:32 AM Lucas Cardoso Silva <
>> > > cardosolucas61.lcs@gmail.com> wrote:
>> > >
>> > > > Great Wei! I find the suggestions really interesting. I think we can
>> > > > work with the deployment on K8s. The idea of it in Marvin would be that,
>> > > > after development, the user would give some parameters and a script would
>> > > > facilitate a deployment in a Kubernetes cluster, right? Regarding data
>> > > > acquisition, I think it would be great if we were able to integrate some
>> > > > third-party library like Luigi. Thanks!
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Jul 22, 2020 at 14:27, Wei Chen <weichen@apache.org>
>> > > > wrote:
>> > > >
>> > > > > Hello Lucas,
>> > > > >
>> > > > > I have some ideas:
>> > > > >
>> > > > > 1. Should we consider using K8S or similar tools for inference
>> > > > > container scaling and management?
>> > > > > Marvin's current container management is not as powerful as some
>> > > > > container-focused projects.
>> > > > > K8S can also be deployed into most environments now.
>> > > > >
>> > > > > 2. Is our current data cleaning stage flexible enough for multiple
>> > > > > data sources with table joins?
>> > > > > Or should we cut the data preparation stage out so that users can
>> > > > > build their own data pipeline on their data storage?
>> > > > > I figured that preprocessing might be too complex to be generalized
>> > > > > for different ML projects.
>> > > > >
>> > > > > Best Regards
>> > > > > Wei
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Thu, Jul 23, 2020 at 12:26 AM Lucas Cardoso Silva <
>> > > > > cardosolucas61.lcs@gmail.com> wrote:
>> > > > >
>> > > > > > Hi guys.
>> > > > > > I would like to know if anyone else has any ideas about this
>> > > > > > evaluation phase. The opinions of both those who have been in the
>> > > > > > community for a long time and those who are still getting to know
>> > > > > > Marvin are important for this step, so your suggestions or validation
>> > > > > > of the initial text are always welcome!
>> > > > > >
>> > > > > > Best regards,
>> > > > > > Lucas Cardoso
>> > > > > >
>> > > > > > On Fri, Jul 10, 2020 at 13:48, Lucas Cardoso Silva <
>> > > > > > cardosolucas61.lcs@gmail.com> wrote:
>> > > > > >
>> > > > > > > Hello guys. The time has come for us to take the first step in
>> > > > > > > architectural assessment: the definition of the mission. Basically
>> > > > > > > we have to decide here what is important in Marvin and what is
>> > > > > > > outside the scope of the project. This is important because, during
>> > > > > > > this analysis and the development process as a whole, we will be
>> > > > > > > able to segment what is really important and make things more simple
>> > > > > > > and functional. Also, if it looks cool, we can include that on the
>> > > > > > > Marvin-AI homepage.
>> > > > > > >
>> > > > > > > As stated earlier, I will post an initial draft and would like to
>> > > > > > > receive your feedback to complete a few points:
>> > > > > > >
>> > > > > > > The Apache Marvin-AI platform aims to offer:
>> > > > > > >
>> > > > > > >    - a practical and standardized solution,
>> > > > > > >    - for the development and deployment of machine learning
>> > > > > > >      applications.
>> > > > > > >
>> > > > > > >
>> > > > > > > Aiming to offer the user:
>> > > > > > >
>> > > > > > >    - scalability,
>> > > > > > >    - language agnosticism,
>> > > > > > >    - standardized pipeline (DASFE),
>> > > > > > >    - possibility of remote versioning of artifacts.
>> > > > > > >
>> > > > > > >
>> > > > > > > Does anyone have any suggestions for more important features,
>> > > > > > > resources or design decisions in Marvin?
>> > > > > > >
>> > > > > > > Thank you very much,
>> > > > > > >
>> > > > > > > Lucas Cardoso
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
