From: Daniel Takabayashi
Date: Sat, 15 Aug 2020 12:02:30 -0700
Subject: Re: Marvin’s mission discussion
To: dev@marvin.apache.org

+1

On Sat, Aug 15, 2020 at 08:57, Lucas Bonatto Miguel <lucasbm88@apache.org> wrote:

> It's good, the only thing I would change would be to mention what sort of
> applications. Although we have AI in the name, one may mistakenly think
> Marvin is intended to serve any type of application.
>
> Best
>
> On Fri, Aug 14, 2020 at 11:37 AM Lucas Cardoso Silva
> <cardosolucas61.lcs@gmail.com> wrote:
>
> > Hi guys,
> >
> > Here comes the summarized Marvin mission:
> >
> > The Apache Marvin-AI platform aims to offer a practical and standardized
> > solution to help its users to perform data exploration, model development
> > and application lifecycle management, aiming to offer: scalability,
> > language agnosticism and a standardized pipeline.
> >
> > Thanks for the help,
> > Lucas Cardoso
> >
> > On Wed, Jul 29, 2020 at 17:05, Lucas Cardoso Silva
> > <cardosolucas61.lcs@gmail.com> wrote:
> >
> > > Hi guys!
> > > Great Lucas, I will wait a couple of days to see if anyone has other
> > > things to add, and then we can close this phase!
> > >
> > > Wei, we can discuss how to make the data pipelines easier for the users
> > > in another step of the evaluation. With the experience of the users and
> > > developers with this topic we can track their needs better and build
> > > use-case scenarios. I agree with you that data preparation is messy and
> > > can take a lot of time, and it would be great if Marvin could help with
> > > that.
> > >
> > > Best regards,
> > > Lucas
> > >
> > >
> > > On Wed, Jul 29, 2020 at 11:59, Wei Chen wrote:
> > >
> > >> Hello Lucas,
> > >>
> > >> I am thinking of processing JSON or XML files with a hierarchical,
> > >> dynamic structure.
> > >> Or building a pipeline to crop images using object detection metadata.
> > >> Data preparation can be very messy;
> > >> I wonder if we can have a stage to handle both batch and streaming
> > >> processing well.
> > >>
> > >> I simply think we don't need to focus on this part since we can utilize
> > >> a wide variety of tools for our specific needs.
> > >>
> > >> Best Regards,
> > >> Wei
> > >>
> > >>
> > >>
> > >> On Wed, Jul 29, 2020 at 8:48 PM Lucas Bonatto Miguel
> > >> <lucasbm88@apache.org> wrote:
> > >>
> > >> > Hi folks,
> > >> >
> > >> > In regards to the mission, you're correct. If I could summarize it,
> > >> > it would be like: *to help its users to perform data exploration,
> > >> > model development and application lifecycle management*.
> > >> >
> > >> > I'm all in for having a better integration with Kubernetes. I think
> > >> > that the first step is to create a new thread in order to design
> > >> > something following their operator pattern:
> > >> > https://kubernetes.io/docs/concepts/extend-kubernetes/operator/
> > >> >
> > >> > Wei, currently one can already perform merges and joins in the
> > >> > transformation step. Could you comment a bit more on what you think
> > >> > we could improve there? Maybe something for a new thread as well?
> > >> >
> > >> > Best!
> > >> > Lucas
> > >> >
> > >> > On Wed, Jul 29, 2020 at 1:24 AM Wei Chen wrote:
> > >> >
> > >> > > I think deploying to K8S does expand our capabilities for
> > >> > > inference scaling and management.
> > >> > > I am not familiar with Luigi, but it makes sense since we are
> > >> > > going to set up data pipelines.
> > >> > >
> > >> > > Best Regards,
> > >> > > Wei
> > >> > >
> > >> > > On Wed, Jul 29, 2020 at 5:32 AM Lucas Cardoso Silva
> > >> > > <cardosolucas61.lcs@gmail.com> wrote:
> > >> > >
> > >> > > > Great Wei! I find the suggestions really interesting. I think
> > >> > > > we can work with the deployment on K8s. The idea of it in
> > >> > > > Marvin would be that, after development, the user would give
> > >> > > > some parameters and a script would facilitate a deployment in
> > >> > > > a Kubernetes cluster, right? Regarding data acquisition, I
> > >> > > > think it would be great if we were able to integrate some
> > >> > > > third-party library like Luigi. Thanks!
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > On Wed, Jul 22, 2020 at 14:27, Wei Chen <weichen@apache.org>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hello Lucas,
> > >> > > > >
> > >> > > > > I have some ideas:
> > >> > > > >
> > >> > > > > 1. Should we consider using K8S or similar tools for
> > >> > > > > inference container scaling and management?
> > >> > > > > Marvin's current container management is not as powerful as
> > >> > > > > some container-focused projects.
> > >> > > > > K8S can also be deployed into most environments now.
> > >> > > > >
> > >> > > > > 2. Is our current data cleaning stage flexible enough for
> > >> > > > > multiple data sources with table joins?
> > >> > > > > Or should we cut the data preparation stage out so that
> > >> > > > > users can build their own data pipeline on their data
> > >> > > > > storage?
> > >> > > > > I figure that preprocessing might be too complex to be
> > >> > > > > generalized for different ML projects.
> > >> > > > >
> > >> > > > > Best Regards
> > >> > > > > Wei
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Thu, Jul 23, 2020 at 12:26 AM Lucas Cardoso Silva
> > >> > > > > <cardosolucas61.lcs@gmail.com> wrote:
> > >> > > > >
> > >> > > > > > Hi guys.
> > >> > > > > > I would like to know if anyone else has any ideas about
> > >> > > > > > this evaluation phase. The opinions of both those who
> > >> > > > > > have been in the community for a long time and those who
> > >> > > > > > are still getting to know Marvin are important for this
> > >> > > > > > step, so your suggestions or validation of the initial
> > >> > > > > > text are always welcome!
> > >> > > > > >
> > >> > > > > > Best regards,
> > >> > > > > > Lucas Cardoso
> > >> > > > > >
> > >> > > > > > On Fri, Jul 10, 2020 at 13:48, Lucas Cardoso Silva
> > >> > > > > > <cardosolucas61.lcs@gmail.com> wrote:
> > >> > > > > >
> > >> > > > > > > Hello guys. The time has come for us to take the first
> > >> > > > > > > step in the architectural assessment: the definition
> > >> > > > > > > of the mission. Basically, we have to decide here what
> > >> > > > > > > is important in Marvin and what is outside the scope
> > >> > > > > > > of the project. This is important because, during this
> > >> > > > > > > analysis and the development process as a whole, we
> > >> > > > > > > will be able to single out what is really important
> > >> > > > > > > and make things simpler and more functional. Also, if
> > >> > > > > > > it looks cool, we can include that on the Marvin-AI
> > >> > > > > > > homepage.
> > >> > > > > > >
> > >> > > > > > > As stated earlier, I will post an initial draft and
> > >> > > > > > > would like to receive your feedback to complete a few
> > >> > > > > > > points:
> > >> > > > > > >
> > >> > > > > > > The Apache Marvin-AI platform aims to offer:
> > >> > > > > > >
> > >> > > > > > >    - a practical and standardized solution,
> > >> > > > > > >    - for the development and deployment of machine
> > >> > > > > > >      learning applications.
> > >> > > > > > >
> > >> > > > > > > Aiming to offer the user:
> > >> > > > > > >
> > >> > > > > > >    - scalability,
> > >> > > > > > >    - language agnosticism,
> > >> > > > > > >    - standardized pipeline (DASFE),
> > >> > > > > > >    - possibility of remote versioning of artifacts.
> > >> > > > > > >
> > >> > > > > > > Does anyone have any suggestions for more important
> > >> > > > > > > features, resources or design decisions in Marvin?
> > >> > > > > > >
> > >> > > > > > > Thank you very much,
> > >> > > > > > >
> > >> > > > > > > Lucas Cardoso