Date: Tue, 4 Apr 2017 17:55:41 +0000 (UTC)
From: "Stephen Sisk (JIRA)"
To: commits@beam.apache.org
Subject: [jira] [Created] (BEAM-1878) IO ITs: how to handle custom docker images?

Stephen Sisk created BEAM-1878:
----------------------------------

             Summary: IO ITs: how to handle custom docker images?
                 Key: BEAM-1878
                 URL: https://issues.apache.org/jira/browse/BEAM-1878
             Project: Beam
          Issue Type: Improvement
          Components: sdk-java-extensions
            Reporter: Stephen Sisk
            Assignee: Stephen Sisk

Summary: IO ITs that use data stores requiring custom Docker images can't currently run in a Kubernetes cluster (which is where we host our data stores). I have a couple of options for how to solve this and am looking for feedback from folks involved in creating IO ITs, as well as opinions on Kubernetes.

Details: We've discussed in the past that we'll want to allow developers to submit just a Dockerfile, and then we'll use that when creating the data store on Kubernetes. This is the case for ElasticsearchIO, and I assume more data stores in the future will want to do this. It's also looking like it'll be necessary to use custom Docker images for the HadoopInputFormatIO's Cassandra ITs - to run a Cassandra cluster, there doesn't seem to be a good image you can use out of the box.

In either case, in order to retrieve a Docker image, Kubernetes needs a container registry - it will read the Docker images from there. A simple private container registry doesn't work because Kubernetes config files are static: if local devs try to use the Kubernetes files, the files point at the private container registry, and the devs can't retrieve the images since they don't have access. They'd have to manually edit the files, which in theory is an option, but I don't consider that acceptable since it feels pretty unfriendly (it is simple, so if we really don't like the options below we can revisit it).

Quick summary of the options
=======================
We can:
* Start using something like Kubernetes Helm - this adds more dependencies and a small amount of complexity (this is my recommendation, but only by a little)
* Start pushing images to Docker Hub - this means they'll be publicly visible and raises the bar for maintenance of those images
* Host our own public container registry - this means running our own public service, with costs, etc.

I discussed the options in detail in my original email to dev@:
https://lists.apache.org/thread.html/ca53c338209a2120d710e2e775fce384c6b68dd7f207a807efa2534b@%3Cdev.beam.apache.org%3E

I ran into this question while working on getting the HIFIO Cassandra cluster running, so I might prototype with that.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
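
To illustrate why static config files are the sticking point, and what a templating tool such as Helm would buy us, here is a minimal sketch in Python. It is not Helm and not the project's actual setup: the pod name, image name, and registry hosts are all hypothetical, and `string.Template` merely stands in for whatever templating mechanism ends up being chosen. The point is only that the registry becomes a parameter, so local devs and the shared cluster can render the same file with different values.

```python
from string import Template

# Hypothetical pod manifest. The pod name and image name are illustrative
# only; the registry and tag are left as placeholders to be filled in.
MANIFEST_TEMPLATE = Template("""\
apiVersion: v1
kind: Pod
metadata:
  name: cassandra-it
spec:
  containers:
  - name: cassandra
    image: ${registry}/cassandra-it:${tag}
""")


def render_manifest(registry: str, tag: str = "latest") -> str:
    """Return the manifest text with the registry and tag substituted in."""
    return MANIFEST_TEMPLATE.substitute(registry=registry, tag=tag)


if __name__ == "__main__":
    # A local dev could point at a registry they can actually reach
    # (hypothetical host), without hand-editing the checked-in file.
    print(render_manifest("localhost:5000", tag="dev"))
```

With a static file, the `image:` line would be hard-coded to one registry; here the same template serves both a private registry and a local one.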