From: Jean-Philippe Caruana
Date: Mon, 01 Dec 2014 15:35:02 +0100
To: user@flume.apache.org
Subject: Re: support for Google Storage ?

Hi,

I managed to write to GS from Flume [1], but it is not fully working yet:
- files are created in the expected directories, but they are empty
- Flume throws a java.lang.OutOfMemoryError: Java heap space:

java.lang.OutOfMemoryError: Java heap space
    at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:79)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.create(GoogleHadoopFileSystemBase.java:820)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773)
    at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:96)

(complete stack trace here: http://pastebin.com/i5iSgCM3)
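
A minimal sketch of why the error can surface at stream creation, before any event is written: the trace above ends in the BufferedOutputStream constructor, and that constructor allocates its whole buffer up front. The buffer size the connector requests is not visible in the trace, so the class name, method name, and sizes below are hypothetical and only illustrate the mechanism.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;

// Illustrative only: not code from Flume or the GCS connector.
class EagerBufferSketch {

    // Returns true if a BufferedOutputStream with the requested buffer size
    // can be constructed. The buffer (a byte[]) is allocated in the
    // constructor, so a too-large request fails here, before any write.
    static boolean canAllocate(int bufferBytes) {
        try {
            new BufferedOutputStream(new ByteArrayOutputStream(), bufferBytes);
            return true;
        } catch (OutOfMemoryError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Maximum heap the JVM will use (controlled by -Xmx).
        System.out.println("max heap bytes: " + Runtime.getRuntime().maxMemory());

        // A small buffer fits easily.
        System.out.println("1 KiB buffer ok: " + canAllocate(1024));

        // A buffer close to the array size limit; whether this succeeds
        // depends on the heap size the JVM was started with.
        System.out.println("near-2-GiB buffer ok: " + canAllocate(Integer.MAX_VALUE - 8));
    }
}
```

Running it with different -Xmx values shows the second allocation succeeding or failing purely as a function of the available heap.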

Has anyone already experienced this?
Is it a bug in Google's gcs-connector-latest-hadoop2.jar?
Where should I look to find out what's wrong?

My configuration looks like this:
a1.sinks.hdfs_sink.hdfs.path = gs://bucket_name/%{env}/%{tenant}/%{type}/%Y-%m-%d
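
To make the path above concrete, here is a rough sketch of how such a path resolves for one event: %{name} is replaced with the value of the event header "name", and the date escapes are resolved from the event's "timestamp" header (milliseconds since the epoch), which the event needs to carry unless hdfs.useLocalTimeStamp is set. This only mimics the documented behaviour and is not Flume's own code; the header values, the use of UTC, and the handling of a missing header are assumptions of the sketch.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: mimics the HDFS sink path escapes, not Flume's code.
class PathEscapeSketch {

    private static final Pattern HEADER = Pattern.compile("%\\{([^}]+)\\}");

    static String expand(String path, Map<String, String> headers) {
        // Replace each %{name} with the value of the event header "name".
        // Assumption of this sketch: a missing header becomes an empty string.
        Matcher m = HEADER.matcher(path);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = headers.getOrDefault(m.group(1), "");
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);

        // Resolve %Y-%m-%d from the "timestamp" header (ms since epoch).
        // UTC is chosen here only to keep the example reproducible.
        Instant ts = Instant.ofEpochMilli(Long.parseLong(headers.get("timestamp")));
        String day = DateTimeFormatter.ofPattern("yyyy-MM-dd")
                .withZone(ZoneOffset.UTC)
                .format(ts);
        return sb.toString().replace("%Y-%m-%d", day);
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("env", "prod");       // hypothetical header values
        headers.put("tenant", "acme");
        headers.put("type", "clicks");
        headers.put("timestamp", "1417444502000"); // 2014-12-01 14:35:02 UTC

        // prints gs://bucket_name/prod/acme/clicks/2014-12-01
        System.out.println(expand(
                "gs://bucket_name/%{env}/%{tenant}/%{type}/%Y-%m-%d", headers));
    }
}
```

With a path like this, every distinct combination of env, tenant, type, and day maps to its own directory.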

I am running Flume in Docker.

[1] http://stackoverflow.com/questions/27174033/what-is-the-minimal-setup-needed-to-write-to-hdfs-gs-on-google-cloud-storage-wit

Thanks.


On 26/11/2014 17:05, Jean-Philippe Caruana wrote:
> Hi,
>
> I am a total newbie with Hadoop, so sorry if my questions sound stupid (please give me pointers).
>
> I would like to use Flume to send data to HDFS on Google Cloud:
> - does GS (Google Storage) support exist? It would be great to use a path like gs://some_path
> - where does the Flume agent need to be? When I see hdfs://some_path/ I wonder why there is no server address in the path
>
> In fact, I am looking for feedback about sending data to a Google Cloud Hadoop cluster from my own (on-premises) servers.
>
> Thanks
> -- 
> Jean-Philippe Caruana
> http://www.barreverte.fr

-- 
Jean-Philippe Caruana 
http://www.barreverte.fr