storm-user mailing list archives

From "P. Taylor Goetz" <ptgo...@gmail.com>
Subject Re: Storm + HDFS
Date Wed, 03 Feb 2016 21:04:03 GMT
Assuming you have git and maven installed:

git clone git@github.com:apache/storm.git
cd storm
git checkout -b 1.x origin/1.x-branch
mvn install -DskipTests

That third step checks out the 1.x-branch branch, which is the base for the upcoming 1.0 release. The `mvn install` step then builds Storm and installs the 1.0.0-SNAPSHOT artifacts into your local Maven repository (~/.m2/repository), where your project can resolve them.

You can then include the storm-hdfs dependency in your project:

<dependency>
	<groupId>org.apache.storm</groupId>
	<artifactId>storm-hdfs</artifactId>
	<version>1.0.0-SNAPSHOT</version>
</dependency>

You can find more information on using the spout and other HDFS components here:

https://github.com/apache/storm/tree/1.x-branch/external/storm-hdfs#hdfs-spout
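For example, wiring the spout into a topology looks roughly like this (a sketch based on that README; the HDFS URI and directory paths are placeholders you would adjust for your cluster):

```java
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.hdfs.spout.HdfsSpout;
import org.apache.storm.hdfs.spout.TextFileReader;
import org.apache.storm.topology.TopologyBuilder;

public class HdfsSpoutTopology {
    public static void main(String[] args) throws Exception {
        // Read text files from /data/in, emitting one tuple per line.
        HdfsSpout hdfsSpout = new HdfsSpout()
                .setReaderType("text")
                .withOutputFields(TextFileReader.defaultFields)
                .setHdfsUri("hdfs://localhost:8020")   // NameNode RPC port, not the 50070 web UI
                .setSourceDir("/data/in")              // directory to pick files from
                .setArchiveDir("/data/done")           // fully-read files are moved here
                .setBadFilesDir("/data/bad");          // unreadable files are moved here

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("hdfsSpout", hdfsSpout, 1);
        // ... attach your bolts here ...

        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("hdfs-test", new Config(), builder.createTopology());
        Thread.sleep(60000);
        cluster.shutdown();
    }
}
```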

-Taylor

> On Feb 3, 2016, at 2:54 PM, K Zharas <kgzharas@gmail.com> wrote:
> 
> Oh ok. Can you please give me an idea of how I can do it manually? I'm quite a beginner :)
> 
> On Thu, Feb 4, 2016 at 3:43 AM, Parth Brahmbhatt <pbrahmbhatt@hortonworks.com> wrote:
> Storm-hdfs spout is not yet published in maven. You will have to check out storm locally and build it to make it available for development.
> 
> From: K Zharas <kgzharas@gmail.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Wednesday, February 3, 2016 at 11:41 AM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Re: Storm + HDFS
> 
> Yes, looks like it is. But I have added the dependencies required by storm-hdfs, as stated in a guide.
> 
> On Thu, Feb 4, 2016 at 3:33 AM, Nick R. Katsipoulakis <nick.katsip@gmail.com> wrote:
> Well,
> 
> those errors look like a problem with the way you build your jar file.
> Please make sure that you build your jar with the proper Storm maven dependency.
> 
> Cheers,
> Nick
> 
> On Wed, Feb 3, 2016 at 2:31 PM, K Zharas <kgzharas@gmail.com> wrote:
> It throws an error that the packages do not exist. I have also tried changing org.apache to backtype, and still got an error, but only for storm.hdfs.spout. Btw, I use Storm-0.10.0 and Hadoop-2.7.1.
> 
>    package org.apache.storm does not exist
>    package org.apache.storm does not exist
>    package org.apache.storm.generated does not exist
>    package org.apache.storm.metric does not exist
>    package org.apache.storm.topology does not exist
>    package org.apache.storm.utils does not exist
>    package org.apache.storm.utils does not exist
>    package org.apache.storm.hdfs.spout does not exist
>    package org.apache.storm.hdfs.spout does not exist
>    package org.apache.storm.topology.base does not exist
>    package org.apache.storm.topology does not exist
>    package org.apache.storm.tuple does not exist
>    package org.apache.storm.task does not exist
> 
> On Wed, Feb 3, 2016 at 8:57 PM, Matthias J. Sax <mjsax@apache.org> wrote:
> Storm does provide HdfsSpout and HdfsBolt already. Just use those,
> instead of writing your own spout/bolt:
> 
> https://github.com/apache/storm/tree/master/external/storm-hdfs
> 
> -Matthias
> 
> 
> On 02/03/2016 12:34 PM, K Zharas wrote:
> > Can anyone help to create a Spout which reads a file from HDFS?
> > I have tried with the code below, but it is not working.
> >
> > public void nextTuple() {
> >       Path pt=new Path("hdfs://localhost:50070/user/BCpredict.txt");
> >       FileSystem fs = FileSystem.get(new Configuration());
> >       BufferedReader br = new BufferedReader(new
> > InputStreamReader(fs.open(pt)));
> >       String line = br.readLine();
> >       while (line != null){
> >          System.out.println(line);
> >          line=br.readLine();
> >          _collector.emit(new Values(line));
> >       }
> > }
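As written, that loop also emits the wrong variable: it advances `line` before emitting, so the first line is printed but never emitted, and a trailing `null` is emitted at the end. A minimal corrected sketch (the class name and file path are hypothetical; note that 50070 is the NameNode web UI port, while the filesystem URI normally uses the RPC port, typically 8020):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class HdfsLineSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private BufferedReader reader;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
        try {
            // Open the file once, not on every nextTuple() call.
            Path path = new Path("hdfs://localhost:8020/user/BCpredict.txt");
            FileSystem fs = FileSystem.get(path.toUri(), new Configuration());
            reader = new BufferedReader(new InputStreamReader(fs.open(path)));
        } catch (IOException e) {
            throw new RuntimeException("failed to open HDFS file", e);
        }
    }

    @Override
    public void nextTuple() {
        try {
            // Emit at most one line per call; Storm calls nextTuple() repeatedly.
            String line = reader.readLine();
            if (line != null) {
                collector.emit(new Values(line));
            }
        } catch (IOException e) {
            throw new RuntimeException("failed to read from HDFS", e);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("line"));
    }
}
```

Emitting one tuple per nextTuple() call, rather than looping over the whole file, keeps the spout responsive to acks and backpressure.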
> >
> > On Tue, Feb 2, 2016 at 1:19 PM, K Zharas <kgzharas@gmail.com> wrote:
> >
> >     Hi.
> >
> >     I have a project I'm currently working on. The idea is to implement
> >     "scikit-learn" into Storm and integrate it with HDFS.
> >
> >     I've already implemented "scikit-learn". But currently I'm using a
> >     text file to read and write. However, I need to use HDFS, and I'm
> >     finding it hard to integrate with it.
> >
> >     Here is the link to GitHub:
> >     <https://github.com/kgzharas/StormTopologyTest>. (I only included
> >     the files that I used, not the whole project.)
> >
> >     Basically, I have a few questions, if you don't mind answering them:
> >     1) How to use HDFS to read and write?
> >     2) Is my "scikit-learn" implementation correct?
> >     3) How to create a Storm project? (Currently working in "storm-starter")
> >
> >     These questions may sound a bit silly, but I really can't find a
> >     proper solution.
> >
> >     Thank you for your attention to this matter.
> >     Sincerely, Zharas.
> >
> >
> >
> >
> > --
> > Best regards,
> > Zharas
> 
> 
> 
> 
> --
> Best regards,
> Zharas
> 
> 
> 
> --
> Nick R. Katsipoulakis,
> Department of Computer Science
> University of Pittsburgh
> 
> 
> 
> --
> Best regards,
> Zharas
> 
> 
> 
> --
> Best regards,
> Zharas

