nutch-user mailing list archives

From Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
Subject Re: problem with the Nutch and Hadoop Tutorial when starting to deploy Nutch to Single Machine
Date Mon, 05 Dec 2011 15:44:34 GMT
OK, so let's briefly go over the differences between how Nutch is now and
how it used to be.

1) Since Nutch 1.3, it does NOT ship with Hadoop. This has greatly
simplified deployment on ANY type of Hadoop environment.
2) You set $HADOOP_HOME just like you would set any other environment
variable.

export HADOOP_HOME=path/to/hadoop/installation
echo $HADOOP_HOME
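
As a quick sanity check, something like the following would tell you whether
the variable points somewhere usable. This is only a sketch: the function name
check_hadoop_home is made up for illustration, and it assumes the launcher
script sits at bin/hadoop under the installation directory.

```shell
# Sketch: check that a directory looks like a Hadoop installation.
# Assumption: the launcher script lives at <dir>/bin/hadoop.
# check_hadoop_home is a hypothetical helper, not part of Nutch or Hadoop.
check_hadoop_home() {
    # Use the first argument if given, otherwise fall back to $HADOOP_HOME.
    dir="${1:-${HADOOP_HOME:-}}"

    if [ -z "$dir" ]; then
        echo "HADOOP_HOME is not set" >&2
        return 1
    fi

    if [ ! -x "$dir/bin/hadoop" ]; then
        echo "no executable bin/hadoop under $dir" >&2
        return 2
    fi

    echo "HADOOP_HOME looks usable: $dir"
}
```

Calling check_hadoop_home with no argument checks the exported value. Note
that export only affects the current shell session, so the export line needs
to go in your shell start-up file if you want it to persist.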

What type of errors are you getting? I'm assuming that you're running Nutch
1.4 here.

2011/12/5 José Ignacio Ortiz de Galisteo <joseignacio@salir.com>

> Hi Lewis,
>
> thanks for your early reply.
>
> In fact, at this time we want to run Nutch on a single machine, but we
> still don't understand what is going on.
>
> We've followed the tutorial and now we have nutch installed and running
> with solr in local environment correctly. Where can we configure our
> $HADOOP_HOME variable? We did it before in hadoop-env.sh
>
> Let us work a while and we'll let you know when we have something.
>
> Thanks again,
> regards.
>
> 2011/12/5 Lewis John Mcgibbney <lewis.mcgibbney@gmail.com>
>
> > Hi José,
> >
> > If you look at what is generated when you build Nutch using ant
> > runtime, you will see the runtime/local and runtime/deploy
> > folders. To run in deploy mode, it is necessary to specify all of your
> > nutch-site.xml settings (and any other configuration, e.g. filters,
> > plugins etc.) BEFORE you build the project. This means that any
> > subsequent modifications you make to your configuration will require
> > you to rebuild the Nutch job archive. Once your $HADOOP_HOME
> > environment variable (which should be the same for each node you have
> > in your cluster) has been specified, you should be able to run your
> > Nutch classes using
> >
> > hadoop jar nutch-1.4.job path.to.nutch.Class -parameters
> >
> > Regrettably this tutorial is only accurate up until "Deploy Nutch to
> > Single Machine". Updating it has been on my list of things to do for
> > some time :0(
> >
> > Please get back if/when you run into some problems and we will try our
> > best to help out.
> >
> >
> > Lewis
> > 2011/12/5 José Ignacio Ortiz de Galisteo <joseignacio@salir.com>
> >
> > > Hello all.
> > >
> > > We are trying to follow this tutorial:
> > > http://wiki.apache.org/nutch/NutchHadoopTutorial
> > >
> > > We have been testing nutch in local mode. Now we need to deploy to our
> > > production environment. We tried following the tutorial with both 1.3
> > > and 1.4, hitting the same problems.
> > >
> > > At the "Deploy Nutch to Single Machine" step, the tutorial says we have
> > > to copy the files from the nutch build to the deploy directory. We've
> > > interpreted this to mean the "build" directory created by running ant,
> > > but we can't find any .sh files in there, and no "config" folder. The
> > > Hadoop build directories contain .sh files, but still no "config"
> > > folder. Do we need to merge these folders? Right now that makes no
> > > sense to us, so we don't know how to continue.
> > >
> > > Has the tutorial been updated for nutch 1.3?
> > >
> > > These are the commands which seem to require .sh files and a "config"
> > > folder:
> > >
> > > dos2unix /nutch/search/bin/*.sh /nutch/search/bin/hadoop /nutch/search/bin/nutch
> > > chmod 700 /nutch/search/bin/*.sh /nutch/search/bin/hadoop /nutch/search/bin/nutch
> > > dos2unix /nutch/search/config/*.sh
> > > chmod 700 /nutch/search/config/*.sh
> > >
> > > Thanks for all.
> > > Regards.
> > >
> >
> >
> >
> > --
> > *Lewis*
> >
>



-- 
*Lewis*
