nutch-user mailing list archives

From John Whelan <>
Subject Re: Problem running Nutch on Win 7 + Cygwin
Date Mon, 12 Dec 2011 04:12:23 GMT
Hi Milan, et al.,

I'm in much the same situation as you. My primary interest with Nutch is
running it on Windows platforms. What I have discovered is that installing
and configuring Nutch on Windows is a lot of work, and that many things can
go wrong along the way. From your e-mail, it sounds like you are
encountering the problems that tend to occur when directory names include
spaces. This bites Windows users in particular, since default locations
such as C:\Program Files contain a space, and a script that does not quote
its paths will split them at that space. (UNIX allows spaces in directory
names too, but its default paths rarely contain them.) The simplest
solution is to avoid using directories with spaces in the names.
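
For what it's worth, the failure mode is easy to reproduce in any POSIX
shell. The sketch below is only an illustration, not a copy of anything in
bin/nutch: the path in java_cmd is a made-up stand-in for a Java install
under "Program Files", and count_args is a helper invented for the demo. It
shows that an unquoted variable holding a path with a space is split into
two words, which is consistent with the shell then looking for a program
named "C:\Program".

```shell
#!/bin/sh
# Hypothetical path containing a space (an assumption for this demo,
# not a value taken from bin/nutch or from any real installation).
java_cmd='/cygdrive/c/Program Files/Java/bin/java'

# Demo helper: prints the number of arguments it received.
count_args() {
  echo "$#"
}

# Unquoted expansion: the shell splits the value at the space.
unquoted=$(count_args $java_cmd)

# Quoted expansion: the value is passed through as a single word.
quoted=$(count_args "$java_cmd")

echo "unquoted expansion -> $unquoted word(s)"
echo "quoted expansion   -> $quoted word(s)"
```

Installing to a path without spaces sidesteps the issue without having to
touch any scripts, which is why I suggest it as the simplest route.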

While I tend to personally dislike responses to questions that go along the
lines of "you really shouldn't be trying to do what you're doing", I do
have a somewhat interesting suggestion. A while back I had produced a
bundling of Nutch for Windows. This bundling had included an installer
which proceeded to install all the necessary components, and configure them
to work with each other. After several versions of this were released, I
ran into a problem caused by a version upgrade in Cygwin. At the time, I
didn't see a way to resolve the issue in a manner that I found acceptable.
The two main problems that I saw were in the impact to the installation
process, and what I believed was an impact to licensing. Since then, after
taking some time away from the problem, I came to the conclusion that I
could solve the technical problems and that I could live with the licensing
impacts. I am currently in the process of producing a bundled installation
of Nutch for Windows, with the goal of providing a simple means for people
to run Nutch on Windows. I'm currently in the stage of development where I
believe I have completed the product, and I am now working on putting
together a website to describe the product in detail. After seeing your
e-mail, I thought this might be a good opportunity to see if there is any
interest in beta testing the product.

The product I have produced is a bundling of Nutch with Solr on Windows.
Additionally, configuration has been done to focus the solution towards
supporting a private network that is looking to provide internal search
capabilities to the sites within that network. The primary target of this
focus is for companies to be able to provide to their employees a search
mechanism for their internal sites. The additional configurations that I
have done fall into the categories of usability and security. In the
usability camp, I have provided a graphical user interface for
administrative configuration, and I have also produced an XSL
transformation on the Solr results to display them as a user-facing HTML
page. The security aspects have focused on restricting access to Solr's
administrative (admin) and updating (update) functions. The downside, in
terms of licensing, is that this new version of the application is licensed
under GPL 3. While I tend to dislike the implications of GPL, this should be
a usable license for the target audience, and only become a problem if
someone wanted to sell a product or service based on this work. If there
was interest in obtaining more permissive licensing terms for portions of
this work, I would probably be willing provided that I didn't run afoul of
the demands of the underlying GPL components.

While not quite ready for release yet, it is very close. As I mentioned, the
main remaining task is to build a site to describe the application, which
will include publishing the installer binary to a reputable freeware site.
Given your e-mail, I thought this might be a good opportunity to kick
things off. Given that I have a pile of vacation to burn before the end of
the year, I anticipate that the official rollout will be sometime between
now and Christmas. The following URL points to a location where I have
placed the installer for this application. If you want, give it a shot. I'd
be interested in knowing what you thought.


On Wed, Dec 7, 2011 at 8:25 PM, Jean-François Gingras <> wrote:

> We currently run Nutch on Windows 7. We have installed it under C:\Nutch
> since a space in the path can lead to the error you mention.
> 2011/11/7 Lewis John Mcgibbney <>
> > Hi Milan,
> >
> > This is coming from someone who has not used Nutch with Windows for a good
> > 2 years... so please forgive if info is not accurate. Hopefully we can work
> > towards getting it sorted out and update the wiki accordingly.
> >
> > Firstly, can you please state which version of Nutch you're using? As you
> > mention Tomcat, I assume you're using 1.2; however, it would be highly
> > recommended to upgrade to at least 1.3 and use Solr for searching instead
> > of using a Lucene index within Tomcat. This is, however, obviously down to
> > yourself.
> >
> > Have you set any environment variables in your Windows set-up? You can do
> > this by going into control panel, system settings, advanced system
> > settings.
> >
> > Please try the above and post results. I'm not sure how many people are
> > using this set-up, but it would be nice to get the wiki updated
> > nonetheless.
> >
> > 2011/11/7 Milan Lučanský <>
> >
> > > Hi,
> > >
> > > I set up Nutch on Win 7, according to:
> > > GettingNutchRunningWithWindows
> > >
> > > When I type bin/nutch, the usage summary appears, so I assume Nutch is
> > > installed correctly.
> > >
> > > Then I moved on to the Nutch tutorial at:
> > > NutchTutorial
> > >
> > > But after trying to run command:
> > > bin/nutch crawl urls -dir crawl -depth 3 -topN 5
> > >
> > > I get the following error:
> > > bin/nutch: line 251: exec: C:\Program: not found
> > >
> > > I tried to search for solution but without any success.
> > >
> > > Please can you help me solve this problem? I have already noticed that
> > > Win 7 + Cygwin is not the best combination for Nutch, but I am not an
> > > experienced Linux/UNIX user, so I prefer a little overhead to switching
> > > to another operating system.
> > >
> > > I'm running Win 7, cygwin-1.7.9-1, Tomcat 7.0
> > >
> > > Thank you in advance.
> > >
> > > Best regards.
> > > Milan
> > >
> >
> >
> >
> > --
> > *Lewis*
> >
> --
> Jean-François Gingras
