nutch-user mailing list archives

From Tizy Ninan <tizy1...@gmail.com>
Subject Re: Crawl images and store locally
Date Wed, 25 Mar 2015 04:54:02 GMT
Hi Chris,

Thanks for the reply.
I dumped the segment folder, and the output contains the image content as
raw bytes.
Thanks a lot.
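
For anyone finding this thread later, here is a rough sketch of how the
dump step could be scripted. It only assembles the command line and does not
run Nutch. The paths "crawl/segments" and "dump_out" are made-up examples,
and the -segment, -outputDir and -mimetype options are written as I
understand the dump tool's usage message, so please check them against the
usage text your Nutch version prints before relying on them.

```python
import shlex


def build_dump_command(segment_dir, output_dir, mimetypes=()):
    """Assemble a `bin/nutch dump` command line as an argument list.

    segment_dir and output_dir are caller-supplied paths (hypothetical
    examples below). The option names are assumptions to verify against
    the usage message of your Nutch version.
    """
    cmd = ["./bin/nutch", "dump", "-segment", segment_dir, "-outputDir", output_dir]
    if mimetypes:
        # Restrict the dump to the listed MIME types, e.g. images only.
        cmd.append("-mimetype")
        cmd.extend(mimetypes)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; replace with your own crawl layout.
    cmd = build_dump_command("crawl/segments", "dump_out", ["image/png", "image/jpeg"])
    # Print a shell-safe command string to copy and paste, or pass `cmd`
    # to subprocess.run from the Nutch runtime directory.
    print(shlex.join(cmd))
```

Keeping the command as a list avoids quoting problems if a path contains
spaces.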

Regards,
Tizy

On Tue, Mar 24, 2015 at 7:26 PM, Mattmann, Chris A (3980) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> Hi Tizy,
>
> After you crawl the images, take a look at ./bin/nutch dump to
> get the images out. ./bin/nutch commoncrawldumper also will
> dump into the common crawl format.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: Tizy Ninan <tizy1307@gmail.com>
> Reply-To: "dev@nutch.apache.org" <dev@nutch.apache.org>
> Date: Monday, March 23, 2015 at 11:12 PM
> To: "dev@nutch.apache.org" <dev@nutch.apache.org>, "user@nutch.apache.org"
> <user@nutch.apache.org>
> Subject: Crawl images and store locally
>
> >Hi,
> >
> >
> >Does Nutch support crawling images from webpages? If so, what are the
> >steps to retrieve the images and store them locally?
> >
> >
> >Thanks and Regards,
> >Tizy


-- 
Thanks and Regards,
Tizy
