nutch-user mailing list archives

From El-Glabro <deb...@t-hoster.com>
Subject Re: Nutch War file
Date Tue, 19 Jul 2011 14:18:02 GMT
I have solved my problem with this:
$ bin/nutch readseg -get crawl_urls/segments/XXXXXX http://thepagethatyouwanttosee/uri/
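
If you don't want to look up the segment name by hand each time, something like the sketch below may help. It is only a rough helper, not anything that ships with Nutch: latest_segment is a made-up function name, and it assumes the segment directories use the usual timestamp names (yyyyMMddHHmmss), so that a plain sort puts the newest one last.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): print the newest segment
# directory under a segments directory. Assumes segment directories are
# named by timestamp, so lexical order matches chronological order.
latest_segment() {
  # $1: the segments directory, e.g. crawl_urls/segments
  for d in "$1"/*/; do
    # Skip the unexpanded glob pattern when the directory is empty.
    [ -d "$d" ] && printf '%s\n' "${d%/}"
  done | sort | tail -n 1
}

# Example use, run from the Nutch install directory:
#   bin/nutch readseg -get "$(latest_segment crawl_urls/segments)" http://thepagethatyouwanttosee/uri/
```

The URL has to match the one that was actually fetched into that segment, so the newest segment is not always the right one.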


On 18/07/11 23:10, Sethi, Parampreet wrote:
> Hey Lewis,
> Thanks for the quick reply. I have setup Nutch with Solr and I am able to
> index the documents in solr server. How can I check the downloaded html
> content? I need to parse the content to fetch rich snippets data from
> various sites.
>
> Also, I have installed hadoop separately on my system and was trying to
> integrate hadoop with Nutch. Is there any tutorial available to do this?
>
> Thanks
> Param
> AIM : parampreetsethi
> Blog: http://param-techie.blogspot.com
>
> On 7/18/11 5:01 PM, "lewis john mcgibbney"<lewis.mcgibbney@gmail.com>
> wrote:
>
>    
>> Simple answer here is no.
>>
>> Both the web app and the Lucene index that previously shipped with Nutch
>> have been deprecated.
>>
>> Please have a look at the new tutorial [1] and the site for more
>> information on the new functionality and features which ship with Nutch 1.3.
>>
>> [1] http://wiki.apache.org/nutch/RunningNutchAndSolr
>>
>>
>> On Mon, Jul 18, 2011 at 9:52 PM, Sethi, Parampreet<
>> parampreet.sethi@teamaol.com>  wrote:
>>
>>      
>>> Hi All,
>>>
>>> I downloaded the source code for Nutch 1.3 version. I tried generating war
>>> file using command:
>>> ant war
>>>
>>> But I am getting error (I checked the build.xml, the war task is indeed
>>> missing.)
>>>
>>> BUILD FAILED
>>> Target "war" does not exist in the project "Nutch".
>>>
>>> Is there any other way to generate nutch.war in 1.3 version?
>>>
>>> Thanks
>>> Param
>>>
>>>

