hadoop-common-user mailing list archives

From "elangovan anbalahan" <amazing.e...@gmail.com>
Subject Problems While doing Distributed search
Date Fri, 12 Dec 2008 15:05:59 GMT
I cannot get distributed search to work across two machines that each hold
indexed data.

I have crawled data on two machines: one on the server itself and the other
on a separate Linux laptop.

These are the changes I have made:
1) /etc/hosts
# /etc/hosts (for master AND slave)
192.168.1.106    master
192.168.1.105    slave

2) Created a server folder inside TOMCAT_HOME

3) Created a search-server.txt file inside that folder with the following content:
master 1234
slave  5678

4) Modified nutch-site.xml:
    <property>
        <name>searcher.dir</name>
        <value>/usr/share/tomcat6/server/search-server</value>
    </property>

5) On the server I ran this command:

bin/nutch server 1234 /path/to/crawledDir

6) On the slave I ran this command:
bin/nutch server 5678 /path/to/crawledDir

7) I opened http://localhost:8080/nutch-0.9/
and performed a search.

But it returns zero results.
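
For reference, the host and port pairs from steps 1 and 3 can be checked for
basic reachability from the Tomcat machine. The following is only a minimal
sketch: the function names (parse_servers, is_reachable, check_servers) are
hypothetical helpers and not part of Nutch, and a successful TCP connection
only shows that something is listening on that port, not that it is a Nutch
search server with a usable index.

```python
import socket


def parse_servers(text):
    """Parse 'host port' lines (the format shown in step 3) into (host, port) tuples."""
    servers = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and anything that is not exactly "host port".
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        servers.append((parts[0], int(parts[1])))
    return servers


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_servers(text, timeout=2.0):
    """Map each (host, port) entry to whether it accepted a connection."""
    return {(h, p): is_reachable(h, p, timeout) for h, p in parse_servers(text)}
```

Passing the contents of the file from step 3 to check_servers would give one
True/False entry per line, which at least separates a networking or
/etc/hosts problem from a search configuration problem.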


What am I doing wrong here? I have also attached the Tomcat logs.

Please help!

I checked the Tomcat logs, and this is what they contain:
2008-12-12 04:44:30,922 INFO  PluginRepository - Plugins: looking in:
/var/lib/tomcat6/webapps/nutch-0.9/WEB-INF/classes/plugins
2008-12-12 04:44:31,072 INFO  PluginRepository - Plugin Auto-activation
mode: [true]
2008-12-12 04:44:31,072 INFO  PluginRepository - Registered Plugins:
2008-12-12 04:44:31,072 INFO  PluginRepository -     the nutch core
extension points (nutch-extensionpoints)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Basic Query Filter
(query-basic)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Basic URL Normalizer
(urlnormalizer-basic)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Basic Indexing Filter
(index-basic)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Html Parse Plug-in
(parse-html)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Basic Summarizer
Plug-in (summary-basic)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Site Query Filter
(query-site)
2008-12-12 04:44:31,072 INFO  PluginRepository -     HTTP Framework
(lib-http)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Text Parse Plug-in
(parse-text)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Regex URL Filter
(urlfilter-regex)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Pass-through URL
Normalizer (urlnormalizer-pass)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Http Protocol Plug-in
(protocol-http)
2008-12-12 04:44:31,072 INFO  PluginRepository -     Regex URL Normalizer
(urlnormalizer-regex)
2008-12-12 04:44:31,072 INFO  PluginRepository -     OPIC Scoring Plug-in
(scoring-opic)
2008-12-12 04:44:31,072 INFO  PluginRepository -     CyberNeko HTML Parser
(lib-nekohtml)
2008-12-12 04:44:31,072 INFO  PluginRepository -     JavaScript Parser
(parse-js)
2008-12-12 04:44:31,072 INFO  PluginRepository -     URL Query Filter
(query-url)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Regex URL Filter
Framework (lib-regex-filter)
2008-12-12 04:44:31,073 INFO  PluginRepository - Registered
Extension-Points:
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch Summarizer
(org.apache.nutch.searcher.Summarizer)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch URL Normalizer
(org.apache.nutch.net.URLNormalizer)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch Protocol
(org.apache.nutch.protocol.Protocol)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch Analysis
(org.apache.nutch.analysis.NutchAnalyzer)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch URL Filter
(org.apache.nutch.net.URLFilter)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch Indexing Filter
(org.apache.nutch.indexer.IndexingFilter)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch Online Search
Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
2008-12-12 04:44:31,073 INFO  PluginRepository -     HTML Parse Filter
(org.apache.nutch.parse.HtmlParseFilter)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch Content Parser
(org.apache.nutch.parse.Parser)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch Scoring
(org.apache.nutch.scoring.ScoringFilter)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Nutch Query Filter
(org.apache.nutch.searcher.QueryFilter)
2008-12-12 04:44:31,073 INFO  PluginRepository -     Ontology Model Loader
(org.apache.nutch.ontology.Ontology)
2008-12-12 04:44:31,083 INFO  NutchBean - creating new bean
2008-12-12 04:44:31,106 INFO  NutchBean - opening indexes in
/usr/share/tomcat6/server/search-server/indexes
2008-12-12 04:44:31,167 INFO  Configuration - found resource
common-terms.utf8 at
file:/var/lib/tomcat6/webapps/nutch-0.9/WEB-INF/classes/common-terms.utf8
2008-12-12 04:44:31,175 INFO  NutchBean - opening segments in
/usr/share/tomcat6/server/search-server/segments
2008-12-12 04:44:31,189 INFO  SummarizerFactory - Using the first summarizer
extension found: Basic Summarizer
2008-12-12 04:44:31,189 INFO  NutchBean - opening linkdb in
/usr/share/tomcat6/server/search-server/linkdb
2008-12-12 04:44:31,199 INFO  NutchBean - query request from 127.0.0.1
2008-12-12 04:44:31,213 INFO  NutchBean - query: game
2008-12-12 04:44:31,213 INFO  NutchBean - lang: en
2008-12-12 04:44:31,307 INFO  NutchBean - searching for 20 raw hits
2008-12-12 04:44:31,358 INFO  NutchBean - total hits: 0
2008-12-12 04:44:33,528 INFO  NutchBean - query request from 127.0.0.1
2008-12-12 04:44:33,528 INFO  NutchBean - query: game
2008-12-12 04:44:33,528 INFO  NutchBean - lang: en
2008-12-12 04:44:33,530 INFO  NutchBean - searching for 20 raw hits
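
To make a log like the one above easier to scan, the NutchBean lines can be
pulled out on their own. This is a minimal sketch that assumes the line
format shown here (timestamp, level, logger name, message) and that long
entries may be wrapped onto a following line, as they are in this mail. The
helper names (join_wrapped, summarize) are hypothetical.

```python
import re

# A new log entry starts with a timestamp such as "2008-12-12 04:44:31,106 ".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def join_wrapped(lines):
    """Re-join entries that were wrapped across several lines."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            # Continuation of the previous entry.
            entries[-1] += " " + line
    return entries


def summarize(log_text):
    """Return (directories opened by NutchBean, list of reported hit counts)."""
    opened, hits = {}, []
    for entry in join_wrapped(log_text.splitlines()):
        m = re.search(r"NutchBean - opening (\w+) in (\S+)", entry)
        if m:
            opened[m.group(1)] = m.group(2)
            continue
        m = re.search(r"NutchBean - total hits: (\d+)", entry)
        if m:
            hits.append(int(m.group(1)))
    return opened, hits
```

Run on the log above, this should report the indexes, segments and linkdb
directories under /usr/share/tomcat6/server/search-server and a single hit
count of 0, which shows at a glance which paths the web application actually
opened.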
