lucene-java-user mailing list archives

From Kai_testing Middleton <kai_test...@yahoo.com>
Subject Re: StandardAnalyzer vs KeywordAnalyzer in Luke
Date Fri, 10 Aug 2007 22:40:43 GMT
The Nutch analyzer is NutchDocumentAnalyzer.  Does anyone know how to add it to the Luke
classpath?  I tried the script below, but it didn't work.

Note that the last line is
   java -jar lukeall-0.7.1.jar


export CLASSPATH=$NUTCH_HOME/lib/jetty-ext/ant.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/jetty-ext/commons-el.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/jetty-ext/jasper-compiler.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/jetty-ext/jasper-runtime.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/jetty-ext/jsp-api.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/pmd-ext/jakarta-oro-2.0.8.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/pmd-ext/jaxen-1.1-beta-7.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/pmd-ext/pmd-3.6.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/commons-cli-2.0-SNAPSHOT.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/commons-codec-1.3.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/commons-httpclient-3.0.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/commons-lang-2.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/commons-logging-1.0.4.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/commons-logging-api-1.0.4.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/hadoop-0.12.3-core.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/jakarta-oro-2.0.7.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/jets3t-0.5.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/jetty-5.1.4.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/junit-3.8.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/log4j-1.2.13.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/lucene-core-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/lucene-misc-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/servlet-api.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/taglibs-i18n.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/xerces-2_6_2-apis.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/lib/xerces-2_6_2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/Jama-1.0.1-patched.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/carrot2-filter-lingo.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/carrot2-local-core.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/carrot2-snowball-stemmers.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/carrot2-util-common.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/carrot2-util-tokenizer.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/clustering-carrot2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/commons-collections-3.1-patched.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/commons-pool-1.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/clustering-carrot2/violinstrings-1.0.2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/creativecommons/creativecommons.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/feed/feed.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/feed/rome-0.9.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/index-basic/index-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/index-more/index-more.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/language-identifier/language-identifier.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-http/lib-http.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-jakarta-poi/poi-3.0-alpha1-20050704.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-jakarta-poi/poi-scratchpad-3.0-alpha1-20050704.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-lucene-analyzers/lucene-analyzers-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-nekohtml/nekohtml-0.9.4.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-parsems/lib-parsems.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-regex-filter/lib-regex-filter.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-xml/jaxen-core.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-xml/jaxen-jdom.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-xml/jdom.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-xml/saxpath.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/lib-xml/xercesImpl.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/microformats-reltag/microformats-reltag.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/nutch-extensionpoints/nutch-extensionpoints.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/ontology/icu4j_2_6_1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/ontology/jena-2.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/ontology/ontology.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-ext/parse-ext.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-html/parse-html.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-html/tagsoup-1.0rc3.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-js/parse-js.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-msexcel/parse-msexcel.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-msword/parse-msword.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-mspowerpoint/parse-mspowerpoint.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-oo/parse-oo.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-pdf/PDFBox-0.7.2-log4j.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-pdf/parse-pdf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-rss/commons-feedparser-0.6-fork.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-rss/parse-rss.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-rss/xmlrpc-1.2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-swf/javaswf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-swf/parse-swf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-text/parse-text.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/parse-zip/parse-zip.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/protocol-file/protocol-file.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/protocol-ftp/commons-net-1.2.0-dev.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/protocol-ftp/protocol-ftp.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/protocol-http/protocol-http.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/protocol-httpclient/protocol-httpclient.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/query-basic/query-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/query-more/query-more.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/query-site/query-site.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/query-url/query-url.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/scoring-opic/scoring-opic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/subcollection/subcollection.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/summary-basic/summary-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/summary-lucene/lucene-highlighter-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/summary-lucene/summary-lucene.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/urlfilter-automaton/automaton.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/urlfilter-automaton/urlfilter-automaton.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/urlfilter-prefix/urlfilter-prefix.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/urlfilter-regex/urlfilter-regex.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/urlfilter-suffix/urlfilter-suffix.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/urlnormalizer-basic/urlnormalizer-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/urlnormalizer-pass/urlnormalizer-pass.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/plugins/urlnormalizer-regex/urlnormalizer-regex.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/Jama-1.0.1-patched.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/carrot2-filter-lingo.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/carrot2-local-core.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/carrot2-snowball-stemmers.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/carrot2-util-common.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/carrot2-util-tokenizer.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/commons-collections-3.1-patched.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/commons-pool-1.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/clustering-carrot2/lib/violinstrings-1.0.2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/feed/lib/rome-0.9.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-jakarta-poi/lib/poi-3.0-alpha1-20050704.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-jakarta-poi/lib/poi-scratchpad-3.0-alpha1-20050704.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-lucene-analyzers/lib/lucene-analyzers-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-nekohtml/lib/nekohtml-0.9.4.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-xml/lib/jaxen-core.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-xml/lib/jaxen-jdom.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-xml/lib/jdom.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-xml/lib/saxpath.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/lib-xml/lib/xercesImpl.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/ontology/lib/icu4j_2_6_1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/ontology/lib/jena-2.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/parse-html/lib/tagsoup-1.0rc3.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/parse-pdf/lib/PDFBox-0.7.2-log4j.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/parse-rss/lib/commons-feedparser-0.6-fork.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/parse-rss/lib/xmlrpc-1.2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/parse-swf/lib/javaswf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/protocol-ftp/lib/commons-net-1.2.0-dev.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/summary-lucene/lib/lucene-highlighter-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/src/plugin/urlfilter-automaton/lib/automaton.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/nutch-2007-06-27_06-52-44.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/clustering-carrot2/clustering-carrot2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-nekohtml/nekohtml-0.9.4.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-nekohtml/nekohtml-0.9.4.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/nutch-extensionpoints/nutch-extensionpoints.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/clustering-carrot2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/Jama-1.0.1-patched.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/commons-collections-3.1-patched.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/carrot2-util-common.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/violinstrings-1.0.2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/carrot2-util-tokenizer.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/carrot2-filter-lingo.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/carrot2-local-core.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/carrot2-snowball-stemmers.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/clustering-carrot2/commons-pool-1.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-html/parse-html.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-html/tagsoup-1.0rc3.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/creativecommons/creativecommons.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/protocol-file/protocol-file.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/feed/feed.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/feed/rome-0.9.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/index-basic/index-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/index-more/index-more.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/language-identifier/language-identifier.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-http/lib-http.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-jakarta-poi/poi-3.0-alpha1-20050704.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-jakarta-poi/poi-scratchpad-3.0-alpha1-20050704.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-lucene-analyzers/lucene-analyzers-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-parsems/lib-parsems.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-regex-filter/lib-regex-filter.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-xml/jaxen-core.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-xml/jdom.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-xml/xercesImpl.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-xml/saxpath.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/lib-xml/jaxen-jdom.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/microformats-reltag/microformats-reltag.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/ontology/ontology.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/ontology/jena-2.1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/ontology/icu4j_2_6_1.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/protocol-ftp/protocol-ftp.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/protocol-ftp/commons-net-1.2.0-dev.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/protocol-http/protocol-http.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/protocol-httpclient/protocol-httpclient.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-ext/parse-ext.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-js/parse-js.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-msexcel/parse-msexcel.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-mspowerpoint/parse-mspowerpoint.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-msword/parse-msword.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-oo/parse-oo.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-pdf/parse-pdf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-pdf/PDFBox-0.7.2-log4j.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-rss/parse-rss.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-rss/xmlrpc-1.2.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-rss/commons-feedparser-0.6-fork.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-swf/parse-swf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-swf/javaswf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-text/parse-text.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/parse-zip/parse-zip.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/query-basic/query-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/query-more/query-more.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/query-site/query-site.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/query-url/query-url.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/scoring-opic/scoring-opic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/summary-basic/summary-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/subcollection/subcollection.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/summary-lucene/summary-lucene.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/summary-lucene/lucene-highlighter-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/urlfilter-automaton/urlfilter-automaton.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/urlfilter-automaton/automaton.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/urlfilter-prefix/urlfilter-prefix.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/urlfilter-regex/urlfilter-regex.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/urlfilter-suffix/urlfilter-suffix.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/urlnormalizer-basic/urlnormalizer-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/urlnormalizer-pass/urlnormalizer-pass.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/plugins/urlnormalizer-regex/urlnormalizer-regex.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/nutch-extensionpoints/nutch-extensionpoints.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/creativecommons/creativecommons.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-html/parse-html.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/feed/feed.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-xml/jaxen-core.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-xml/jdom.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-xml/xercesImpl.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-xml/saxpath.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-xml/jaxen-jdom.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/protocol-file/protocol-file.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/index-basic/index-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/index-more/index-more.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/language-identifier/language-identifier.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-http/lib-http.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-jakarta-poi/poi-3.0-alpha1-20050704.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-jakarta-poi/poi-scratchpad-3.0-alpha1-20050704.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-lucene-analyzers/lucene-analyzers-2.2.0.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-parsems/lib-parsems.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/lib-regex-filter/lib-regex-filter.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/microformats-reltag/microformats-reltag.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/ontology/ontology.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/protocol-ftp/protocol-ftp.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/protocol-http/protocol-http.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/protocol-httpclient/protocol-httpclient.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-ext/parse-ext.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-js/parse-js.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-msexcel/parse-msexcel.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-mspowerpoint/parse-mspowerpoint.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-msword/parse-msword.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-oo/parse-oo.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-pdf/parse-pdf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-rss/parse-rss.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-swf/parse-swf.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-text/parse-text.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/parse-zip/parse-zip.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/query-basic/query-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/query-more/query-more.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/query-site/query-site.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/query-url/query-url.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/scoring-opic/scoring-opic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/summary-basic/summary-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/subcollection/subcollection.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/summary-lucene/summary-lucene.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/urlfilter-automaton/urlfilter-automaton.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/urlfilter-prefix/urlfilter-prefix.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/urlfilter-regex/urlfilter-regex.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/urlfilter-suffix/urlfilter-suffix.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/urlnormalizer-basic/urlnormalizer-basic.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/urlnormalizer-pass/urlnormalizer-pass.jar
export CLASSPATH=$CLASSPATH:$NUTCH_HOME/build/urlnormalizer-regex/urlnormalizer-regex.jar
java -jar lukeall-0.7.1.jar
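
One likely reason the script above has no effect, offered as a sketch rather than a confirmed fix: when the JVM is started with -jar, it takes user classes only from that jar and ignores both the CLASSPATH variable and -cp. Launching with -cp and an explicit main class avoids that. In the sketch below, the helper names build_classpath and launch_luke are hypothetical, and the main class org.getopt.luke.Luke is an assumption that should be checked against the Main-Class entry in the jar's META-INF/MANIFEST.MF.

```shell
#!/bin/sh

# Join every .jar found under the given directories into one
# colon-separated class path string (sorted for a stable order).
build_classpath() {
    find "$@" -name '*.jar' 2>/dev/null | sort | tr '\n' ':' | sed 's/:$//'
}

# Hypothetical launcher. It uses -cp instead of -jar so that the Nutch jars
# are actually visible to Luke. The main class name is an assumption; confirm
# it in the jar's manifest before relying on it.
launch_luke() {
    java -cp "lukeall-0.7.1.jar:$(build_classpath "$NUTCH_HOME")" org.getopt.luke.Luke
}
```

With NUTCH_HOME set, calling launch_luke from the directory that holds lukeall-0.7.1.jar would start Luke with the Nutch jars on its class path. Running build_classpath "$NUTCH_HOME" on its own prints the string, which is useful for checking that the Nutch jar is included.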


----- Original Message ----
From: Grant Ingersoll <gsingers@apache.org>
To: java-user@lucene.apache.org
Sent: Tuesday, August 7, 2007 5:51:43 PM
Subject: Re: StandardAnalyzer vs KeywordAnalyzer in Luke

Nutch uses its own Analyzer.  You should use the Analyzer that Nutch
uses in order to get proper results.  That may mean adding the Nutch
Analyzer to your Luke classpath.

-Grant

On Aug 7, 2007, at 7:22 PM, Kai_testing Middleton wrote:

> I'm invoking Luke like this:
>    java -jar lukeall-0.7.1.jar
> I run this query:
>    content:Nyarubuye
>
> When I use the StandardAnalyzer I get results but when I use the
> KeywordAnalyzer I don't get results.  Can someone explain this?
>
> My corpus was crawled and indexed using a nightly build of nutch (with Lucene
> 2.2, just like my Luke 0.7.1), crawling a bunch of news sites.  A legitimate
> result page would be:
> http://news.bbc.co.uk/2/hi/programmes/panorama/3582267.stm
>
> SimpleAnalyzer also works as does StopAnalyzer.  WhitespaceAnalyzer fails.
> (SnowballAnalyzer gives me a ClassDefNotFound exception).  PerfieldAnalyzer
> gives me a PerfieldAnalyzerWrapper error.
--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com