Subject: svn commit: r1367777 [12/14] - in /lucene/dev/branches/pforcodec_3892: ./ dev-tools/ dev-tools/eclipse/ dev-tools/maven/ dev-tools/scripts/ lucene/ lucene/analysis/ lucene/analysis/common/ lucene/analysis/common/src/java/org/apache/lucene/analysis/ar/ ...
Date: Tue, 31 Jul 2012 20:59:01 -0000 To: commits@lucene.apache.org From: mikemccand@apache.org X-Mailer: svnmailer-1.0.8-patched Message-Id: <20120731205918.949432388B71@eris.apache.org> Modified: lucene/dev/branches/pforcodec_3892/solr/CHANGES.txt URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/CHANGES.txt?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/CHANGES.txt (original) +++ lucene/dev/branches/pforcodec_3892/solr/CHANGES.txt Tue Jul 31 20:58:32 2012 @@ -34,6 +34,15 @@ Velocity 1.6.4 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.3.5 +Upgrading from Solr 4.0.0-ALPHA +---------------------- + +Solr is now much more strict about requiring that the uniqueKeyField feature +(if used) must refer to a field which is not multiValued. If you upgrade from +an earlier version of Solr and see an error that your uniqueKeyField "can not +be configured to be multivalued" please add 'multiValued="false"' to the + declaration for your uniqueKeyField. See SOLR-3682 for more details. + Detailed Change List ---------------------- @@ -81,6 +90,24 @@ New Features already exist. To assert that the document must exist, use the optimistic concurrency feature by specifying a _version_ of 1. (yonik) +* LUCENE-2510, LUCENE-4044: Migrated Solr's Tokenizer-, TokenFilter-, and + CharFilterFactories to the lucene-analysis module. To add new analysis + modules to Solr (like ICU, SmartChinese, Morfologik,...), just drop in + the JAR files from Lucene's binary distribution into your Solr instance's + lib folder. The factories are automatically made available with SPI. + (Chris Male, Robert Muir, Uwe Schindler) + +* SOLR-3634, SOLR-3635: CoreContainer and CoreAdminHandler will now remember + and report back information about failures to initialize SolrCores. 
These + failures will be accessible from the web UI and CoreAdminHandler STATUS + command until they are "reset" by creating/renaming a SolrCore with the + same name. (hossman, steffkes) + +* SOLR-1280: Added commented-out example of the new script update processor + to the example configuration. See http://wiki.apache.org/solr/ScriptUpdateProcessor (ehatcher) + +* SOLR-3672: SimplePostTool: Improvements for posting files + Support for auto mode, recursive and wildcards (janhoy) Bug Fixes ---------------------- @@ -104,7 +131,8 @@ Bug Fixes * SOLR-3621: Fix rare concurrency issue when opening a new IndexWriter for replication or rollback. (Mark Miller) -* SOLR-1781: Replication index directories not always cleaned up. (Terje Sten Bjerkseth, Mark Miller) +* SOLR-1781: Replication index directories not always cleaned up. + (Markus Jelsma, Terje Sten Bjerkseth, Mark Miller) * SOLR-3639: Update ZooKeeper to 3.3.5 for a variety of bug fixes. (Mark Miller) @@ -134,6 +162,18 @@ Bug Fixes * SOLR-3623: Fixed inconsistent treatment of third-party dependencies for solr contribs analysis-extras & uima (hossman) +* SOLR-3652: Fixed range faceting to error instead of looping infinitely + when 'gap' is zero -- or effectively zero due to floating point arithmetic + underflow. (hossman) + +* SOLR-3648: Fixed VelocityResponseWriter template loading in SolrCloud mode. + For the example configuration, this means /browse now works with SolrCloud. + (janhoy, ehatcher) + +* SOLR-3677: Fixed missleading error message in web ui to distinguish between + no SolrCores loaded vs. no /admin/ handler available. + (hossman, steffkes) + Other Changes ---------------------- @@ -162,6 +202,10 @@ Other Changes * SOLR-3215: Clone SolrInputDocument when distrib indexing so that update processors after the distrib update process do not process the document twice. (Mark Miller) +* SOLR-3683: Improved error handling if an contains both an + explicit class attribute, as well as nested factories. 
(hossman) + +* SOLR-3682: Fail to parse schema.xml if uniqueKeyField is multivalued (hossman) ================== 4.0.0-ALPHA ================== More information about this release, including any errata related to the @@ -510,6 +554,11 @@ New Features * SOLR-3542: Add WeightedFragListBuilder for FVH and set it to default fragListBuilder in example solrconfig.xml. (Sebastian Lutze, koji) +* SOLR-2396: Add ICUCollationField to contrib/analysis-extras, which is much + more efficient than the Solr 3.x ICUCollationKeyFilterFactory, and also + supports Locale-sensitive range queries. (rmuir) + + Optimizations ---------------------- @@ -657,6 +706,10 @@ Bug Fixes the hashCode implementation of {!bbox} and {!geofilt} queries. (hossman) +* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories + are respected now (Stanislaw Osinski, Dawid Weiss) + + Other Changes ---------------------- @@ -842,6 +895,9 @@ Bug Fixes: * SOLR-3477: SOLR does not start up when no cores are defined (Tomás Fernández Löbbe via tommaso) +* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories + are respected now (Stanislaw Osinski, Dawid Weiss) + ================== 3.6.0 ================== More information about this release, including any errata related to the release notes, upgrade instructions, or other changes may be found online at: @@ -984,6 +1040,16 @@ New Features exception from being thrown by the default parser if "q" is missing. (yonik) SOLR-435: if q is "" then it's also acceptable. (dsmiley, hoss) +* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory. + These can be used to customize range query/sort behavior, for example to + support numeric collation, ignore punctuation/whitespace, ignore accents but + not case, control whether upper/lowercase values are sorted first, etc. (rmuir) + +* SOLR-2346: Add a chance to set content encoding explicitly via content type + of stream for extracting request handler. 
This is convenient when Tika's + auto detector cannot detect encoding, especially the text file is too short + to detect encoding. (koji) + Optimizations ---------------------- * SOLR-1931: Speedup for LukeRequestHandler and admin/schema browser. New parameter @@ -1145,6 +1211,35 @@ Bug Fixes * SOLR-3316: Distributed grouping failed when rows parameter was set to 0 and sometimes returned a wrong hit count as matches. (Cody Young, Martijn van Groningen) +* SOLR-3107: contrib/langid: When using the LangDetect implementation of + langid, set the random seed to 0, so that the same document is detected as + the same language with the same probability every time. + (Christian Moen via rmuir) + +* SOLR-2937: Configuring the number of contextual snippets used for + search results clustering. The hl.snippets parameter is now respected + by the clustering plugin, can be overridden by carrot.summarySnippets + if needed (Stanislaw Osinski). + +* SOLR-2938: Clustering on multiple fields. The carrot.title and + carrot.snippet can now take comma- or space-separated lists of + field names to cluster (Stanislaw Osinski). + +* SOLR-2939: Clustering of multilingual search results. The document's + language field be passed in the carrot.lang parameter, the carrot.lcmap + parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski). + +* SOLR-2940: Passing values for custom Carrot2 fields to Clustering component. + The custom field mapping are defined using the carrot.custom parameter + (Stanislaw Osinski). + +* SOLR-2941: NullPointerException on clustering component initialization + when schema does not have a unique key field (Stanislaw Osinski). + +* SOLR-2942: ClassCastException when passing non-textual fields to + clustering component (Stanislaw Osinski). + + Other Changes ---------------------- * SOLR-2922: Upgrade commons-io and commons-lang to 2.1 and 2.6, respectively. (koji) @@ -1196,6 +1291,12 @@ Other Changes repository). 
Also updated dependencies jackson-core-asl and jackson-mapper-asl (both v1.5.2 -> v1.7.4). (Dawid Weiss, Steve Rowe) +* SOLR-3295: netcdf jar is excluded from the binary release (and disabled in + ivy.xml) because it requires java 6. If you want to parse this content with + extracting request handler and are willing to use java 6, just add the jar. + (rmuir) + + Build ---------------------- * SOLR-2487: Add build target to package war without slf4j jars (janhoy) @@ -1250,6 +1351,9 @@ New Features request param that can be used to delete all but the most recent N backups. (James Dyer via hossman) +* SOLR-2839: Add alternative implementation to contrib/langid supporting 53 + languages, based on http://code.google.com/p/language-detection/ (rmuir) + Optimizations ---------------------- @@ -1326,6 +1430,9 @@ Bug Fixes * SOLR-2591: Remove commitLockTimeout option from solrconfig.xml (Luca Cavanna via Martijn van Groningen) +* SOLR-2746: Upgraded UIMA dependencies from *-2.3.1-SNAPSHOT.jar to *-2.3.1.jar. + + ================== 3.4.0 ================== Upgrading from Solr 3.3 @@ -1472,12 +1579,21 @@ Bug Fixes failed due to sort by function changes introduced in SOLR-1297 (Mitsu Hadeishi, hossman) +* SOLR-2706: contrib/clustering: The carrot.lexicalResourcesDir parameter + now works with absolute directories (Stanislaw Osinski) + +* SOLR-2692: contrib/clustering: Typo in param name fixed: "carrot.fragzise" + changed to "carrot.fragSize" (Stanislaw Osinski). + Other Changes ---------------------- * SOLR-2629: Eliminate deprecation warnings in some JSPs. (Bernd Fehling, hossman) +* SOLR-2743: Remove commons logging from contrib/extraction. (koji) + + Build ---------------------- @@ -1549,6 +1665,13 @@ New Features * SOLR-2610 -- Add an option to delete index through CoreAdmin UNLOAD action (shalin) +* SOLR-2480: Add ignoreTikaException flag to the extraction request handler so + that users can ignore TikaException but index meta data. 
+ (Shinichiro Abe, koji) + +* SOLR-2582: Use uniqueKey for error log in UIMAUpdateRequestProcessor. + (Tommaso Teofili via koji) + Optimizations ---------------------- @@ -1568,6 +1691,12 @@ Bug Fixes parameter is added to avoid excessive CPU time in extreme cases (e.g. long queries with many misspelled words). (James Dyer via rmuir) +* SOLR-2579: UIMAUpdateRequestProcessor ignore error fails if text.length() < 100. + (Elmer Garduno via koji) + +* SOLR-2581: UIMAToSolrMapper wrongly instantiates Type with reflection. + (Tommaso Teofili via koji) + Other Changes ---------------------- @@ -1607,6 +1736,10 @@ Upgrading from Solr 3.1 with update.chain rather than update.processor. The latter still works, but has been deprecated. +* just beneath ... is no longer supported. + It should move to UIMAUpdateRequestProcessorFactory setting. + See contrib/uima/README.txt for more details. (SOLR-2436) + Detailed Change List ---------------------- @@ -1627,6 +1760,18 @@ New Features Explanation objects in it's responses instead of Explanation.toString (hossman) +* SOLR-2448: Search results clustering updates: bisecting k-means + clustering algorithm added, loading of Carrot2 stop words from + /conf/carrot2 (SOLR-2449), using Solr's stopwords.txt + for clustering (SOLR-2450), output of cluster scores (SOLR-2505) + (Stanislaw Osinski, Dawid Weiss). + +* SOLR-2503: extend UIMAUpdateRequestProcessorFactory mapping function to + map feature value to dynamicField. (koji) + +* SOLR-2512: add ignoreErrors flag to UIMAUpdateRequestProcessorFactory so + that users can ignore exceptions in AE. (Tommaso Teofili, koji) + Optimizations ---------------------- @@ -1713,6 +1858,12 @@ Other Changes * SOLR-2528: Remove default="true" from HtmlEncoder in example solrconfig.xml, because html encoding confuses non-ascii users. 
(koji) +* SOLR-2387: add mock annotators for improved testing in contrib/uima, + (Tommaso Teofili via rmuir) + +* SOLR-2436: move uimaConfig to under the uima's update processor in + solrconfig.xml. (Tommaso Teofili, koji) + Build ---------------------- @@ -1970,6 +2121,26 @@ New Features * SOLR-1057: Add PathHierarchyTokenizerFactory. (ryan, koji) +* SOLR-1804: Re-enabled clustering component on trunk, updated to latest + version of Carrot2. No more LGPL run-time dependencies. This release of + C2 also does not have a specific Lucene dependency. + (Stanislaw Osinski, gsingers) + +* SOLR-2282: Add distributed search support for search result clustering. + (Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji) + +* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir) + +* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese) + tokenizer and filters to contrib/analysis-extras (rmuir) + +* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements + UAX#29, a unicode algorithm with good results for most languages, as well as + URL and E-mail tokenization according to the relevant RFCs. + (Tom Burton-West via rmuir) + +* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir) + Optimizations ---------------------- @@ -1991,6 +2162,10 @@ Optimizations * SOLR-2046: add common functions to scripts-util. (koji) +* SOLR-1684: Switch clustering component to use the + SolrIndexSearcher.doc(int, Set) method b/c it can use the document + cache (gsingers) + Bug Fixes ---------------------- * SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble) @@ -2245,6 +2420,15 @@ Bug Fixes * SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not thread safe and could throw an exception. 
(yonik) +* SOLR-1692: Fix bug in clustering component relating to carrot.produceSummary + option (gsingers) + +* SOLR-1756: The date.format setting for extraction request handler causes + ClassCastException when enabled and the config code that parses this setting + does not properly use the same iterator instance. + (Christoph Brill, Mark Miller) + + Other Changes ---------------------- @@ -2372,6 +2556,10 @@ Other Changes * SOLR-141: Errors and Exceptions are formated by ResponseWriter. (Mike Sokolov, Rich Cariens, Daniel Naber, ryan) +* SOLR-1902: Upgraded to Tika 0.8 and changed deprecated parse call + +* SOLR-1813: Add ICU4j to contrib/extraction libs and add tests for Arabic + extraction (Robert Muir via gsingers) Build ---------------------- @@ -2742,6 +2930,11 @@ New Features 84. SOLR-1449: Add elements to solrconfig.xml to specifying additional classpath directories and regular expressions. (hossman via yonik) +85. SOLR-1128: Added metadata output to extraction request handler "extract + only" option. (gsingers) + +86. SOLR-1274: Added text serialization output for extractOnly + (Peter Wolanin, gsingers) Optimizations ---------------------- @@ -3156,6 +3349,14 @@ Other Changes 50. SOLR-1357 SolrInputDocument cannot process dynamic fields (Lars Grote via noble) +51. SOLR-1075: Upgrade to Tika 0.3. See http://www.apache.org/dist/lucene/tika/CHANGES-0.3.txt (gsingers) + +52. SOLR-1310: Upgrade to Tika 0.4. Note there are some differences in + detecting Languages now in extracting request handler. + See http://www.lucidimagination.com/search/document/d6f1899a85b2a45c/vote_apache_tika_0_4_release_candidate_2#d6f1899a85b2a45c + for discussion on language detection. + See http://www.apache.org/dist/lucene/tika/CHANGES-0.4.txt. (gsingers) + Build ---------------------- 1. 
SOLR-776: Added in ability to sign artifacts via Ant for releases (gsingers) Modified: lucene/dev/branches/pforcodec_3892/solr/build.xml URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/build.xml?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/build.xml (original) +++ lucene/dev/branches/pforcodec_3892/solr/build.xml Tue Jul 31 20:58:32 2012 @@ -174,18 +174,18 @@ - + - + - - - - + + + + @@ -243,7 +243,7 @@ - + Modified: lucene/dev/branches/pforcodec_3892/solr/common-build.xml URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/common-build.xml?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/common-build.xml (original) +++ lucene/dev/branches/pforcodec_3892/solr/common-build.xml Tue Jul 31 20:58:32 2012 @@ -221,23 +221,23 @@ - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + @@ -248,8 +248,7 @@ depends="define-lucene-javadoc-url-SNAPSHOT,define-lucene-javadoc-url-release"/> - + Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/analysis-extras/ivy.xml URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/analysis-extras/ivy.xml?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/contrib/analysis-extras/ivy.xml (original) +++ lucene/dev/branches/pforcodec_3892/solr/contrib/analysis-extras/ivy.xml Tue Jul 31 20:58:32 2012 @@ -19,7 +19,7 @@ - + Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/clustering/src/java/org/apache/solr/handler/clustering/carrot2/SolrStopwordsCarrot2LexicalDataFactory.java URL: 
http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/clustering/src/java/org/apache/solr/handler/clustering/carrot2/SolrStopwordsCarrot2LexicalDataFactory.java?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/contrib/clustering/src/java/org/apache/solr/handler/clustering/carrot2/SolrStopwordsCarrot2LexicalDataFactory.java (original) +++ lucene/dev/branches/pforcodec_3892/solr/contrib/clustering/src/java/org/apache/solr/handler/clustering/carrot2/SolrStopwordsCarrot2LexicalDataFactory.java Tue Jul 31 20:58:32 2012 @@ -23,8 +23,8 @@ import java.util.Set; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.util.CharArraySet; import org.apache.lucene.analysis.util.TokenFilterFactory; -import org.apache.solr.analysis.CommonGramsFilterFactory; -import org.apache.solr.analysis.StopFilterFactory; +import org.apache.lucene.analysis.commongrams.CommonGramsFilterFactory; +import org.apache.lucene.analysis.core.StopFilterFactory; import org.apache.solr.analysis.TokenizerChain; import org.apache.solr.schema.IndexSchema; import org.carrot2.core.LanguageCode; Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/extraction/ivy.xml URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/extraction/ivy.xml?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/contrib/extraction/ivy.xml (original) +++ lucene/dev/branches/pforcodec_3892/solr/contrib/extraction/ivy.xml Tue Jul 31 20:58:32 2012 @@ -43,7 +43,6 @@ - @@ -52,8 +51,8 @@ - - + + Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java URL: 
http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java?rev=1367777&r1=1367776&r2=1367777&view=diff
==============================================================================
--- lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java (original)
+++ lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java Tue Jul 31 20:58:32 2012
@@ -25,6 +25,7 @@ import org.apache.solr.common.params.Sol
 import org.apache.solr.request.SolrQueryRequest;
 import org.apache.solr.response.SolrQueryResponse;
 import org.apache.solr.schema.IndexSchema;
+import org.apache.solr.schema.SchemaField;
 import org.apache.solr.update.AddUpdateCommand;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -97,7 +98,8 @@ public abstract class LanguageIdentifier
     }
     langField = params.get(LANG_FIELD, DOCID_LANGFIELD_DEFAULT);
     langsField = params.get(LANGS_FIELD, DOCID_LANGSFIELD_DEFAULT);
-    docIdField = params.get(DOCID_PARAM, DOCID_FIELD_DEFAULT);
+    SchemaField uniqueKeyField = schema.getUniqueKeyField();
+    docIdField = params.get(DOCID_PARAM, uniqueKeyField == null ? DOCID_FIELD_DEFAULT : uniqueKeyField.getName());
     fallbackValue = params.get(FALLBACK);
     if(params.get(FALLBACK_FIELDS, "").length() > 0) {
       fallbackFields = params.get(FALLBACK_FIELDS).split(",");

Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/test/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessorFactoryTest.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/test/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessorFactoryTest.java?rev=1367777&r1=1367776&r2=1367777&view=diff
==============================================================================
--- lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/test/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessorFactoryTest.java (original)
+++ lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/test/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessorFactoryTest.java Tue Jul 31 20:58:32 2012
@@ -24,7 +24,7 @@ import org.junit.Test;
 public class LangDetectLanguageIdentifierUpdateProcessorFactoryTest extends LanguageIdentifierUpdateProcessorFactoryTestCase {
   @Override
   protected LanguageIdentifierUpdateProcessor createLangIdProcessor(ModifiableSolrParams parameters) throws Exception {
-    return new LangDetectLanguageIdentifierUpdateProcessor(_parser.buildRequestFrom(null, parameters, null), resp, null);
+    return new LangDetectLanguageIdentifierUpdateProcessor(_parser.buildRequestFrom(h.getCore(), parameters, null), resp, null);
   }
   // this one actually works better it seems with short docs

Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/test/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactoryTest.java
URL:
http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/test/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactoryTest.java?rev=1367777&r1=1367776&r2=1367777&view=diff
==============================================================================
--- lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/test/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactoryTest.java (original)
+++ lucene/dev/branches/pforcodec_3892/solr/contrib/langid/src/test/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactoryTest.java Tue Jul 31 20:58:32 2012
@@ -22,6 +22,6 @@ import org.apache.solr.common.params.Mod
 public class TikaLanguageIdentifierUpdateProcessorFactoryTest extends LanguageIdentifierUpdateProcessorFactoryTestCase {
   @Override
   protected LanguageIdentifierUpdateProcessor createLangIdProcessor(ModifiableSolrParams parameters) throws Exception {
-    return new TikaLanguageIdentifierUpdateProcessor(_parser.buildRequestFrom(null, parameters, null), resp, null);
+    return new TikaLanguageIdentifierUpdateProcessor(_parser.buildRequestFrom(h.getCore(), parameters, null), resp, null);
   }
 }

Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/uima/README.txt
URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/uima/README.txt?rev=1367777&r1=1367776&r2=1367777&view=diff
==============================================================================
--- lucene/dev/branches/pforcodec_3892/solr/contrib/uima/README.txt (original)
+++ lucene/dev/branches/pforcodec_3892/solr/contrib/uima/README.txt Tue Jul 31 20:58:32 2012
@@ -1,3 +1,15 @@
+Apache Solr UIMA Metadata Extraction Library
+
+Introduction
+------------
+This module is intended to be used both as an UpdateRequestProcessor while indexing documents and as a set of tokenizer/filters
+to be configured inside the schema.xml for use during analysis phase.
+UIMAUpdateRequestProcessor purpose is to provide additional on the fly automatically generated fields to the Solr index.
+Such fields could be language, concepts, keywords, sentences, named entities, etc.
+UIMA based tokenizers/filters can be used either inside plain Lucene or as index/query analyzers to be defined
+inside the schema.xml of a Solr core to create/filter tokens using specific UIMA annotations.
+
+
 Getting Started
 ---------------
 To start using Solr UIMA Metadata Extraction Library you should go through the following configuration steps:

Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/uima/src/test-files/uima/uima-tokenizers-schema.xml
URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/uima/src/test-files/uima/uima-tokenizers-schema.xml?rev=1367777&r1=1367776&r2=1367777&view=diff
==============================================================================
--- lucene/dev/branches/pforcodec_3892/solr/contrib/uima/src/test-files/uima/uima-tokenizers-schema.xml (original)
+++ lucene/dev/branches/pforcodec_3892/solr/contrib/uima/src/test-files/uima/uima-tokenizers-schema.xml Tue Jul 31 20:58:32 2012
@@ -299,14 +299,14 @@
- -

Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/velocity/src/java/org/apache/solr/response/SolrVelocityResourceLoader.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/velocity/src/java/org/apache/solr/response/SolrVelocityResourceLoader.java?rev=1367777&r1=1367776&r2=1367777&view=diff
==============================================================================
--- lucene/dev/branches/pforcodec_3892/solr/contrib/velocity/src/java/org/apache/solr/response/SolrVelocityResourceLoader.java (original)
+++ lucene/dev/branches/pforcodec_3892/solr/contrib/velocity/src/java/org/apache/solr/response/SolrVelocityResourceLoader.java Tue Jul 31 20:58:32 2012
@@ -22,9 +22,12 @@ import org.apache.velocity.exception.Res
 import
org.apache.commons.collections.ExtendedProperties;
 import org.apache.solr.core.SolrResourceLoader;
+import java.io.IOException;
 import java.io.InputStream;
-// TODO: the name of this class seems ridiculous
+/**
+ * Velocity resource loader wrapper around Solr resource loader
+ */
 public class SolrVelocityResourceLoader extends ResourceLoader {
   private SolrResourceLoader loader;
@@ -39,7 +42,11 @@ public class SolrVelocityResourceLoader
   @Override
   public InputStream getResourceStream(String template_name) throws ResourceNotFoundException {
-    return loader.openResource(template_name);
+    try {
+      return loader.openResource("velocity/" + template_name);
+    } catch (IOException ioe) {
+      throw new ResourceNotFoundException(ioe);
+    }
   }
   @Override

Modified: lucene/dev/branches/pforcodec_3892/solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java?rev=1367777&r1=1367776&r2=1367777&view=diff
==============================================================================
--- lucene/dev/branches/pforcodec_3892/solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java (original)
+++ lucene/dev/branches/pforcodec_3892/solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java Tue Jul 31 20:58:32 2012
@@ -20,6 +20,7 @@ package org.apache.solr.response;
 import org.apache.solr.client.solrj.SolrResponse;
 import org.apache.solr.client.solrj.response.QueryResponse;
 import org.apache.solr.client.solrj.response.SolrResponseBase;
+import org.apache.solr.common.SolrException;
 import org.apache.solr.common.util.NamedList;
 import org.apache.solr.request.SolrQueryRequest;
 import org.apache.velocity.Template;
@@ -113,19 +114,32 @@ public class VelocityResponseWriter impl
   private VelocityEngine getEngine(SolrQueryRequest request) {
     VelocityEngine engine = new
VelocityEngine(); - String template_root = request.getParams().get("v.base_dir"); - File baseDir = new File(request.getCore().getResourceLoader().getConfigDir(), "velocity"); - if (template_root != null) { - baseDir = new File(template_root); - } - engine.setProperty(RuntimeConstants.FILE_RESOURCE_LOADER_PATH, baseDir.getAbsolutePath()); + engine.setProperty("params.resource.loader.instance", new SolrParamResourceLoader(request)); SolrVelocityResourceLoader resourceLoader = new SolrVelocityResourceLoader(request.getCore().getSolrConfig().getResourceLoader()); engine.setProperty("solr.resource.loader.instance", resourceLoader); + File fileResourceLoaderBaseDir = null; + try { + String template_root = request.getParams().get("v.base_dir"); + fileResourceLoaderBaseDir = new File(request.getCore().getResourceLoader().getConfigDir(), "velocity"); + if (template_root != null) { + fileResourceLoaderBaseDir = new File(template_root); + } + } catch (SolrException e) { + // no worries... probably in ZooKeeper mode and getConfigDir() isn't available, so we'll just ignore omit + // the file system resource loader + } + + if (fileResourceLoaderBaseDir != null) { + engine.setProperty(RuntimeConstants.FILE_RESOURCE_LOADER_PATH, fileResourceLoaderBaseDir.getAbsolutePath()); + engine.setProperty(RuntimeConstants.RESOURCE_LOADER, "params,file,solr"); + } else { + engine.setProperty(RuntimeConstants.RESOURCE_LOADER, "params,solr"); + } + // TODO: Externalize Velocity properties - engine.setProperty(RuntimeConstants.RESOURCE_LOADER, "params,file,solr"); String propFile = request.getParams().get("v.properties"); try { if (propFile == null) Modified: lucene/dev/branches/pforcodec_3892/solr/core/build.xml URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/core/build.xml?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/core/build.xml (original) +++ 
lucene/dev/branches/pforcodec_3892/solr/core/build.xml Tue Jul 31 20:58:32 2012 @@ -21,6 +21,8 @@ + + Modified: lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/analysis/LegacyHTMLStripCharFilterFactory.java URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/analysis/LegacyHTMLStripCharFilterFactory.java?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/analysis/LegacyHTMLStripCharFilterFactory.java (original) +++ lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/analysis/LegacyHTMLStripCharFilterFactory.java Tue Jul 31 20:58:32 2012 @@ -20,6 +20,7 @@ package org.apache.solr.analysis; import java.io.Reader; +import org.apache.lucene.analysis.charfilter.HTMLStripCharFilterFactory; import org.apache.lucene.analysis.util.CharFilterFactory; /** Modified: lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java (original) +++ lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java Tue Jul 31 20:58:32 2012 @@ -18,6 +18,7 @@ package org.apache.solr.cloud; */ import java.io.IOException; +import java.util.ArrayList; import java.util.Collections; import java.util.List; import java.util.concurrent.ExecutionException; @@ -58,7 +59,7 @@ import org.slf4j.LoggerFactory; public class RecoveryStrategy extends Thread implements SafeStopThread { private static final int MAX_RETRIES = 500; private static final int 
INTERRUPTED = MAX_RETRIES + 1; - private static final int START_TIMEOUT = 100; + private static final int STARTING_RECOVERY_DELAY = 1000; private static final String REPLICATION_HANDLER = "/replication"; @@ -243,7 +244,10 @@ public class RecoveryStrategy extends Th UpdateLog.RecentUpdates recentUpdates = ulog.getRecentUpdates(); try { recentVersions = recentUpdates.getVersions(ulog.numRecordsToKeep); - } finally { + } catch (Throwable t) { + SolrException.log(log, "Corrupt tlog - ignoring", t); + recentVersions = new ArrayList(0); + }finally { recentUpdates.close(); } @@ -409,10 +413,11 @@ public class RecoveryStrategy extends Th } try { - // if (!isClosed()) Thread.sleep(Math.min(START_TIMEOUT * retries, 60000)); - for (int i = 0; i cores = new LinkedHashMap(); + + protected final Map coreInitFailures = + Collections.synchronizedMap(new LinkedHashMap()); + protected boolean persistent = false; protected String adminPath = null; protected String managementPath = null; @@ -676,6 +680,7 @@ public class CoreContainer throw new IllegalStateException("This CoreContainer has been shutdown"); } old = cores.put(name, core); + coreInitFailures.remove(name); /* * set both the name of the descriptor and the name of the * core, since the descriptors name is used for persisting. 
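Note on the coreInitFailures bookkeeping in this CoreContainer change: the pattern (a synchronized LinkedHashMap, remove-then-put so the newest failure sorts last, and an immutable copy handed to callers) can be sketched in isolation. The class and method names here (InitFailureTracker, record, snapshot) are hypothetical and are not part of Solr. The sketch only illustrates the idea under those assumptions.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone illustration; not the CoreContainer code itself.
public class InitFailureTracker {
  // Insertion-ordered map; the synchronized wrapper guards single calls,
  // and compound operations lock on the map explicitly.
  private final Map<String, Exception> failures =
      Collections.synchronizedMap(new LinkedHashMap<String, Exception>());

  /** Records the outcome of an init attempt; a null failure means success. */
  public void record(String name, Exception failure) {
    synchronized (failures) {
      // remove first so a re-inserted name moves to the end (newest last)
      failures.remove(name);
      if (failure != null) {
        failures.put(name, failure);
      }
    }
  }

  /** Returns an immutable copy that later record() calls do not affect. */
  public Map<String, Exception> snapshot() {
    synchronized (failures) {
      return Collections.unmodifiableMap(
          new LinkedHashMap<String, Exception>(failures));
    }
  }

  public static void main(String[] args) {
    InitFailureTracker tracker = new InitFailureTracker();
    tracker.record("core1", new RuntimeException("bad config"));
    tracker.record("core2", new RuntimeException("bad schema"));
    tracker.record("core1", new RuntimeException("still bad")); // core1 moves last
    Map<String, Exception> before = tracker.snapshot();
    tracker.record("core2", null); // success clears the entry
    System.out.println(before.keySet());             // [core2, core1]
    System.out.println(tracker.snapshot().keySet()); // [core1]
  }
}
```

Copying inside the synchronized block means a caller iterating the returned map never races with a concurrent create or reload, at the cost of one small copy per call.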
@@ -750,105 +755,136 @@ public class CoreContainer * @throws org.xml.sax.SAXException */ public SolrCore create(CoreDescriptor dcore) throws ParserConfigurationException, IOException, SAXException { - // Make the instanceDir relative to the cores instanceDir if not absolute - File idir = new File(dcore.getInstanceDir()); - if (!idir.isAbsolute()) { - idir = new File(solrHome, dcore.getInstanceDir()); - } - String instanceDir = idir.getPath(); - log.info("Creating SolrCore '{}' using instanceDir: {}", - dcore.getName(), instanceDir); - // Initialize the solr config - SolrResourceLoader solrLoader = null; - - SolrConfig config = null; - String zkConfigName = null; - if(zkController == null) { - solrLoader = new SolrResourceLoader(instanceDir, libLoader, getCoreProps(instanceDir, dcore.getPropertiesName(),dcore.getCoreProperties())); - config = new SolrConfig(solrLoader, dcore.getConfigName(), null); - } else { - try { - String collection = dcore.getCloudDescriptor().getCollectionName(); - zkController.createCollectionZkNode(dcore.getCloudDescriptor()); - zkConfigName = zkController.readConfigName(collection); - if (zkConfigName == null) { - log.error("Could not find config name for collection:" + collection); - throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, - "Could not find config name for collection:" + collection); - } - solrLoader = new ZkSolrResourceLoader(instanceDir, zkConfigName, libLoader, getCoreProps(instanceDir, dcore.getPropertiesName(),dcore.getCoreProperties()), zkController); - config = getSolrConfigFromZk(zkConfigName, dcore.getConfigName(), solrLoader); - } catch (KeeperException e) { - log.error("", e); - throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, - "", e); - } catch (InterruptedException e) { - // Restore the interrupted status - Thread.currentThread().interrupt(); - log.error("", e); - throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, - "", e); - } - } - - IndexSchema schema = null; - 
if (indexSchemaCache != null) { - if (zkController != null) { - File schemaFile = new File(dcore.getSchemaName()); - if (!schemaFile.isAbsolute()) { - schemaFile = new File(solrLoader.getInstanceDir() + "conf" - + File.separator + dcore.getSchemaName()); - } - if (schemaFile.exists()) { - String key = schemaFile.getAbsolutePath() - + ":" - + new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT).format(new Date( - schemaFile.lastModified())); - schema = indexSchemaCache.get(key); - if (schema == null) { - log.info("creating new schema object for core: " + dcore.name); - schema = new IndexSchema(config, dcore.getSchemaName(), null); - indexSchemaCache.put(key, schema); - } else { - log.info("re-using schema object for core: " + dcore.name); - } - } + // :TODO: would be really nice if this method wrapped any underlying errors and only threw SolrException + + final String name = dcore.getName(); + Exception failure = null; + + try { + // Make the instanceDir relative to the cores instanceDir if not absolute + File idir = new File(dcore.getInstanceDir()); + if (!idir.isAbsolute()) { + idir = new File(solrHome, dcore.getInstanceDir()); + } + String instanceDir = idir.getPath(); + log.info("Creating SolrCore '{}' using instanceDir: {}", + dcore.getName(), instanceDir); + // Initialize the solr config + SolrResourceLoader solrLoader = null; + + SolrConfig config = null; + String zkConfigName = null; + if(zkController == null) { + solrLoader = new SolrResourceLoader(instanceDir, libLoader, getCoreProps(instanceDir, dcore.getPropertiesName(),dcore.getCoreProperties())); + config = new SolrConfig(solrLoader, dcore.getConfigName(), null); } else { - // TODO: handle caching from ZooKeeper - perhaps using ZooKeepers versioning - // Don't like this cache though - how does it empty as last modified changes? 
- } - } - if(schema == null){ - if(zkController != null) { try { - schema = getSchemaFromZk(zkConfigName, dcore.getSchemaName(), config, solrLoader); + String collection = dcore.getCloudDescriptor().getCollectionName(); + zkController.createCollectionZkNode(dcore.getCloudDescriptor()); + + zkConfigName = zkController.readConfigName(collection); + if (zkConfigName == null) { + log.error("Could not find config name for collection:" + collection); + throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, + "Could not find config name for collection:" + collection); + } + solrLoader = new ZkSolrResourceLoader(instanceDir, zkConfigName, libLoader, getCoreProps(instanceDir, dcore.getPropertiesName(),dcore.getCoreProperties()), zkController); + config = getSolrConfigFromZk(zkConfigName, dcore.getConfigName(), solrLoader); } catch (KeeperException e) { log.error("", e); throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, - "", e); + "", e); } catch (InterruptedException e) { // Restore the interrupted status Thread.currentThread().interrupt(); log.error("", e); throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, - "", e); + "", e); + } + } + + IndexSchema schema = null; + if (indexSchemaCache != null) { + if (zkController != null) { + File schemaFile = new File(dcore.getSchemaName()); + if (!schemaFile.isAbsolute()) { + schemaFile = new File(solrLoader.getInstanceDir() + "conf" + + File.separator + dcore.getSchemaName()); + } + if (schemaFile.exists()) { + String key = schemaFile.getAbsolutePath() + + ":" + + new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT).format(new Date( + schemaFile.lastModified())); + schema = indexSchemaCache.get(key); + if (schema == null) { + log.info("creating new schema object for core: " + dcore.name); + schema = new IndexSchema(config, dcore.getSchemaName(), null); + indexSchemaCache.put(key, schema); + } else { + log.info("re-using schema object for core: " + dcore.name); + } + } + } else { + // 
TODO: handle caching from ZooKeeper - perhaps using ZooKeepers versioning + // Don't like this cache though - how does it empty as last modified changes? + } + } + if(schema == null){ + if(zkController != null) { + try { + schema = getSchemaFromZk(zkConfigName, dcore.getSchemaName(), config, solrLoader); + } catch (KeeperException e) { + log.error("", e); + throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, + "", e); + } catch (InterruptedException e) { + // Restore the interrupted status + Thread.currentThread().interrupt(); + log.error("", e); + throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, + "", e); + } + } else { + schema = new IndexSchema(config, dcore.getSchemaName(), null); } - } else { - schema = new IndexSchema(config, dcore.getSchemaName(), null); } - } - SolrCore core = new SolrCore(dcore.getName(), null, config, schema, dcore); + SolrCore core = new SolrCore(dcore.getName(), null, config, schema, dcore); - if (zkController == null && core.getUpdateHandler().getUpdateLog() != null) { - // always kick off recovery if we are in standalone mode. - core.getUpdateHandler().getUpdateLog().recoverFromLog(); - } + if (zkController == null && core.getUpdateHandler().getUpdateLog() != null) { + // always kick off recovery if we are in standalone mode. + core.getUpdateHandler().getUpdateLog().recoverFromLog(); + } + + return core; - return core; + // :TODO: Java7... 
+ // http://docs.oracle.com/javase/7/docs/technotes/guides/language/catch-multiple.html + } catch (ParserConfigurationException e1) { + failure = e1; + throw e1; + } catch (IOException e2) { + failure = e2; + throw e2; + } catch (SAXException e3) { + failure = e3; + throw e3; + } catch (RuntimeException e4) { + failure = e4; + throw e4; + } finally { + synchronized (coreInitFailures) { + // remove first so insertion order is updated and newest is last + coreInitFailures.remove(name); + if (null != failure) { + coreInitFailures.put(name, failure); + } + } + } } - + /** * @return a Collection of registered SolrCores */ @@ -886,6 +922,32 @@ public class CoreContainer return lst; } + /** + * Returns an immutable Map of Exceptions that occured when initializing + * SolrCores (either at startup, or do to runtime requests to create cores) + * keyed off of the name (String) of the SolrCore that had the Exception + * during initialization. + *

+ * While the Map returned by this method is immutable and will not change
+ * once returned to the client, the source data used to generate this Map
+ * can be changed as various SolrCore operations are performed:
+ *
+ * <ul>
+ *  <li>Failed attempts to create new SolrCores will add new Exceptions.</li>
+ *  <li>Failed attempts to re-create a SolrCore using a name already contained in this Map will replace the Exception.</li>
+ *  <li>Failed attempts to reload a SolrCore will cause an Exception to be added to this list -- even though the existing SolrCore with that name will continue to be available.</li>
+ *  <li>Successful attempts to re-created a SolrCore using a name already contained in this Map will remove the Exception.</li>
+ *  <li>Registering an existing SolrCore with a name already contained in this Map (ie: ALIAS or SWAP) will remove the Exception.</li>
+ * </ul>
+ */ + public Map getCoreInitFailures() { + synchronized ( coreInitFailures ) { + return Collections.unmodifiableMap(new LinkedHashMap + (coreInitFailures)); + } + } + + // ---------------- Core name related methods --------------- /** * Recreates a SolrCore. @@ -897,61 +959,90 @@ public class CoreContainer * @throws IOException * @throws SAXException */ - public void reload(String name) throws ParserConfigurationException, IOException, SAXException { - name= checkDefault(name); - SolrCore core; - synchronized(cores) { - core = cores.get(name); - } - if (core == null) - throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "No such core: " + name ); - CoreDescriptor cd = core.getCoreDescriptor(); + // :TODO: would be really nice if this method wrapped any underlying errors and only threw SolrException + + Exception failure = null; + try { + + name= checkDefault(name); + SolrCore core; + synchronized(cores) { + core = cores.get(name); + } + if (core == null) + throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "No such core: " + name ); + + CoreDescriptor cd = core.getCoreDescriptor(); - File instanceDir = new File(cd.getInstanceDir()); - if (!instanceDir.isAbsolute()) { - instanceDir = new File(getSolrHome(), cd.getInstanceDir()); - } + File instanceDir = new File(cd.getInstanceDir()); + if (!instanceDir.isAbsolute()) { + instanceDir = new File(getSolrHome(), cd.getInstanceDir()); + } - log.info("Reloading SolrCore '{}' using instanceDir: {}", - cd.getName(), instanceDir.getAbsolutePath()); + log.info("Reloading SolrCore '{}' using instanceDir: {}", + cd.getName(), instanceDir.getAbsolutePath()); - SolrResourceLoader solrLoader; - if(zkController == null) { - solrLoader = new SolrResourceLoader(instanceDir.getAbsolutePath(), libLoader, getCoreProps(instanceDir.getAbsolutePath(), cd.getPropertiesName(),cd.getCoreProperties())); - } else { - try { - String collection = cd.getCloudDescriptor().getCollectionName(); - 
zkController.createCollectionZkNode(cd.getCloudDescriptor()); + SolrResourceLoader solrLoader; + if(zkController == null) { + solrLoader = new SolrResourceLoader(instanceDir.getAbsolutePath(), libLoader, getCoreProps(instanceDir.getAbsolutePath(), cd.getPropertiesName(),cd.getCoreProperties())); + } else { + try { + String collection = cd.getCloudDescriptor().getCollectionName(); + zkController.createCollectionZkNode(cd.getCloudDescriptor()); - String zkConfigName = zkController.readConfigName(collection); - if (zkConfigName == null) { - log.error("Could not find config name for collection:" + collection); + String zkConfigName = zkController.readConfigName(collection); + if (zkConfigName == null) { + log.error("Could not find config name for collection:" + collection); + throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, + "Could not find config name for collection:" + collection); + } + solrLoader = new ZkSolrResourceLoader(instanceDir.getAbsolutePath(), zkConfigName, libLoader, getCoreProps(instanceDir.getAbsolutePath(), cd.getPropertiesName(),cd.getCoreProperties()), zkController); + } catch (KeeperException e) { + log.error("", e); throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, - "Could not find config name for collection:" + collection); + "", e); + } catch (InterruptedException e) { + // Restore the interrupted status + Thread.currentThread().interrupt(); + log.error("", e); + throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, + "", e); } - solrLoader = new ZkSolrResourceLoader(instanceDir.getAbsolutePath(), zkConfigName, libLoader, getCoreProps(instanceDir.getAbsolutePath(), cd.getPropertiesName(),cd.getCoreProperties()), zkController); - } catch (KeeperException e) { - log.error("", e); - throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, - "", e); - } catch (InterruptedException e) { - // Restore the interrupted status - Thread.currentThread().interrupt(); - log.error("", e); - throw new 
ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, - "", e); } - } - SolrCore newCore = core.reload(solrLoader, core); - // keep core to orig name link - String origName = coreToOrigName.remove(core); - if (origName != null) { - coreToOrigName.put(newCore, origName); + SolrCore newCore = core.reload(solrLoader, core); + // keep core to orig name link + String origName = coreToOrigName.remove(core); + if (origName != null) { + coreToOrigName.put(newCore, origName); + } + register(name, newCore, false); + + // :TODO: Java7... + // http://docs.oracle.com/javase/7/docs/technotes/guides/language/catch-multiple.html + } catch (ParserConfigurationException e1) { + failure = e1; + throw e1; + } catch (IOException e2) { + failure = e2; + throw e2; + } catch (SAXException e3) { + failure = e3; + throw e3; + } catch (RuntimeException e4) { + failure = e4; + throw e4; + } finally { + synchronized (coreInitFailures) { + // remove first so insertion order is updated and newest is last + coreInitFailures.remove(name); + if (null != failure) { + coreInitFailures.put(name, failure); + } + } } - register(name, newCore, false); } private String checkDefault(String name) { Modified: lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/core/SolrConfig.java URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/core/SolrConfig.java?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/core/SolrConfig.java (original) +++ lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/core/SolrConfig.java Tue Jul 31 20:58:32 2012 @@ -436,9 +436,7 @@ public class SolrConfig extends Config { */ public List getPluginInfos(String type){ List result = pluginStore.get(type); - return result == null ? - (List) Collections.EMPTY_LIST: - result; + return result == null ? 
Collections.emptyList(): result; } public PluginInfo getPluginInfo(String type){ List result = pluginStore.get(type); @@ -446,29 +444,31 @@ public class SolrConfig extends Config { } private void initLibs() { - NodeList nodes = (NodeList) evaluate("lib", XPathConstants.NODESET); - if (nodes==null || nodes.getLength()==0) - return; + if (nodes == null || nodes.getLength() == 0) return; log.info("Adding specified lib dirs to ClassLoader"); - for (int i=0; iMUST * only be called prior to using this ResourceLoader to get any resources, otherwise - * it's behavior will be non-deterministic. + * it's behavior will be non-deterministic. You also have to {link @reloadLuceneSPI} + * before using this ResourceLoader. * * @param baseDir base directory whose children (either jars or directories of * classes) will be in the classpath, will be resolved relative @@ -145,7 +148,8 @@ public class SolrResourceLoader implemen * Adds the specific file/dir specified to the ClassLoader used by this * ResourceLoader. This method MUST * only be called prior to using this ResourceLoader to get any resources, otherwise - * it's behavior will be non-deterministic. + * it's behavior will be non-deterministic. You also have to {link #reloadLuceneSPI()} + * before using this ResourceLoader. * * @param path A jar file (or directory of classes) to be added to the classpath, * will be resolved relative the instance dir. @@ -164,6 +168,22 @@ public class SolrResourceLoader implemen } } + /** + * Reloads all Lucene SPI implementations using the new classloader. + * This method must be called after {@link #addToClassLoader(String)} + * and {@link #addToClassLoader(String,FileFilter)} before using + * this ResourceLoader. 
+ */ + void reloadLuceneSPI() { + // Codecs: + PostingsFormat.reloadPostingsFormats(this.classLoader); + Codec.reloadCodecs(this.classLoader); + // Analysis: + CharFilterFactory.reloadCharFilters(this.classLoader); + TokenFilterFactory.reloadTokenFilters(this.classLoader); + TokenizerFactory.reloadTokenizers(this.classLoader); + } + private static URLClassLoader replaceClassLoader(final URLClassLoader oldLoader, final File base, final FileFilter filter) { @@ -248,7 +268,7 @@ public class SolrResourceLoader implemen * Override this method to customize loading schema resources. *@return the stream for the named schema */ - public InputStream openSchema(String name) { + public InputStream openSchema(String name) throws IOException { return openResource(name); } @@ -256,7 +276,7 @@ public class SolrResourceLoader implemen * Override this method to customize loading config resources. *@return the stream for the named configuration */ - public InputStream openConfig(String name) { + public InputStream openConfig(String name) throws IOException { return openResource(name); } @@ -268,7 +288,7 @@ public class SolrResourceLoader implemen * Override this method to customize loading resources. 
*@return the stream for the named resource */ - public InputStream openResource(String resource) { + public InputStream openResource(String resource) throws IOException { InputStream is=null; try { File f0 = new File(resource); @@ -288,10 +308,10 @@ public class SolrResourceLoader implemen if (is == null) is = classLoader.getResourceAsStream(getConfigDir() + resource); } catch (Exception e) { - throw new RuntimeException("Error opening " + resource, e); + throw new IOException("Error opening " + resource, e); } if (is==null) { - throw new RuntimeException("Can't find resource '" + resource + "' in classpath or '" + getConfigDir() + "', cwd="+System.getProperty("user.dir")); + throw new IOException("Can't find resource '" + resource + "' in classpath or '" + getConfigDir() + "', cwd="+System.getProperty("user.dir")); } return is; } @@ -333,41 +353,23 @@ public class SolrResourceLoader implemen public List getLines(String resource, Charset charset) throws IOException{ - BufferedReader input = null; - ArrayList lines; try { - input = new BufferedReader(new InputStreamReader(openResource(resource), - charset.newDecoder() - .onMalformedInput(CodingErrorAction.REPORT) - .onUnmappableCharacter(CodingErrorAction.REPORT))); - - lines = new ArrayList(); - for (String word=null; (word=input.readLine())!=null;) { - // skip initial bom marker - if (lines.isEmpty() && word.length() > 0 && word.charAt(0) == '\uFEFF') - word = word.substring(1); - // skip comments - if (word.startsWith("#")) continue; - word=word.trim(); - // skip blank lines - if (word.length()==0) continue; - lines.add(word); - } + return WordlistLoader.getLines(openResource(resource), charset); } catch (CharacterCodingException ex) { throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, - "Error loading resource (wrong encoding?): " + resource, ex); - } finally { - if (input != null) - input.close(); + "Error loading resource (wrong encoding?): " + resource, ex); } - return lines; } /* * A static map of 
short class name to fully qualified class name */ - private static Map classNameCache = new ConcurrentHashMap(); + private static final Map classNameCache = new ConcurrentHashMap(); + // Using this pattern, legacy analysis components from previous Solr versions are identified and delegated to SPI loader: + private static final Pattern legacyAnalysisPattern = + Pattern.compile("((\\Q"+base+".analysis.\\E)|(\\Q"+project+".\\E))([\\p{L}_$][\\p{L}\\p{N}_$]+?)(TokenFilter|Filter|Tokenizer|CharFilter)Factory"); + /** * This method loads a class either with it's FQN or a short-name (solr.class-simplename or class-simplename). * It tries to load the class with the name that is given first and if it fails, it tries all the known @@ -394,6 +396,27 @@ public class SolrResourceLoader implemen } } Class clazz = null; + + // first try legacy analysis patterns, now replaced by Lucene's Analysis package: + final Matcher m = legacyAnalysisPattern.matcher(cname); + if (m.matches()) { + final String name = m.group(4); + log.trace("Trying to load class from analysis SPI using name='{}'", name); + try { + if (CharFilterFactory.class.isAssignableFrom(expectedType)) { + return clazz = CharFilterFactory.lookupClass(name).asSubclass(expectedType); + } else if (TokenizerFactory.class.isAssignableFrom(expectedType)) { + return clazz = TokenizerFactory.lookupClass(name).asSubclass(expectedType); + } else if (TokenFilterFactory.class.isAssignableFrom(expectedType)) { + return clazz = TokenFilterFactory.lookupClass(name).asSubclass(expectedType); + } else { + log.warn("'{}' looks like an analysis factory, but caller requested different class type: {}", cname, expectedType.getName()); + } + } catch (IllegalArgumentException ex) { + // ok, we fall back to legacy loading + } + } + // first try cname == full name try { return Class.forName(cname, true, classLoader).asSubclass(expectedType); @@ -425,6 +448,12 @@ public class SolrResourceLoader implemen } } } + + static final String empty[] = new 
String[0]; + + public T newInstance(String name, Class expectedType) { + return newInstance(name, expectedType, empty); + } public T newInstance(String cname, Class expectedType, String ... subpackages) { Class clazz = findClass(cname, expectedType, subpackages); @@ -568,7 +597,7 @@ public class SolrResourceLoader implemen /** * Tell all {@link ResourceLoaderAware} instances about the loader */ - public void inform( ResourceLoader loader ) + public void inform( ResourceLoader loader ) throws IOException { // make a copy to avoid potential deadlock of a callback adding to the list Modified: lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/handler/ReplicationHandler.java URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/handler/ReplicationHandler.java?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/handler/ReplicationHandler.java (original) +++ lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/handler/ReplicationHandler.java Tue Jul 31 20:58:32 2012 @@ -887,7 +887,7 @@ public class ReplicationHandler extends } // reboot the writer on the new index - core.getUpdateHandler().newIndexWriter(); + core.getUpdateHandler().newIndexWriter(true); } catch (IOException e) { LOG.warn("Unable to get IndexCommit on startup", e); Modified: lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/handler/SnapPuller.java URL: http://svn.apache.org/viewvc/lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/handler/SnapPuller.java?rev=1367777&r1=1367776&r2=1367777&view=diff ============================================================================== --- lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/handler/SnapPuller.java (original) +++ 
lucene/dev/branches/pforcodec_3892/solr/core/src/java/org/apache/solr/handler/SnapPuller.java Tue Jul 31 20:58:32 2012 @@ -324,7 +324,8 @@ public class SnapPuller { successfulInstall = false; boolean deleteTmpIdxDir = true; - final File indexDir = new File(core.getIndexDir()); + // make sure it's the newest known index dir... + final File indexDir = new File(core.getNewIndexDir()); Directory oldDirectory = null; try { downloadIndexFiles(isFullCopyNeeded, tmpIndexDir, latestGeneration); @@ -534,7 +535,7 @@ public class SnapPuller { SolrQueryRequest req = new LocalSolrQueryRequest(solrCore, new ModifiableSolrParams()); // reboot the writer on the new index and get a new searcher - solrCore.getUpdateHandler().newIndexWriter(); + solrCore.getUpdateHandler().newIndexWriter(true); try { // first try to open an NRT searcher so that the new