chukwa-commits mailing list archives

From ey...@apache.org
Subject svn commit: r1614808 [1/7] - in /chukwa/trunk: ./ conf/ contrib/solr/ contrib/solr/logs/ contrib/solr/logs/conf/ contrib/solr/logs/conf/clustering/ contrib/solr/logs/conf/clustering/carrot2/ contrib/solr/logs/conf/lang/ contrib/solr/logs/conf/velocity/...
Date Thu, 31 Jul 2014 04:05:02 GMT
Author: eyang
Date: Thu Jul 31 04:04:59 2014
New Revision: 1614808

URL: http://svn.apache.org/r1614808
Log:
CHUKWA-722. Added SolrWriter to stream data to SolrCloud.  (Eric Yang)

Added:
    chukwa/trunk/contrib/solr/
    chukwa/trunk/contrib/solr/logs/
    chukwa/trunk/contrib/solr/logs/README.txt
    chukwa/trunk/contrib/solr/logs/conf/
    chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_stopwords_english.json
    chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_synonyms_english.json
    chukwa/trunk/contrib/solr/logs/conf/admin-extra.html
    chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-bottom.html
    chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-top.html
    chukwa/trunk/contrib/solr/logs/conf/clustering/
    chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/
    chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/kmeans-attributes.xml
    chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/lingo-attributes.xml
    chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/stc-attributes.xml
    chukwa/trunk/contrib/solr/logs/conf/currency.xml
    chukwa/trunk/contrib/solr/logs/conf/elevate.xml
    chukwa/trunk/contrib/solr/logs/conf/lang/
    chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ca.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/contractions_fr.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ga.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/contractions_it.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/hyphenations_ga.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stemdict_nl.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stoptags_ja.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ar.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_bg.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ca.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ckb.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_cz.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_da.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_de.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_el.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_en.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_es.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_eu.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_fa.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_fi.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_fr.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ga.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_gl.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_hi.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_hu.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_hy.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_id.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_it.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ja.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_lv.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_nl.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_no.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_pt.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ro.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ru.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_sv.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_th.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_tr.txt
    chukwa/trunk/contrib/solr/logs/conf/lang/userdict_ja.txt
    chukwa/trunk/contrib/solr/logs/conf/mapping-FoldToASCII.txt
    chukwa/trunk/contrib/solr/logs/conf/mapping-ISOLatin1Accent.txt
    chukwa/trunk/contrib/solr/logs/conf/protwords.txt
    chukwa/trunk/contrib/solr/logs/conf/schema.xml
    chukwa/trunk/contrib/solr/logs/conf/scripts.conf
    chukwa/trunk/contrib/solr/logs/conf/solrconfig.xml
    chukwa/trunk/contrib/solr/logs/conf/spellings.txt
    chukwa/trunk/contrib/solr/logs/conf/stopwords.txt
    chukwa/trunk/contrib/solr/logs/conf/synonyms.txt
    chukwa/trunk/contrib/solr/logs/conf/update-script.js
    chukwa/trunk/contrib/solr/logs/conf/velocity/
    chukwa/trunk/contrib/solr/logs/conf/velocity/README.txt
    chukwa/trunk/contrib/solr/logs/conf/velocity/VM_global_library.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/browse.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/cluster.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/cluster_results.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/debug.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/did_you_mean.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/error.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/facet_fields.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/facet_pivot.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/facet_queries.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/facet_ranges.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/facets.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/footer.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/head.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/header.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/hit.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/hit_grouped.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/hit_plain.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/join_doc.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/jquery.autocomplete.css
    chukwa/trunk/contrib/solr/logs/conf/velocity/jquery.autocomplete.js
    chukwa/trunk/contrib/solr/logs/conf/velocity/layout.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/main.css
    chukwa/trunk/contrib/solr/logs/conf/velocity/mime_type_lists.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/pagination_bottom.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/pagination_top.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/product_doc.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/query.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/query_form.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/query_group.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/query_spatial.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/results_list.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/richtext_doc.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/suggest.vm
    chukwa/trunk/contrib/solr/logs/conf/velocity/tabs.vm
    chukwa/trunk/contrib/solr/logs/conf/xslt/
    chukwa/trunk/contrib/solr/logs/conf/xslt/example.xsl
    chukwa/trunk/contrib/solr/logs/conf/xslt/example_atom.xsl
    chukwa/trunk/contrib/solr/logs/conf/xslt/example_rss.xsl
    chukwa/trunk/contrib/solr/logs/conf/xslt/luke.xsl
    chukwa/trunk/contrib/solr/logs/conf/xslt/updateXml.xsl
    chukwa/trunk/contrib/solr/logs/core.properties
    chukwa/trunk/src/main/java/org/apache/hadoop/chukwa/datacollection/writer/solr/
    chukwa/trunk/src/main/java/org/apache/hadoop/chukwa/datacollection/writer/solr/SolrWriter.java
    chukwa/trunk/src/test/java/org/apache/hadoop/chukwa/datacollection/writer/solr/
    chukwa/trunk/src/test/java/org/apache/hadoop/chukwa/datacollection/writer/solr/TestSolrWriter.java
Modified:
    chukwa/trunk/CHANGES.txt
    chukwa/trunk/conf/chukwa-agent-conf.xml
    chukwa/trunk/pom.xml
    chukwa/trunk/src/packages/tarball/all.xml

Modified: chukwa/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/CHANGES.txt?rev=1614808&r1=1614807&r2=1614808&view=diff
==============================================================================
--- chukwa/trunk/CHANGES.txt (original)
+++ chukwa/trunk/CHANGES.txt Thu Jul 31 04:04:59 2014
@@ -12,6 +12,8 @@ Release 0.6 - Unreleased
 
   NEW FEATURES
 
+    CHUKWA-722. Added SolrWriter to stream data to SolrCloud.  (Eric Yang)
+
     CHUKWA-719. Added Kerberos support for HBaseWriter.  (Sreepathi Prasanna via Eric Yang)
 
     CHUKWA-715. Added Oozie Adaptor for collecting Oozie metrics.  (Sreepathi Prasanna via Eric Yang)

Modified: chukwa/trunk/conf/chukwa-agent-conf.xml
URL: http://svn.apache.org/viewvc/chukwa/trunk/conf/chukwa-agent-conf.xml?rev=1614808&r1=1614807&r2=1614808&view=diff
==============================================================================
--- chukwa/trunk/conf/chukwa-agent-conf.xml (original)
+++ chukwa/trunk/conf/chukwa-agent-conf.xml Thu Jul 31 04:04:59 2014
@@ -97,4 +97,13 @@
     <value>HADOOP</value>
   </property>
 
+  <property>
+    <name>solr.cloud.address</name>
+    <value>localhost:2181</value>
+  </property>
+
+  <property>
+    <name>solr.collection</name>
+    <value>logs</value>
+  </property>
 </configuration>
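
The two properties added above follow the usual name/value layout of a Hadoop-style configuration file. As a minimal sketch of that layout, the snippet below reads such pairs with the JDK XML parser. `AgentConfSketch` and its methods are hypothetical names invented for illustration; this is not how Chukwa itself loads its configuration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not a class from the Chukwa code base.
class AgentConfSketch {

  // Collects every <property> name/value pair from a Hadoop-style config document.
  static Map<String, String> parse(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      Map<String, String> props = new LinkedHashMap<>();
      NodeList nodes = doc.getElementsByTagName("property");
      for (int i = 0; i < nodes.getLength(); i++) {
        Element property = (Element) nodes.item(i);
        String name = childText(property, "name");
        String value = childText(property, "value");
        if (name != null && value != null) {
          props.put(name, value);
        }
      }
      return props;
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse configuration XML", e);
    }
  }

  // Returns the trimmed text of the first child element with the given tag, or null.
  private static String childText(Element parent, String tag) {
    NodeList children = parent.getElementsByTagName(tag);
    return children.getLength() == 0 ? null : children.item(0).getTextContent().trim();
  }

  public static void main(String[] args) {
    String xml =
        "<configuration>"
            + "<property><name>solr.cloud.address</name><value>localhost:2181</value></property>"
            + "<property><name>solr.collection</name><value>logs</value></property>"
            + "</configuration>";
    Map<String, String> conf = parse(xml);
    System.out.println("ZooKeeper address: " + conf.get("solr.cloud.address"));
    System.out.println("Collection: " + conf.get("solr.collection"));
  }
}
```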

Added: chukwa/trunk/contrib/solr/logs/README.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/README.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/README.txt (added)
+++ chukwa/trunk/contrib/solr/logs/README.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,82 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+Chukwa SolrCore Instance Directory
+==================================
+
+This directory is provided as an example of what an "Instance Directory"
+should look like for a Chukwa SolrCore.
+
+Basic Directory Structure
+-------------------------
+
+The Solr Home directory typically contains the following sub-directories...
+
+   conf/
+        This directory is mandatory and must contain your solrconfig.xml
+        and schema.xml.  Any other optional configuration files would also 
+        be kept here.
+
+   data/
+        This directory is the default location where Solr will keep your
+        index, and is used by the replication scripts for dealing with
+        snapshots.  You can override this location in the 
+        conf/solrconfig.xml.  Solr will create this directory if it does not 
+        already exist.
+
+   lib/
+        This directory is optional.  If it exists, Solr will load any Jars
+        found in this directory and use them to resolve any "plugins"
+        specified in your solrconfig.xml or schema.xml (ie: Analyzers,
+        Request Handlers, etc...).  Alternatively you can use the <lib>
+        syntax in conf/solrconfig.xml to direct Solr to your plugins.  See 
+        the example conf/solrconfig.xml file for details.
+
+Usage
+-----
+
+- Symlink this directory to solr-4.9.0/example/solr/logs.
+- Start SolrCloud with:
+
+  java -Dbootstrap_confdir=chukwa-0.6.0/etc/solr/logs/conf \
+       -Dcollection.configName=myconf -Djetty.port=7574 \
+       -DzkHost=localhost:2181 -jar start.jar
+
+- Configure chukwa-agent-conf.xml with a pipeline that includes SolrWriter.
+
+  <property>
+    <name>chukwa.pipeline</name>
+    <value>org.apache.hadoop.chukwa.datacollection.writer.solr.SolrWriter</value>
+    <description>Configure agent to write to solr</description>
+  </property>
+
+  <property>
+    <name>solr.cloud.address</name>
+    <value>localhost:2181</value>
+    <description>Solr cloud zookeeper address</description>
+  </property>
+
+  <property>
+    <name>solr.collection</name>
+    <value>logs</value>
+    <description>SolrCore Instance name</description>
+  </property>
+
+- Restart the Chukwa Agent and point a browser to:
+
+  http://localhost:7574/solr/logs/select?q=*:*&wt=json&indent=true
+
+This REST query matches all collected log entries (Solr returns 10 rows by default).
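
The select URL in the README's last step can also be assembled programmatically. The sketch below is illustrative only: `SolrSelectUrlSketch` and `build` are hypothetical names, and the host, port and collection are simply the example values from the README. It percent-encodes the query string, so `*:*` becomes `*%3A*`, which is an equivalent encoding of the same query.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of Chukwa or SolrJ.
class SolrSelectUrlSketch {

  // Builds a /select URL that asks for an indented JSON response.
  static String build(String host, int port, String collection, String query) {
    try {
      String encodedQuery = URLEncoder.encode(query, "UTF-8");
      return "http://" + host + ":" + port + "/solr/" + collection
          + "/select?q=" + encodedQuery + "&wt=json&indent=true";
    } catch (UnsupportedEncodingException e) {
      // UTF-8 support is mandatory on every Java platform.
      throw new IllegalStateException("UTF-8 is not available", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(build("localhost", 7574, "logs", "*:*"));
  }
}
```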

Added: chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_stopwords_english.json
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_stopwords_english.json?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_stopwords_english.json (added)
+++ chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_stopwords_english.json Thu Jul 31 04:04:59 2014
@@ -0,0 +1,38 @@
+{
+  "initArgs":{"ignoreCase":true},
+  "managedList":[
+    "a",
+    "an",
+    "and",
+    "are",
+    "as",
+    "at",
+    "be",
+    "but",
+    "by",
+    "for",
+    "if",
+    "in",
+    "into",
+    "is",
+    "it",
+    "no",
+    "not",
+    "of",
+    "on",
+    "or",
+    "stopworda",
+    "stopwordb",
+    "such",
+    "that",
+    "the",
+    "their",
+    "then",
+    "there",
+    "these",
+    "they",
+    "this",
+    "to",
+    "was",
+    "will",
+    "with"]}
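
To show what a managed stop-word list with `"ignoreCase":true` means in practice, here is a minimal sketch of case-insensitive stop-word filtering. It is an assumption-laden illustration, not Solr's implementation: `StopWordSketch` is a hypothetical class, and tokens are split on whitespace only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of stop-word filtering; not Solr code.
class StopWordSketch {
  private final Set<String> stopWords = new HashSet<>();
  private final boolean ignoreCase;

  StopWordSketch(List<String> words, boolean ignoreCase) {
    this.ignoreCase = ignoreCase;
    for (String word : words) {
      stopWords.add(normalize(word));
    }
  }

  // Lower-cases the token only when case is to be ignored.
  private String normalize(String token) {
    return ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
  }

  // Splits on whitespace and drops every token found in the stop-word set.
  List<String> filter(String text) {
    List<String> kept = new ArrayList<>();
    for (String token : text.trim().split("\\s+")) {
      if (!token.isEmpty() && !stopWords.contains(normalize(token))) {
        kept.add(token);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    StopWordSketch filter = new StopWordSketch(Arrays.asList("the", "is", "in"), true);
    System.out.println(filter.filter("The error is in HDFS"));
  }
}
```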

Added: chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_synonyms_english.json
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_synonyms_english.json?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_synonyms_english.json (added)
+++ chukwa/trunk/contrib/solr/logs/conf/_schema_analysis_synonyms_english.json Thu Jul 31 04:04:59 2014
@@ -0,0 +1,11 @@
+{
+  "initArgs":{
+    "ignoreCase":true,
+    "format":"solr"
+  },
+  "managedMap":{
+    "GB":["GiB","Gigabyte"],
+    "happy":["glad","joyful"],
+    "TV":["Television"]
+  }
+}

Added: chukwa/trunk/contrib/solr/logs/conf/admin-extra.html
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/admin-extra.html?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/admin-extra.html (added)
+++ chukwa/trunk/contrib/solr/logs/conf/admin-extra.html Thu Jul 31 04:04:59 2014
@@ -0,0 +1,24 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- The content of this page will be statically included into the top-
+right box of the cores overview page. Uncomment this as an example to 
+see where the content will show up.
+
+<img src="img/ico/construction.png"> This line will appear at the top-
+right box on collection1's Overview
+-->

Added: chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-bottom.html
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-bottom.html?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-bottom.html (added)
+++ chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-bottom.html Thu Jul 31 04:04:59 2014
@@ -0,0 +1,25 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- admin-extra.menu-bottom.html -->
+<!--
+<li>
+  <a href="#" style="background-image: url(img/ico/construction.png);">
+    LAST ITEM
+  </a>
+</li>
+-->

Added: chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-top.html
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-top.html?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-top.html (added)
+++ chukwa/trunk/contrib/solr/logs/conf/admin-extra.menu-top.html Thu Jul 31 04:04:59 2014
@@ -0,0 +1,25 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- admin-extra.menu-top.html -->
+<!--
+<li>
+  <a href="#" style="background-image: url(img/ico/construction.png);">
+    FIRST ITEM
+  </a>
+</li>
+-->

Added: chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/kmeans-attributes.xml
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/kmeans-attributes.xml?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/kmeans-attributes.xml (added)
+++ chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/kmeans-attributes.xml Thu Jul 31 04:04:59 2014
@@ -0,0 +1,19 @@
+<!-- 
+  Default configuration for the bisecting k-means clustering algorithm.
+  
+  This file can be loaded (and saved) by Carrot2 Workbench.
+  http://project.carrot2.org/download.html
+-->
+<attribute-sets default="attributes">
+    <attribute-set id="attributes">
+      <value-set>
+        <label>attributes</label>
+          <attribute key="MultilingualClustering.defaultLanguage">
+            <value type="org.carrot2.core.LanguageCode" value="ENGLISH"/>
+          </attribute>
+          <attribute key="MultilingualClustering.languageAggregationStrategy">
+            <value type="org.carrot2.text.clustering.MultilingualClustering$LanguageAggregationStrategy" value="FLATTEN_MAJOR_LANGUAGE"/>
+          </attribute>
+      </value-set>
+  </attribute-set>
+</attribute-sets>

Added: chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/lingo-attributes.xml
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/lingo-attributes.xml?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/lingo-attributes.xml (added)
+++ chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/lingo-attributes.xml Thu Jul 31 04:04:59 2014
@@ -0,0 +1,24 @@
+<!-- 
+  Default configuration for the Lingo clustering algorithm.
+
+  This file can be loaded (and saved) by Carrot2 Workbench.
+  http://project.carrot2.org/download.html
+-->
+<attribute-sets default="attributes">
+    <attribute-set id="attributes">
+      <value-set>
+        <label>attributes</label>
+          <!-- 
+          The language to assume for clustered documents.
+          For a list of allowed values, see: 
+          http://download.carrot2.org/stable/manual/#section.attribute.lingo.MultilingualClustering.defaultLanguage
+          -->
+          <attribute key="MultilingualClustering.defaultLanguage">
+            <value type="org.carrot2.core.LanguageCode" value="ENGLISH"/>
+          </attribute>
+          <attribute key="LingoClusteringAlgorithm.desiredClusterCountBase">
+            <value type="java.lang.Integer" value="20"/>
+          </attribute>
+      </value-set>
+  </attribute-set>
+</attribute-sets>
\ No newline at end of file

Added: chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/stc-attributes.xml
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/stc-attributes.xml?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/stc-attributes.xml (added)
+++ chukwa/trunk/contrib/solr/logs/conf/clustering/carrot2/stc-attributes.xml Thu Jul 31 04:04:59 2014
@@ -0,0 +1,19 @@
+<!-- 
+  Default configuration for the STC clustering algorithm.
+
+  This file can be loaded (and saved) by Carrot2 Workbench.
+  http://project.carrot2.org/download.html
+-->
+<attribute-sets default="attributes">
+    <attribute-set id="attributes">
+      <value-set>
+        <label>attributes</label>
+          <attribute key="MultilingualClustering.defaultLanguage">
+            <value type="org.carrot2.core.LanguageCode" value="ENGLISH"/>
+          </attribute>
+          <attribute key="MultilingualClustering.languageAggregationStrategy">
+            <value type="org.carrot2.text.clustering.MultilingualClustering$LanguageAggregationStrategy" value="FLATTEN_MAJOR_LANGUAGE"/>
+          </attribute>
+      </value-set>
+  </attribute-set>
+</attribute-sets>

Added: chukwa/trunk/contrib/solr/logs/conf/currency.xml
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/currency.xml?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/currency.xml (added)
+++ chukwa/trunk/contrib/solr/logs/conf/currency.xml Thu Jul 31 04:04:59 2014
@@ -0,0 +1,67 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- Example exchange rates file for CurrencyField type named "currency" in example schema -->
+
+<currencyConfig version="1.0">
+  <rates>
+    <!-- Updated from http://www.exchangerate.com/ at 2011-09-27 -->
+    <rate from="USD" to="ARS" rate="4.333871" comment="ARGENTINA Peso" />
+    <rate from="USD" to="AUD" rate="1.025768" comment="AUSTRALIA Dollar" />
+    <rate from="USD" to="EUR" rate="0.743676" comment="European Euro" />
+    <rate from="USD" to="BRL" rate="1.881093" comment="BRAZIL Real" />
+    <rate from="USD" to="CAD" rate="1.030815" comment="CANADA Dollar" />
+    <rate from="USD" to="CLP" rate="519.0996" comment="CHILE Peso" />
+    <rate from="USD" to="CNY" rate="6.387310" comment="CHINA Yuan" />
+    <rate from="USD" to="CZK" rate="18.47134" comment="CZECH REP. Koruna" />
+    <rate from="USD" to="DKK" rate="5.515436" comment="DENMARK Krone" />
+    <rate from="USD" to="HKD" rate="7.801922" comment="HONG KONG Dollar" />
+    <rate from="USD" to="HUF" rate="215.6169" comment="HUNGARY Forint" />
+    <rate from="USD" to="ISK" rate="118.1280" comment="ICELAND Krona" />
+    <rate from="USD" to="INR" rate="49.49088" comment="INDIA Rupee" />
+    <rate from="USD" to="XDR" rate="0.641358" comment="INTNL MON. FUND SDR" />
+    <rate from="USD" to="ILS" rate="3.709739" comment="ISRAEL Sheqel" />
+    <rate from="USD" to="JPY" rate="76.32419" comment="JAPAN Yen" />
+    <rate from="USD" to="KRW" rate="1169.173" comment="KOREA (SOUTH) Won" />
+    <rate from="USD" to="KWD" rate="0.275142" comment="KUWAIT Dinar" />
+    <rate from="USD" to="MXN" rate="13.85895" comment="MEXICO Peso" />
+    <rate from="USD" to="NZD" rate="1.285159" comment="NEW ZEALAND Dollar" />
+    <rate from="USD" to="NOK" rate="5.859035" comment="NORWAY Krone" />
+    <rate from="USD" to="PKR" rate="87.57007" comment="PAKISTAN Rupee" />
+    <rate from="USD" to="PEN" rate="2.730683" comment="PERU Sol" />
+    <rate from="USD" to="PHP" rate="43.62039" comment="PHILIPPINES Peso" />
+    <rate from="USD" to="PLN" rate="3.310139" comment="POLAND Zloty" />
+    <rate from="USD" to="RON" rate="3.100932" comment="ROMANIA Leu" />
+    <rate from="USD" to="RUB" rate="32.14663" comment="RUSSIA Ruble" />
+    <rate from="USD" to="SAR" rate="3.750465" comment="SAUDI ARABIA Riyal" />
+    <rate from="USD" to="SGD" rate="1.299352" comment="SINGAPORE Dollar" />
+    <rate from="USD" to="ZAR" rate="8.329761" comment="SOUTH AFRICA Rand" />
+    <rate from="USD" to="SEK" rate="6.883442" comment="SWEDEN Krona" />
+    <rate from="USD" to="CHF" rate="0.906035" comment="SWITZERLAND Franc" />
+    <rate from="USD" to="TWD" rate="30.40283" comment="TAIWAN Dollar" />
+    <rate from="USD" to="THB" rate="30.89487" comment="THAILAND Baht" />
+    <rate from="USD" to="AED" rate="3.672955" comment="U.A.E. Dirham" />
+    <rate from="USD" to="UAH" rate="7.988582" comment="UKRAINE Hryvnia" />
+    <rate from="USD" to="GBP" rate="0.647910" comment="UNITED KINGDOM Pound" />
+    
+    <!-- Cross-rates for some common currencies -->
+    <rate from="EUR" to="GBP" rate="0.869914" />  
+    <rate from="EUR" to="NOK" rate="7.800095" />  
+    <rate from="GBP" to="NOK" rate="8.966508" />  
+  </rates>
+</currencyConfig>
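
Each `<rate>` element above gives a multiplier from one currency to another. The sketch below shows the arithmetic with two of the listed rates. `ExchangeRateSketch` is a hypothetical class, and falling back to the inverse of a reverse rate is an assumption of this sketch rather than a statement about how Solr's currency field resolves rates.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of rate lookup; not Solr's exchange rate provider.
class ExchangeRateSketch {
  private final Map<String, Double> rates = new HashMap<>();

  void addRate(String from, String to, double rate) {
    rates.put(from + "->" + to, rate);
  }

  // Uses the direct rate if listed, otherwise the inverse of the reverse rate.
  double convert(double amount, String from, String to) {
    if (from.equals(to)) {
      return amount;
    }
    Double direct = rates.get(from + "->" + to);
    if (direct != null) {
      return amount * direct;
    }
    Double reverse = rates.get(to + "->" + from);
    if (reverse != null) {
      return amount / reverse;
    }
    throw new IllegalArgumentException("No rate for " + from + " to " + to);
  }

  public static void main(String[] args) {
    ExchangeRateSketch table = new ExchangeRateSketch();
    table.addRate("USD", "EUR", 0.743676); // value taken from the file above
    table.addRate("EUR", "GBP", 0.869914); // value taken from the file above
    System.out.println(table.convert(100.0, "USD", "EUR"));
    System.out.println(table.convert(100.0, "GBP", "EUR"));
  }
}
```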

Added: chukwa/trunk/contrib/solr/logs/conf/elevate.xml
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/elevate.xml?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/elevate.xml (added)
+++ chukwa/trunk/contrib/solr/logs/conf/elevate.xml Thu Jul 31 04:04:59 2014
@@ -0,0 +1,38 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- If this file is found in the config directory, it will only be
+     loaded once at startup.  If it is found in Solr's data
+     directory, it will be re-loaded every commit.
+
+   See http://wiki.apache.org/solr/QueryElevationComponent for more info
+
+-->
+<elevate>
+ <query text="foo bar">
+  <doc id="1" />
+  <doc id="2" />
+  <doc id="3" />
+ </query>
+ 
+ <query text="ipod">
+   <doc id="MA147LL/A" />  <!-- put the actual ipod at the top -->
+   <doc id="IW-02" exclude="true" /> <!-- exclude this cable -->
+ </query>
+ 
+</elevate>

Added: chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ca.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ca.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ca.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ca.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,8 @@
+# Set of Catalan contractions for ElisionFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+d
+l
+m
+n
+s
+t

Added: chukwa/trunk/contrib/solr/logs/conf/lang/contractions_fr.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/contractions_fr.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/contractions_fr.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/contractions_fr.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,15 @@
+# Set of French contractions for ElisionFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+l
+m
+t
+qu
+n
+s
+j
+d
+c
+jusqu
+quoiqu
+lorsqu
+puisqu
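
The contraction lists above feed an elision filter, which strips a listed prefix and its apostrophe from the front of a token (for example `l'avion` becomes `avion`). The following is a minimal sketch of that idea under stated assumptions: `ElisionSketch` is a hypothetical class, prefixes are matched case-insensitively, and both the ASCII and the typographic apostrophe are accepted. It is not Lucene's ElisionFilter.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of elision removal; not Lucene code.
class ElisionSketch {
  private final Set<String> contractions;

  ElisionSketch(String... contractions) {
    this.contractions = new HashSet<>(Arrays.asList(contractions));
  }

  // Removes "<contraction>'" from the start of the token when the prefix is listed.
  String strip(String token) {
    int apostrophe = indexOfApostrophe(token);
    if (apostrophe <= 0) {
      return token;
    }
    String prefix = token.substring(0, apostrophe).toLowerCase(Locale.ROOT);
    return contractions.contains(prefix) ? token.substring(apostrophe + 1) : token;
  }

  // Finds the first ASCII or typographic apostrophe, or -1 if there is none.
  private static int indexOfApostrophe(String token) {
    for (int i = 0; i < token.length(); i++) {
      char c = token.charAt(i);
      if (c == '\'' || c == '\u2019') {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    ElisionSketch french = new ElisionSketch("l", "m", "t", "qu", "n", "s", "j", "d", "c",
        "jusqu", "quoiqu", "lorsqu", "puisqu");
    System.out.println(french.strip("l'avion"));
    System.out.println(french.strip("aujourd'hui"));
  }
}
```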

Added: chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ga.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ga.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ga.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/contractions_ga.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,5 @@
+# Set of Irish contractions for ElisionFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+d
+m
+b

Added: chukwa/trunk/contrib/solr/logs/conf/lang/contractions_it.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/contractions_it.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/contractions_it.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/contractions_it.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,23 @@
+# Set of Italian contractions for ElisionFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+c
+l 
+all 
+dall 
+dell 
+nell 
+sull 
+coll 
+pell 
+gl 
+agl 
+dagl 
+degl 
+negl 
+sugl 
+un 
+m 
+t 
+s 
+v 
+d

Added: chukwa/trunk/contrib/solr/logs/conf/lang/hyphenations_ga.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/hyphenations_ga.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/hyphenations_ga.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/hyphenations_ga.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,5 @@
+# Set of Irish hyphenations for StopFilter
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+h
+n
+t

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stemdict_nl.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stemdict_nl.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stemdict_nl.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stemdict_nl.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,6 @@
+# Set of overrides for the Dutch stemmer
+# TODO: load this as a resource from the analyzer and sync it in build.xml
+fiets	fiets
+bromfiets	bromfiets
+ei	eier
+kind	kinder

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stoptags_ja.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stoptags_ja.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stoptags_ja.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stoptags_ja.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,420 @@
+#
+# This file defines a Japanese stoptag set for JapanesePartOfSpeechStopFilter.
+#
+# Any token whose part-of-speech tag exactly matches one defined in this
+# file is removed from the token stream.
+#
+# Set your own stoptags by uncommenting the lines below.  Note that comments are
+# not allowed on the same line as a stoptag.  See LUCENE-3745 for frequency lists,
+# etc. that can be useful for building your own stoptag set.
+#
+# The entire possible tagset is provided below for convenience.
+#
+#####
+#  noun: unclassified nouns
+#名詞
+#
+#  noun-common: Common nouns or nouns where the sub-classification is undefined
+#名詞-一般
+#
+#  noun-proper: Proper nouns where the sub-classification is undefined 
+#名詞-固有名詞
+#
+#  noun-proper-misc: miscellaneous proper nouns
+#名詞-固有名詞-一般
+#
+#  noun-proper-person: Personal names where the sub-classification is undefined
+#名詞-固有名詞-人名
+#
+#  noun-proper-person-misc: names that cannot be divided into surname and 
+#  given name; foreign names; names where the surname or given name is unknown.
+#  e.g. お市の方
+#名詞-固有名詞-人名-一般
+#
+#  noun-proper-person-surname: Mainly Japanese surnames.
+#  e.g. 山田
+#名詞-固有名詞-人名-姓
+#
+#  noun-proper-person-given_name: Mainly Japanese given names.
+#  e.g. 太郎
+#名詞-固有名詞-人名-名
+#
+#  noun-proper-organization: Names representing organizations.
+#  e.g. 通産省, NHK
+#名詞-固有名詞-組織
+#
+#  noun-proper-place: Place names where the sub-classification is undefined
+#名詞-固有名詞-地域
+#
+#  noun-proper-place-misc: Place names excluding countries.
+#  e.g. アジア, バルセロナ, 京都
+#名詞-固有名詞-地域-一般
+#
+#  noun-proper-place-country: Country names. 
+#  e.g. 日本, オーストラリア
+#名詞-固有名詞-地域-国
+#
+#  noun-pronoun: Pronouns where the sub-classification is undefined
+#名詞-代名詞
+#
+#  noun-pronoun-misc: miscellaneous pronouns: 
+#  e.g. それ, ここ, あいつ, あなた, あちこち, いくつ, どこか, なに, みなさん, みんな, わたくし, われわれ
+#名詞-代名詞-一般
+#
+#  noun-pronoun-contraction: Spoken language contraction made by combining a 
+#  pronoun and the particle 'wa'.
+#  e.g. ありゃ, こりゃ, こりゃあ, そりゃ, そりゃあ 
+#名詞-代名詞-縮約
+#
+#  noun-adverbial: Temporal nouns such as names of days or months that behave 
+#  like adverbs. Nouns that represent amount or ratios and can be used adverbially,
+#  e.g. 金曜, 一月, 午後, 少量
+#名詞-副詞可能
+#
+#  noun-verbal: Nouns that take arguments with case and can appear followed by 
+#  'suru' and related verbs (する, できる, なさる, くださる)
+#  e.g. インプット, 愛着, 悪化, 悪戦苦闘, 一安心, 下取り
+#名詞-サ変接続
+#
+#  noun-adjective-base: The base form of adjectives, words that appear before な ("na")
+#  e.g. 健康, 安易, 駄目, だめ
+#名詞-形容動詞語幹
+#
+#  noun-numeric: Arabic numbers, Chinese numerals, and counters like 何 (回), 数.
+#  e.g. 0, 1, 2, 何, 数, 幾
+#名詞-数
+#
+#  noun-affix: noun affixes where the sub-classification is undefined
+#名詞-非自立
+#
+#  noun-affix-misc: Of adnominalizers, the case-marker の ("no"), and words that 
+#  attach to the base form of inflectional words, words that cannot be classified 
+#  into any of the other categories below. This category includes indefinite nouns.
+#  e.g. あかつき, 暁, かい, 甲斐, 気, きらい, 嫌い, くせ, 癖, こと, 事, ごと, 毎, しだい, 次第, 
+#       順, せい, 所為, ついで, 序で, つもり, 積もり, 点, どころ, の, はず, 筈, はずみ, 弾み, 
+#       拍子, ふう, ふり, 振り, ほう, 方, 旨, もの, 物, 者, ゆえ, 故, ゆえん, 所以, わけ, 訳,
+#       わり, 割り, 割, ん-口語/, もん-口語/
+#名詞-非自立-一般
+#
+#  noun-affix-adverbial: noun affixes that can behave as adverbs.
+#  e.g. あいだ, 間, あげく, 挙げ句, あと, 後, 余り, 以外, 以降, 以後, 以上, 以前, 一方, うえ, 
+#       上, うち, 内, おり, 折り, かぎり, 限り, きり, っきり, 結果, ころ, 頃, さい, 際, 最中, さなか, 
+#       最中, じたい, 自体, たび, 度, ため, 為, つど, 都度, とおり, 通り, とき, 時, ところ, 所, 
+#       とたん, 途端, なか, 中, のち, 後, ばあい, 場合, 日, ぶん, 分, ほか, 他, まえ, 前, まま, 
+#       儘, 侭, みぎり, 矢先
+#名詞-非自立-副詞可能
+#
+#  noun-affix-aux: noun affixes treated as 助動詞 ("auxiliary verb") in school grammars 
+#  with the stem よう(だ) ("you(da)").
+#  e.g.  よう, やう, 様 (よう)
+#名詞-非自立-助動詞語幹
+#  
+#  noun-affix-adjective-base: noun affixes that can connect to the indeclinable
+#  connection form な (aux "da").
+#  e.g. みたい, ふう
+#名詞-非自立-形容動詞語幹
+#
+#  noun-special: special nouns where the sub-classification is undefined.
+#名詞-特殊
+#
+#  noun-special-aux: The そうだ ("souda") stem form that is used for reporting news, is 
+#  treated as 助動詞 ("auxiliary verb") in school grammars, and attaches to the base 
+#  form of inflectional words.
+#  e.g. そう
+#名詞-特殊-助動詞語幹
+#
+#  noun-suffix: noun suffixes where the sub-classification is undefined.
+#名詞-接尾
+#
+#  noun-suffix-misc: Of the nouns or stem forms of other parts of speech that connect 
+#  to ガル or タイ and can combine into compound nouns, words that cannot be classified into
+#  any of the other categories below. In general, this category is more inclusive than 
+#  接尾語 ("suffix") and is usually the last element in a compound noun.
+#  e.g. おき, かた, 方, 甲斐 (がい), がかり, ぎみ, 気味, ぐるみ, (~した) さ, 次第, 済 (ず) み,
+#       よう, (でき)っこ, 感, 観, 性, 学, 類, 面, 用
+#名詞-接尾-一般
+#
+#  noun-suffix-person: Suffixes that form nouns and attach to person names more often
+#  than other nouns.
+#  e.g. 君, 様, 著
+#名詞-接尾-人名
+#
+#  noun-suffix-place: Suffixes that form nouns and attach to place names more often 
+#  than other nouns.
+#  e.g. 町, 市, 県
+#名詞-接尾-地域
+#
+#  noun-suffix-verbal: Of the suffixes that attach to nouns and form nouns, those that 
+#  can appear before スル ("suru").
+#  e.g. 化, 視, 分け, 入り, 落ち, 買い
+#名詞-接尾-サ変接続
+#
+#  noun-suffix-aux: The stem form of そうだ (様態) that is used to indicate conditions, 
+#  is treated as 助動詞 ("auxiliary verb") in school grammars, and attaches to the 
+#  conjunctive form of inflectional words.
+#  e.g. そう
+#名詞-接尾-助動詞語幹
+#
+#  noun-suffix-adjective-base: Suffixes that attach to other nouns or the conjunctive 
+#  form of inflectional words and appear before the copula だ ("da").
+#  e.g. 的, げ, がち
+#名詞-接尾-形容動詞語幹
+#
+#  noun-suffix-adverbial: Suffixes that attach to other nouns and can behave as adverbs.
+#  e.g. 後 (ご), 以後, 以降, 以前, 前後, 中, 末, 上, 時 (じ)
+#名詞-接尾-副詞可能
+#
+#  noun-suffix-classifier: Suffixes that attach to numbers and form nouns. This category 
+#  is more inclusive than 助数詞 ("classifier") and includes common nouns that attach 
+#  to numbers.
+#  e.g. 個, つ, 本, 冊, パーセント, cm, kg, カ月, か国, 区画, 時間, 時半
+#名詞-接尾-助数詞
+#
+#  noun-suffix-special: Special suffixes that mainly attach to inflecting words.
+#  e.g. (楽し) さ, (考え) 方
+#名詞-接尾-特殊
+#
+#  noun-suffix-conjunctive: Nouns that behave like conjunctions and join two words 
+#  together.
+#  e.g. (日本) 対 (アメリカ), 対 (アメリカ), (3) 対 (5), (女優) 兼 (主婦)
+#名詞-接続詞的
+#
+#  noun-verbal_aux: Nouns that attach to the conjunctive particle て ("te") and are 
+#  semantically verb-like.
+#  e.g. ごらん, ご覧, 御覧, 頂戴
+#名詞-動詞非自立的
+#
+#  noun-quotation: text that cannot be segmented into words, proverbs, Chinese poetry, 
+#  dialects, English, etc. Currently, the only entry for 名詞 引用文字列 ("noun quotation") 
+#  is いわく ("iwaku").
+#名詞-引用文字列
+#
+#  noun-nai_adjective: Words that appear before the auxiliary verb ない ("nai") and
+#  behave like an adjective.
+#  e.g. 申し訳, 仕方, とんでも, 違い
+#名詞-ナイ形容詞語幹
+#
+#####
+#  prefix: unclassified prefixes
+#接頭詞
+#
+#  prefix-nominal: Prefixes that attach to nouns (including adjective stem forms) 
+#  excluding numerical expressions.
+#  e.g. お (水), 某 (氏), 同 (社), 故 (~氏), 高 (品質), お (見事), ご (立派)
+#接頭詞-名詞接続
+#
+#  prefix-verbal: Prefixes that attach to the imperative form of a verb or a verb
+#  in conjunctive form followed by なる/なさる/くださる.
+#  e.g. お (読みなさい), お (座り)
+#接頭詞-動詞接続
+#
+#  prefix-adjectival: Prefixes that attach to adjectives.
+#  e.g. お (寒いですねえ), バカ (でかい)
+#接頭詞-形容詞接続
+#
+#  prefix-numerical: Prefixes that attach to numerical expressions.
+#  e.g. 約, およそ, 毎時
+#接頭詞-数接続
+#
+#####
+#  verb: unclassified verbs
+#動詞
+#
+#  verb-main:
+#動詞-自立
+#
+#  verb-auxiliary:
+#動詞-非自立
+#
+#  verb-suffix:
+#動詞-接尾
+#
+#####
+#  adjective: unclassified adjectives
+#形容詞
+#
+#  adjective-main:
+#形容詞-自立
+#
+#  adjective-auxiliary:
+#形容詞-非自立
+#
+#  adjective-suffix:
+#形容詞-接尾
+#
+#####
+#  adverb: unclassified adverbs
+#副詞
+#
+#  adverb-misc: Words that can be segmented into one unit and where adnominal 
+#  modification is not possible.
+#  e.g. あいかわらず, 多分
+#副詞-一般
+#
+#  adverb-particle_conjunction: Adverbs that can be followed by の, は, に, 
+#  な, する, だ, etc.
+#  e.g. こんなに, そんなに, あんなに, なにか, なんでも
+#副詞-助詞類接続
+#
+#####
+#  adnominal: Words that only have noun-modifying forms.
+#  e.g. この, その, あの, どの, いわゆる, なんらかの, 何らかの, いろんな, こういう, そういう, ああいう, 
+#       どういう, こんな, そんな, あんな, どんな, 大きな, 小さな, おかしな, ほんの, たいした, 
+#       「(, も) さる (ことながら)」, 微々たる, 堂々たる, 単なる, いかなる, 我が」「同じ, 亡き
+#連体詞
+#
+#####
+#  conjunction: Conjunctions that can occur independently.
+#  e.g. が, けれども, そして, じゃあ, それどころか
+接続詞
+#
+#####
+#  particle: unclassified particles.
+助詞
+#
+#  particle-case: case particles where the subclassification is undefined.
+助詞-格助詞
+#
+#  particle-case-misc: Case particles.
+#  e.g. から, が, で, と, に, へ, より, を, の, にて
+助詞-格助詞-一般
+#
+#  particle-case-quote: the "to" that appears after nouns, a person’s speech, 
+#  quotation marks, expressions of decisions from a meeting, reasons, judgements,
+#  conjectures, etc.
+#  e.g. ( だ) と (述べた.), ( である) と (して執行猶予...)
+助詞-格助詞-引用
+#
+#  particle-case-compound: Compounds of particles and verbs that mainly behave 
+#  like case particles.
+#  e.g. という, といった, とかいう, として, とともに, と共に, でもって, にあたって, に当たって, に当って,
+#       にあたり, に当たり, に当り, に当たる, にあたる, において, に於いて,に於て, における, に於ける, 
+#       にかけ, にかけて, にかんし, に関し, にかんして, に関して, にかんする, に関する, に際し, 
+#       に際して, にしたがい, に従い, に従う, にしたがって, に従って, にたいし, に対し, にたいして, 
+#       に対して, にたいする, に対する, について, につき, につけ, につけて, につれ, につれて, にとって,
+#       にとり, にまつわる, によって, に依って, に因って, により, に依り, に因り, による, に依る, に因る, 
+#       にわたって, にわたる, をもって, を以って, を通じ, を通じて, を通して, をめぐって, をめぐり, をめぐる,
+#       って-口語/, ちゅう-関西弁「という」/, (何) ていう (人)-口語/, っていう-口語/, といふ, とかいふ
+助詞-格助詞-連語
+#
+#  particle-conjunctive:
+#  e.g. から, からには, が, けれど, けれども, けど, し, つつ, て, で, と, ところが, どころか, とも, ども, 
+#       ながら, なり, ので, のに, ば, ものの, や ( した), やいなや, (ころん) じゃ(いけない)-口語/, 
+#       (行っ) ちゃ(いけない)-口語/, (言っ) たって (しかたがない)-口語/, (それがなく)ったって (平気)-口語/
+助詞-接続助詞
+#
+#  particle-dependency:
+#  e.g. こそ, さえ, しか, すら, は, も, ぞ
+助詞-係助詞
+#
+#  particle-adverbial:
+#  e.g. がてら, かも, くらい, 位, ぐらい, しも, (学校) じゃ(これが流行っている)-口語/, 
+#       (それ)じゃあ (よくない)-口語/, ずつ, (私) なぞ, など, (私) なり (に), (先生) なんか (大嫌い)-口語/,
+#       (私) なんぞ, (先生) なんて (大嫌い)-口語/, のみ, だけ, (私) だって-口語/, だに, 
+#       (彼)ったら-口語/, (お茶) でも (いかが), 等 (とう), (今後) とも, ばかり, ばっか-口語/, ばっかり-口語/,
+#       ほど, 程, まで, 迄, (誰) も (が)([助詞-格助詞] および [助詞-係助詞] の前に位置する「も」)
+助詞-副助詞
+#
+#  particle-interjective: particles with interjective grammatical roles.
+#  e.g. (松島) や
+助詞-間投助詞
+#
+#  particle-coordinate:
+#  e.g. と, たり, だの, だり, とか, なり, や, やら
+助詞-並立助詞
+#
+#  particle-final:
+#  e.g. かい, かしら, さ, ぜ, (だ)っけ-口語/, (とまってる) で-方言/, な, ナ, なあ-口語/, ぞ, ね, ネ, 
+#       ねぇ-口語/, ねえ-口語/, ねん-方言/, の, のう-口語/, や, よ, ヨ, よぉ-口語/, わ, わい-口語/
+助詞-終助詞
+#
+#  particle-adverbial/conjunctive/final: The particle "ka" when unknown whether it is 
+#  adverbial, conjunctive, or sentence final. For example:
+#       (a) 「A か B か」. Ex:「(国内で運用する) か,(海外で運用する) か (.)」
+#       (b) Inside an adverb phrase. Ex:「(幸いという) か (, 死者はいなかった.)」
+#           「(祈りが届いたせい) か (, 試験に合格した.)」
+#       (c) 「かのように」. Ex:「(何もなかった) か (のように振る舞った.)」
+#  e.g. か
+助詞-副助詞/並立助詞/終助詞
+#
+#  particle-adnominalizer: The "no" that attaches to nouns and modifies 
+#  non-inflectional words.
+助詞-連体化
+#
+#  particle-adverbializer: The "ni" and "to" that appear following nouns and adverbs 
+#  that are giongo, giseigo, or gitaigo.
+#  e.g. に, と
+助詞-副詞化
+#
+#  particle-special: A particle that does not fit into one of the above classifications. 
+#  This includes particles that are used in Tanka, Haiku, and other poetry.
+#  e.g. かな, けむ, ( しただろう) に, (あんた) にゃ(わからん), (俺) ん (家)
+助詞-特殊
+#
+#####
+#  auxiliary-verb:
+助動詞
+#
+#####
+#  interjection: Greetings and other exclamations.
+#  e.g. おはよう, おはようございます, こんにちは, こんばんは, ありがとう, どうもありがとう, ありがとうございます, 
+#       いただきます, ごちそうさま, さよなら, さようなら, はい, いいえ, ごめん, ごめんなさい
+#感動詞
+#
+#####
+#  symbol: unclassified Symbols.
+記号
+#
+#  symbol-misc: A general symbol not in one of the categories below.
+#  e.g. [○◎@$〒→+]
+記号-一般
+#
+#  symbol-comma: Commas
+#  e.g. [,、]
+記号-読点
+#
+#  symbol-period: Periods and full stops.
+#  e.g. [..。]
+記号-句点
+#
+#  symbol-space: Full-width whitespace.
+記号-空白
+#
+#  symbol-open_bracket:
+#  e.g. [({‘“『【]
+記号-括弧開
+#
+#  symbol-close_bracket:
+#  e.g. [)}’”』」】]
+記号-括弧閉
+#
+#  symbol-alphabetic:
+#記号-アルファベット
+#
+#####
+#  other: unclassified other
+#その他
+#
+#  other-interjection: Words that are hard to classify as noun-suffixes or 
+#  sentence-final particles.
+#  e.g. (だ)ァ
+その他-間投
+#
+#####
+#  filler: Aizuchi that occurs during a conversation or sounds inserted as filler.
+#  e.g. あの, うんと, えと
+フィラー
+#
+#####
+#  non-verbal: non-verbal sound.
+非言語音
+#
+#####
+#  fragment:
+#語断片
+#
+#####
+#  unknown: unknown part of speech.
+#未知語
+#
+##### End of file

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ar.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ar.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ar.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ar.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,125 @@
+# This file was created by Jacques Savoy and is distributed under the BSD license.
+# See http://members.unine.ch/jacques.savoy/clef/index.html.
+# Also see http://www.opensource.org/licenses/bsd-license.html
+# Cleaned on October 11, 2009 (not normalized, so use before normalization)
+# This means that when modifying this list, you might need to add some 
+# redundant entries, for example containing forms with both أ and ا
+من
+ومن
+منها
+منه
+في
+وفي
+فيها
+فيه
+و
+ف
+ثم
+او
+أو
+ب
+بها
+به
+ا
+أ
+اى
+اي
+أي
+أى
+لا
+ولا
+الا
+ألا
+إلا
+لكن
+ما
+وما
+كما
+فما
+عن
+مع
+اذا
+إذا
+ان
+أن
+إن
+انها
+أنها
+إنها
+انه
+أنه
+إنه
+بان
+بأن
+فان
+فأن
+وان
+وأن
+وإن
+التى
+التي
+الذى
+الذي
+الذين
+الى
+الي
+إلى
+إلي
+على
+عليها
+عليه
+اما
+أما
+إما
+ايضا
+أيضا
+كل
+وكل
+لم
+ولم
+لن
+ولن
+هى
+هي
+هو
+وهى
+وهي
+وهو
+فهى
+فهي
+فهو
+انت
+أنت
+لك
+لها
+له
+هذه
+هذا
+تلك
+ذلك
+هناك
+كانت
+كان
+يكون
+تكون
+وكانت
+وكان
+غير
+بعض
+قد
+نحو
+بين
+بينما
+منذ
+ضمن
+حيث
+الان
+الآن
+خلال
+بعد
+قبل
+حتى
+عند
+عندما
+لدى
+جميع

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_bg.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_bg.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_bg.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_bg.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,193 @@
+# This file was created by Jacques Savoy and is distributed under the BSD license.
+# See http://members.unine.ch/jacques.savoy/clef/index.html.
+# Also see http://www.opensource.org/licenses/bsd-license.html
+а
+аз
+ако
+ала
+бе
+без
+беше
+би
+бил
+била
+били
+било
+близо
+бъдат
+бъде
+бяха
+в
+вас
+ваш
+ваша
+вероятно
+вече
+взема
+ви
+вие
+винаги
+все
+всеки
+всички
+всичко
+всяка
+във
+въпреки
+върху
+г
+ги
+главно
+го
+д
+да
+дали
+до
+докато
+докога
+дори
+досега
+доста
+е
+едва
+един
+ето
+за
+зад
+заедно
+заради
+засега
+затова
+защо
+защото
+и
+из
+или
+им
+има
+имат
+иска
+й
+каза
+как
+каква
+какво
+както
+какъв
+като
+кога
+когато
+което
+които
+кой
+който
+колко
+която
+къде
+където
+към
+ли
+м
+ме
+между
+мен
+ми
+мнозина
+мога
+могат
+може
+моля
+момента
+му
+н
+на
+над
+назад
+най
+направи
+напред
+например
+нас
+не
+него
+нея
+ни
+ние
+никой
+нито
+но
+някои
+някой
+няма
+обаче
+около
+освен
+особено
+от
+отгоре
+отново
+още
+пак
+по
+повече
+повечето
+под
+поне
+поради
+после
+почти
+прави
+пред
+преди
+през
+при
+пък
+първо
+с
+са
+само
+се
+сега
+си
+скоро
+след
+сме
+според
+сред
+срещу
+сте
+съм
+със
+също
+т
+тази
+така
+такива
+такъв
+там
+твой
+те
+тези
+ти
+тн
+то
+това
+тогава
+този
+той
+толкова
+точно
+трябва
+тук
+тъй
+тя
+тях
+у
+харесва
+ч
+че
+често
+чрез
+ще
+щом
+я

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ca.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ca.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ca.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ca.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,220 @@
+# Catalan stopwords from http://github.com/vcl/cue.language (Apache 2 Licensed)
+a
+abans
+ací
+ah
+així
+això
+al
+als
+aleshores
+algun
+alguna
+algunes
+alguns
+alhora
+allà
+allí
+allò
+altra
+altre
+altres
+amb
+ambdós
+ambdues
+apa
+aquell
+aquella
+aquelles
+aquells
+aquest
+aquesta
+aquestes
+aquests
+aquí
+baix
+cada
+cadascú
+cadascuna
+cadascunes
+cadascuns
+com
+contra
+d'un
+d'una
+d'unes
+d'uns
+dalt
+de
+del
+dels
+des
+després
+dins
+dintre
+donat
+doncs
+durant
+e
+eh
+el
+els
+em
+en
+encara
+ens
+entre
+érem
+eren
+éreu
+es
+és
+esta
+està
+estàvem
+estaven
+estàveu
+esteu
+et
+etc
+ets
+fins
+fora
+gairebé
+ha
+han
+has
+havia
+he
+hem
+heu
+hi 
+ho
+i
+igual
+iguals
+ja
+l'hi
+la
+les
+li
+li'n
+llavors
+m'he
+ma
+mal
+malgrat
+mateix
+mateixa
+mateixes
+mateixos
+me
+mentre
+més
+meu
+meus
+meva
+meves
+molt
+molta
+moltes
+molts
+mon
+mons
+n'he
+n'hi
+ne
+ni
+no
+nogensmenys
+només
+nosaltres
+nostra
+nostre
+nostres
+o
+oh
+oi
+on
+pas
+pel
+pels
+per
+però
+perquè
+poc 
+poca
+pocs
+poques
+potser
+propi
+qual
+quals
+quan
+quant 
+que
+què
+quelcom
+qui
+quin
+quina
+quines
+quins
+s'ha
+s'han
+sa
+semblant
+semblants
+ses
+seu 
+seus
+seva
+seva
+seves
+si
+sobre
+sobretot
+sóc
+solament
+sols
+son 
+són
+sons 
+sota
+sou
+t'ha
+t'han
+t'he
+ta
+tal
+també
+tampoc
+tan
+tant
+tanta
+tantes
+teu
+teus
+teva
+teves
+ton
+tons
+tot
+tota
+totes
+tots
+un
+una
+unes
+uns
+us
+va
+vaig
+vam
+van
+vas
+veu
+vosaltres
+vostra
+vostre
+vostres

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ckb.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ckb.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ckb.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_ckb.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,136 @@
+# set of Kurdish stopwords
+# note these have been normalized with our scheme (e represented with U+06D5, etc)
+# constructed from:
+# * Fig 5 of "Building A Test Collection For Sorani Kurdish" (Esmaili et al)
+# * "Sorani Kurdish: A Reference Grammar with selected readings" (Thackston)
+# * Corpus-based analysis of 77M word Sorani collection: wikipedia, news, blogs, etc
+
+# and
+و
+# which
+کە
+# of
+ی
+# made/did
+کرد
+# that/which
+ئەوەی
+# on/head
+سەر
+# two
+دوو
+# also
+هەروەها
+# from/that
+لەو
+# makes/does
+دەکات
+# some
+چەند
+# every
+هەر
+
+# demonstratives
+# that
+ئەو
+# this
+ئەم
+
+# personal pronouns
+# I
+من
+# we
+ئێمە
+# you
+تۆ
+# you
+ئێوە
+# he/she/it
+ئەو
+# they
+ئەوان
+
+# prepositions
+# to/with/by
+بە
+پێ
+# without
+بەبێ
+# along with/while/during
+بەدەم
+# in the opinion of
+بەلای
+# according to
+بەپێی
+# before
+بەرلە
+# in the direction of
+بەرەوی
+# in front of/toward
+بەرەوە
+# before/in the face of
+بەردەم
+# without
+بێ
+# except for
+بێجگە
+# for
+بۆ
+# on/in
+دە
+تێ
+# with
+دەگەڵ
+# after
+دوای
+# except for/aside from
+جگە
+# in/from
+لە
+لێ
+# in front of/before/because of
+لەبەر
+# between/among
+لەبەینی
+# concerning/about
+لەبابەت
+# concerning
+لەبارەی
+# instead of
+لەباتی
+# beside
+لەبن
+# instead of
+لەبرێتی
+# behind
+لەدەم
+# with/together with
+لەگەڵ
+# by
+لەلایەن
+# within
+لەناو
+# between/among
+لەنێو
+# for the sake of
+لەپێناوی
+# with respect to
+لەرەوی
+# by means of/for
+لەرێ
+# for the sake of
+لەرێگا
+# on/on top of/according to
+لەسەر
+# under
+لەژێر
+# between/among
+ناو
+# between/among
+نێوان
+# after
+پاش
+# before
+پێش
+# like
+وەک

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_cz.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_cz.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_cz.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_cz.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,172 @@
+a
+s
+k
+o
+i
+u
+v
+z
+dnes
+cz
+tímto
+budeš
+budem
+byli
+jseš
+můj
+svým
+ta
+tomto
+tohle
+tuto
+tyto
+jej
+zda
+proč
+máte
+tato
+kam
+tohoto
+kdo
+kteří
+mi
+nám
+tom
+tomuto
+mít
+nic
+proto
+kterou
+byla
+toho
+protože
+asi
+ho
+naši
+napište
+re
+což
+tím
+takže
+svých
+její
+svými
+jste
+aj
+tu
+tedy
+teto
+bylo
+kde
+ke
+pravé
+ji
+nad
+nejsou
+či
+pod
+téma
+mezi
+přes
+ty
+pak
+vám
+ani
+když
+však
+neg
+jsem
+tento
+článku
+články
+aby
+jsme
+před
+pta
+jejich
+byl
+ještě
+až
+bez
+také
+pouze
+první
+vaše
+která
+nás
+nový
+tipy
+pokud
+může
+strana
+jeho
+své
+jiné
+zprávy
+nové
+není
+vás
+jen
+podle
+zde
+už
+být
+více
+bude
+již
+než
+který
+by
+které
+co
+nebo
+ten
+tak
+má
+při
+od
+po
+jsou
+jak
+další
+ale
+si
+se
+ve
+to
+jako
+za
+zpět
+ze
+do
+pro
+je
+na
+atd
+atp
+jakmile
+přičemž
+já
+on
+ona
+ono
+oni
+ony
+my
+vy
+jí
+ji
+mě
+mne
+jemu
+tomu
+těm
+těmu
+němu
+němuž
+jehož
+jíž
+jelikož
+jež
+jakož
+načež

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_da.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_da.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_da.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_da.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,110 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/danish/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+ |
+ | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
+
+ | A Danish stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+ | This is a ranked list (commonest to rarest) of stopwords derived from
+ | a large text sample.
+
+
+og           | and
+i            | in
+jeg          | I
+det          | that (dem. pronoun)/it (pers. pronoun)
+at           | that (in front of a sentence)/to (with infinitive)
+en           | a/an
+den          | it (pers. pronoun)/that (dem. pronoun)
+til          | to/at/for/until/against/by/of/into, more
+er           | present tense of "to be"
+som          | who, as
+på           | on/upon/in/on/at/to/after/of/with/for, on
+de           | they
+med          | with/by/in, along
+han          | he
+af           | of/by/from/off/for/in/with/on, off
+for          | at/for/to/from/by/of/ago, in front/before, because
+ikke         | not
+der          | who/which, there/those
+var          | past tense of "to be"
+mig          | me/myself
+sig          | oneself/himself/herself/itself/themselves
+men          | but
+et           | a/an/one, one (number), someone/somebody/one
+har          | present tense of "to have"
+om           | round/about/for/in/a, about/around/down, if
+vi           | we
+min          | my
+havde        | past tense of "to have"
+ham          | him
+hun          | she
+nu           | now
+over         | over/above/across/by/beyond/past/on/about, over/past
+da           | then, when/as/since
+fra          | from/off/since, off, since
+du           | you
+ud           | out
+sin          | his/her/its/one's
+dem          | them
+os           | us/ourselves
+op           | up
+man          | you/one
+hans         | his
+hvor         | where
+eller        | or
+hvad         | what
+skal         | must/shall etc.
+selv         | myself/yourself/herself/ourselves etc., even
+her          | here
+alle         | all/everyone/everybody etc.
+vil          | will (verb)
+blev         | past tense of "to stay/to remain/to get/to become"
+kunne        | could
+ind          | in
+når          | when
+være         | infinitive of "to be"
+dog          | however/yet/after all
+noget        | something
+ville        | would
+jo           | you know/you see (adv), yes
+deres        | their/theirs
+efter        | after/behind/according to/for/by/from, later/afterwards
+ned          | down
+skulle       | should
+denne        | this
+end          | than
+dette        | this
+mit          | my/mine
+også         | also
+under        | under/beneath/below/during, below/underneath
+have         | have
+dig          | you
+anden        | other
+hende        | her
+mine         | my
+alt          | everything
+meget        | much/very, plenty of
+sit          | his, her, its, one's
+sine         | his, her, its, one's
+vor          | our
+mod          | against
+disse        | these
+hvis         | if
+din          | your/yours
+nogle        | some
+hos          | by/at
+blive        | be/become
+mange        | many
+ad           | by/through
+bliver       | present tense of "to be/to become"
+hendes       | her/hers
+været        | be
+thi          | for (conj)
+jer          | you
+sådan        | such, like this/like that
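
Editor's note (not part of the commit): the header of stopwords_da.txt above describes the Snowball stop word format, in which comments begin with a vertical bar and each stop word sits at the start of a line. The following is a minimal Python sketch of a reader for that format, for illustration only. It is not the loader Solr uses; the function name parse_snowball_stopwords and the sample text are hypothetical, and accepting several whitespace-separated words on one line is an assumption about the format. Diff "+" prefixes are assumed to have been stripped already.

```python
def parse_snowball_stopwords(text):
    """Return the stop words found in Snowball-format text, in file order."""
    words = []
    for line in text.splitlines():
        # Everything from the first vertical bar onward is a comment.
        entry = line.split("|", 1)[0].strip()
        if entry:
            # Assumption: a line may carry several whitespace-separated words.
            words.extend(entry.split())
    return words


# Hypothetical sample in the same layout as the Danish list above.
sample = """\
 | A Danish stop word list. Comments begin with vertical bar.

og           | and
i            | in
jeg          | I
"""

print(parse_snowball_stopwords(sample))  # ['og', 'i', 'jeg']
```

Reading one of these files with a plain one-word-per-line loader would instead treat the glosses after the bar as part of each entry, which is why the header asks for format="snowball" on StopFilterFactory.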

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_de.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_de.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_de.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_de.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,294 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/german/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+ |
+ | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
+
+ | A German stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+ | The number of forms in this list is reduced significantly by passing it
+ | through the German stemmer.
+
+
+aber           |  but
+
+alle           |  all
+allem
+allen
+aller
+alles
+
+als            |  than, as
+also           |  so
+am             |  an + dem
+an             |  at
+
+ander          |  other
+andere
+anderem
+anderen
+anderer
+anderes
+anderm
+andern
+anderr
+anders
+
+auch           |  also
+auf            |  on
+aus            |  out of
+bei            |  by
+bin            |  am
+bis            |  until
+bist           |  art
+da             |  there
+damit          |  with it
+dann           |  then
+
+der            |  the
+den
+des
+dem
+die
+das
+
+daß            |  that
+
+derselbe       |  the same
+derselben
+denselben
+desselben
+demselben
+dieselbe
+dieselben
+dasselbe
+
+dazu           |  to that
+
+dein           |  thy
+deine
+deinem
+deinen
+deiner
+deines
+
+denn           |  because
+
+derer          |  of those
+dessen         |  of him
+
+dich           |  thee
+dir            |  to thee
+du             |  thou
+
+dies           |  this
+diese
+diesem
+diesen
+dieser
+dieses
+
+
+doch           |  (several meanings)
+dort           |  (over) there
+
+
+durch          |  through
+
+ein            |  a
+eine
+einem
+einen
+einer
+eines
+
+einig          |  some
+einige
+einigem
+einigen
+einiger
+einiges
+
+einmal         |  once
+
+er             |  he
+ihn            |  him
+ihm            |  to him
+
+es             |  it
+etwas          |  something
+
+euer           |  your
+eure
+eurem
+euren
+eurer
+eures
+
+für            |  for
+gegen          |  towards
+gewesen        |  p.p. of sein
+hab            |  have
+habe           |  have
+haben          |  have
+hat            |  has
+hatte          |  had
+hatten         |  had
+hier           |  here
+hin            |  there
+hinter         |  behind
+
+ich            |  I
+mich           |  me
+mir            |  to me
+
+
+ihr            |  you, to her
+ihre
+ihrem
+ihren
+ihrer
+ihres
+euch           |  to you
+
+im             |  in + dem
+in             |  in
+indem          |  while
+ins            |  in + das
+ist            |  is
+
+jede           |  each, every
+jedem
+jeden
+jeder
+jedes
+
+jene           |  that
+jenem
+jenen
+jener
+jenes
+
+jetzt          |  now
+kann           |  can
+
+kein           |  no
+keine
+keinem
+keinen
+keiner
+keines
+
+können         |  can
+könnte         |  could
+machen         |  do
+man            |  one
+
+manche         |  some, many a
+manchem
+manchen
+mancher
+manches
+
+mein           |  my
+meine
+meinem
+meinen
+meiner
+meines
+
+mit            |  with
+muss           |  must
+musste         |  had to
+nach           |  to(wards)
+nicht          |  not
+nichts         |  nothing
+noch           |  still, yet
+nun            |  now
+nur            |  only
+ob             |  whether
+oder           |  or
+ohne           |  without
+sehr           |  very
+
+sein           |  his
+seine
+seinem
+seinen
+seiner
+seines
+
+selbst         |  self
+sich           |  herself
+
+sie            |  they, she
+ihnen          |  to them
+
+sind           |  are
+so             |  so
+
+solche         |  such
+solchem
+solchen
+solcher
+solches
+
+soll           |  shall
+sollte         |  should
+sondern        |  but
+sonst          |  else
+über           |  over
+um             |  about, around
+und            |  and
+
+uns            |  us
+unse
+unsem
+unsen
+unser
+unses
+
+unter          |  under
+viel           |  much
+vom            |  von + dem
+von            |  from
+vor            |  before
+während        |  while
+war            |  was
+waren          |  were
+warst          |  wast
+was            |  what
+weg            |  away, off
+weil           |  because
+weiter         |  further
+
+welche         |  which
+welchem
+welchen
+welcher
+welches
+
+wenn           |  when
+werde          |  will
+werden         |  will
+wie            |  how
+wieder         |  again
+will           |  want
+wir            |  we
+wird           |  will
+wirst          |  willst
+wo             |  where
+wollen         |  want
+wollte         |  wanted
+würde          |  would
+würden         |  would
+zu             |  to
+zum            |  zu + dem
+zur            |  zu + der
+zwar           |  indeed
+zwischen       |  between
+

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_el.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_el.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_el.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_el.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,78 @@
+# Lucene Greek Stopwords list
+# Note: by default this file is used after GreekLowerCaseFilter,
+# so when modifying this file use 'σ' instead of 'ς' 
+ο
+η
+το
+οι
+τα
+του
+τησ
+των
+τον
+την
+και 
+κι
+κ
+ειμαι
+εισαι
+ειναι
+ειμαστε
+ειστε
+στο
+στον
+στη
+στην
+μα
+αλλα
+απο
+για
+προσ
+με
+σε
+ωσ
+παρα
+αντι
+κατα
+μετα
+θα
+να
+δε
+δεν
+μη
+μην
+επι
+ενω
+εαν
+αν
+τοτε
+που
+πωσ
+ποιοσ
+ποια
+ποιο
+ποιοι
+ποιεσ
+ποιων
+ποιουσ
+αυτοσ
+αυτη
+αυτο
+αυτοι
+αυτων
+αυτουσ
+αυτεσ
+αυτα
+εκεινοσ
+εκεινη
+εκεινο
+εκεινοι
+εκεινεσ
+εκεινα
+εκεινων
+εκεινουσ
+οπωσ
+ομωσ
+ισωσ
+οσο
+οτι
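
Reviewer note on stopwords_el.txt: the header of the file says the list is applied after GreekLowerCaseFilter, so entries must use 'σ' where ordinary spelling would use the final form 'ς'. The sketch below shows one way to bring a candidate word into that shape before adding it to the list. The function name normalize_greek_stopword is hypothetical. Stripping accent marks is an assumption based on the entries above, which carry none; it is not something the file header states.

```python
import unicodedata


def normalize_greek_stopword(word):
    """Return `word` in the form used by the entries of stopwords_el.txt.

    Hypothetical helper, not part of Chukwa, Solr or Lucene.
    """
    # Decompose so that accent marks become separate combining characters.
    decomposed = unicodedata.normalize("NFD", word.strip())
    # Assumption: drop accent marks, since the list entries have none.
    unaccented = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Lowercase, then replace final sigma with regular sigma, as the
    # file header asks.
    return unaccented.lower().replace("ς", "σ")


if __name__ == "__main__":
    for candidate in ["της", "Είναι", "ΠΡΟΣ"]:
        print(candidate, "->", normalize_greek_stopword(candidate))
```

Doing the replacement after lowercasing keeps the result the same whether or not the lowercasing step itself produces a final sigma.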

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_en.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_en.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_en.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_en.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# a couple of test stopwords to test that the words are really being
+# configured from this file:
+stopworda
+stopwordb
+
+# Standard english stop words taken from Lucene's StopAnalyzer
+a
+an
+and
+are
+as
+at
+be
+but
+by
+for
+if
+in
+into
+is
+it
+no
+not
+of
+on
+or
+such
+that
+the
+their
+then
+there
+these
+they
+this
+to
+was
+will
+with

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_es.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_es.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_es.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_es.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,356 @@
+ | From svn.tartarus.org/snowball/trunk/website/algorithms/spanish/stop.txt
+ | This file is distributed under the BSD License.
+ | See http://snowball.tartarus.org/license.php
+ | Also see http://www.opensource.org/licenses/bsd-license.html
+ |  - Encoding was converted to UTF-8.
+ |  - This notice was added.
+ |
+ | NOTE: To use this file with StopFilterFactory, you must specify format="snowball"
+
+ | A Spanish stop word list. Comments begin with vertical bar. Each stop
+ | word is at the start of a line.
+
+
+ | The following is a ranked list (commonest to rarest) of stopwords
+ | deriving from a large sample of text.
+
+ | Extra words have been added at the end.
+
+de             |  from, of
+la             |  the, her
+que            |  who, that
+el             |  the
+en             |  in
+y              |  and
+a              |  to
+los            |  the, them
+del            |  de + el
+se             |  himself, from him etc
+las            |  the, them
+por            |  for, by, etc
+un             |  a
+para           |  for
+con            |  with
+no             |  no
+una            |  a
+su             |  his, her
+al             |  a + el
+  | es         from SER
+lo             |  him
+como           |  how
+más            |  more
+pero           |  pero
+sus            |  su plural
+le             |  to him, her
+ya             |  already
+o              |  or
+  | fue        from SER
+este           |  this
+  | ha         from HABER
+sí             |  himself etc
+porque         |  because
+esta           |  this
+  | son        from SER
+entre          |  between
+  | está     from ESTAR
+cuando         |  when
+muy            |  very
+sin            |  without
+sobre          |  on
+  | ser        from SER
+  | tiene      from TENER
+también        |  also
+me             |  me
+hasta          |  until
+hay            |  there is/are
+donde          |  where
+  | han        from HABER
+quien          |  whom, that
+  | están      from ESTAR
+  | estado     from ESTAR
+desde          |  from
+todo           |  all
+nos            |  us
+durante        |  during
+  | estados    from ESTAR
+todos          |  all
+uno            |  a
+les            |  to them
+ni             |  nor
+contra         |  against
+otros          |  other
+  | fueron     from SER
+ese            |  that
+eso            |  that
+  | había      from HABER
+ante           |  before
+ellos          |  they
+e              |  and (variant of y)
+esto           |  this
+mí             |  me
+antes          |  before
+algunos        |  some
+qué            |  what?
+unos           |  a
+yo             |  I
+otro           |  other
+otras          |  other
+otra           |  other
+él             |  he
+tanto          |  so much, many
+esa            |  that
+estos          |  these
+mucho          |  much, many
+quienes        |  who
+nada           |  nothing
+muchos         |  many
+cual           |  who
+  | sea        from SER
+poco           |  few
+ella           |  she
+estar          |  to be
+  | haber      from HABER
+estas          |  these
+  | estaba     from ESTAR
+  | estamos    from ESTAR
+algunas        |  some
+algo           |  something
+nosotros       |  we
+
+      | other forms
+
+mi             |  me
+mis            |  mi plural
+tú             |  thou
+te             |  thee
+ti             |  thee
+tu             |  thy
+tus            |  tu plural
+ellas          |  they
+nosotras       |  we
+vosotros       |  you
+vosotras       |  you
+os             |  you
+mío            |  mine
+mía            |
+míos           |
+mías           |
+tuyo           |  thine
+tuya           |
+tuyos          |
+tuyas          |
+suyo           |  his, hers, theirs
+suya           |
+suyos          |
+suyas          |
+nuestro        |  ours
+nuestra        |
+nuestros       |
+nuestras       |
+vuestro        |  yours
+vuestra        |
+vuestros       |
+vuestras       |
+esos           |  those
+esas           |  those
+
+               | forms of estar, to be (not including the infinitive):
+estoy
+estás
+está
+estamos
+estáis
+están
+esté
+estés
+estemos
+estéis
+estén
+estaré
+estarás
+estará
+estaremos
+estaréis
+estarán
+estaría
+estarías
+estaríamos
+estaríais
+estarían
+estaba
+estabas
+estábamos
+estabais
+estaban
+estuve
+estuviste
+estuvo
+estuvimos
+estuvisteis
+estuvieron
+estuviera
+estuvieras
+estuviéramos
+estuvierais
+estuvieran
+estuviese
+estuvieses
+estuviésemos
+estuvieseis
+estuviesen
+estando
+estado
+estada
+estados
+estadas
+estad
+
+               | forms of haber, to have (not including the infinitive):
+he
+has
+ha
+hemos
+habéis
+han
+haya
+hayas
+hayamos
+hayáis
+hayan
+habré
+habrás
+habrá
+habremos
+habréis
+habrán
+habría
+habrías
+habríamos
+habríais
+habrían
+había
+habías
+habíamos
+habíais
+habían
+hube
+hubiste
+hubo
+hubimos
+hubisteis
+hubieron
+hubiera
+hubieras
+hubiéramos
+hubierais
+hubieran
+hubiese
+hubieses
+hubiésemos
+hubieseis
+hubiesen
+habiendo
+habido
+habida
+habidos
+habidas
+
+               | forms of ser, to be (not including the infinitive):
+soy
+eres
+es
+somos
+sois
+son
+sea
+seas
+seamos
+seáis
+sean
+seré
+serás
+será
+seremos
+seréis
+serán
+sería
+serías
+seríamos
+seríais
+serían
+era
+eras
+éramos
+erais
+eran
+fui
+fuiste
+fue
+fuimos
+fuisteis
+fueron
+fuera
+fueras
+fuéramos
+fuerais
+fueran
+fuese
+fueses
+fuésemos
+fueseis
+fuesen
+siendo
+sido
+  |  sed also means 'thirst'
+
+               | forms of tener, to have (not including the infinitive):
+tengo
+tienes
+tiene
+tenemos
+tenéis
+tienen
+tenga
+tengas
+tengamos
+tengáis
+tengan
+tendré
+tendrás
+tendrá
+tendremos
+tendréis
+tendrán
+tendría
+tendrías
+tendríamos
+tendríais
+tendrían
+tenía
+tenías
+teníamos
+teníais
+tenían
+tuve
+tuviste
+tuvo
+tuvimos
+tuvisteis
+tuvieron
+tuviera
+tuvieras
+tuviéramos
+tuvierais
+tuvieran
+tuviese
+tuvieses
+tuviésemos
+tuvieseis
+tuviesen
+teniendo
+tenido
+tenida
+tenidos
+tenidas
+tened
+
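
Reviewer note on stopwords_es.txt: the header of the file describes the Snowball layout: comments begin with a vertical bar and each stop word sits at the start of a line. The sketch below reads text in that layout so the effective word list can be checked by hand. The function name parse_snowball_stopwords is hypothetical, and this is not how Solr itself loads the file (per the header, that is done by StopFilterFactory with format="snowball"). The input is expected to be the file content, without the leading '+' of the diff.

```python
def parse_snowball_stopwords(text):
    """Return the stop words found in Snowball-format text, in file order.

    Hypothetical helper for inspection only; not part of Chukwa or Solr.
    """
    words = []
    for line in text.splitlines():
        # Everything from the first vertical bar onward is a comment.
        content = line.split("|", 1)[0]
        # What remains holds the stop word; blank lines contribute nothing.
        words.extend(content.split())
    return words


if __name__ == "__main__":
    sample = "\n".join([
        " | A Spanish stop word list.",
        "de             |  from, of",
        "  | es         from SER",
        "estoy",
        "",
        "mía            |",
    ])
    print(parse_snowball_stopwords(sample))
```

Lines such as "  | es         from SER" yield nothing because the word appears only inside the comment; those verbs are picked up later from the conjugation tables.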

Added: chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_eu.txt
URL: http://svn.apache.org/viewvc/chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_eu.txt?rev=1614808&view=auto
==============================================================================
--- chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_eu.txt (added)
+++ chukwa/trunk/contrib/solr/logs/conf/lang/stopwords_eu.txt Thu Jul 31 04:04:59 2014
@@ -0,0 +1,99 @@
+# example set of basque stopwords
+al
+anitz
+arabera
+asko
+baina
+bat
+batean
+batek
+bati
+batzuei
+batzuek
+batzuetan
+batzuk
+bera
+beraiek
+berau
+berauek
+bere
+berori
+beroriek
+beste
+bezala
+da
+dago
+dira
+ditu
+du
+dute
+edo
+egin
+ere
+eta
+eurak
+ez
+gainera
+gu
+gutxi
+guzti
+haiei
+haiek
+haietan
+hainbeste
+hala
+han
+handik
+hango
+hara
+hari
+hark
+hartan
+hau
+hauei
+hauek
+hauetan
+hemen
+hemendik
+hemengo
+hi
+hona
+honek
+honela
+honetan
+honi
+hor
+hori
+horiei
+horiek
+horietan
+horko
+horra
+horrek
+horrela
+horretan
+horri
+hortik
+hura
+izan
+ni
+noiz
+nola
+non
+nondik
+nongo
+nor
+nora
+ze
+zein
+zen
+zenbait
+zenbat
+zer
+zergatik
+ziren
+zituen
+zu
+zuek
+zuen
+zuten


