lucene-commits mailing list archives

From va...@apache.org
Subject svn commit: r732916 [10/14] - in /lucene/pylucene/trunk: ./ java/ java/org/ java/org/osafoundation/ java/org/osafoundation/lucene/ java/org/osafoundation/lucene/analysis/ java/org/osafoundation/lucene/queryParser/ java/org/osafoundation/lucene/search/ ...
Date Fri, 09 Jan 2009 03:28:41 GMT
Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerAlternativeTest.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerAlternativeTest.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerAlternativeTest.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerAlternativeTest.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,56 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from unittest import TestCase
+from lucene import StopAnalyzer
+
+from lia.analysis.AnalyzerUtils import AnalyzerUtils
+from lia.analysis.stopanalyzer.StopAnalyzerFlawed import StopAnalyzerFlawed
+from lia.analysis.stopanalyzer.StopAnalyzer2 import StopAnalyzer2
+
+
+class StopAnalyzerAlternativeTest(TestCase):
+
+    def testStopAnalyzer2(self):
+
+        tokens = AnalyzerUtils.tokensFromAnalysis(StopAnalyzer2(),
+                                                  "The quick brown...")
+        AnalyzerUtils.assertTokensEqual(self, tokens, ["quick", "brown"])
+
+    def testStopAnalyzerFlawed(self):
+
+        tokens = AnalyzerUtils.tokensFromAnalysis(StopAnalyzerFlawed(),
+                                                  "The quick brown...")
+        self.assertEqual("the", tokens[0].termText())
+
+
+    #
+    # Illustrates that "the" is not removed, although it is lowercased
+    #
+
+    def main(cls):
+
+        AnalyzerUtils.displayTokens(StopAnalyzerFlawed(),
+                                    "The quick brown...")
+
+    main = classmethod(main)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerAlternativeTest.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerAlternativeTest.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerFlawed.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerFlawed.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerFlawed.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerFlawed.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,49 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lucene import \
+     LetterTokenizer, LowerCaseFilter, StopAnalyzer, StopFilter
+
+#
+# An Analyzer extension
+#
+# Stop words are not necessarily removed, because of the filtering order
+#
+
+class StopAnalyzerFlawed(object):
+
+    def __init__(self, stopWords=None):
+
+        if stopWords is None:
+            self.stopWords = StopAnalyzer.ENGLISH_STOP_WORDS
+        else:
+            self.stopWords = stopWords
+
+    #
+    # Ordering mistake here
+    #
+
+    def tokenStream(self, fieldName, reader):
+
+        return LowerCaseFilter(StopFilter(LetterTokenizer(reader),
+                                          self.stopWords))

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerFlawed.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerFlawed.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain
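
The flaw illustrated above is purely one of filter ordering: StopFilter runs directly on the raw LetterTokenizer output, so the capitalized "The" never matches the lowercase stop list, and only afterwards does LowerCaseFilter normalize it. A minimal sketch of the corrected ordering follows; StopAnalyzerFixed is a hypothetical name, and this is presumably what the StopAnalyzer2 class used by the test above does.

    from lucene import LetterTokenizer, LowerCaseFilter, StopAnalyzer, StopFilter

    class StopAnalyzerFixed(object):

        def __init__(self, stopWords=None):

            if stopWords is None:
                self.stopWords = StopAnalyzer.ENGLISH_STOP_WORDS
            else:
                self.stopWords = stopWords

        def tokenStream(self, fieldName, reader):

            # lowercase first, then filter: "The" becomes "the" and is removed
            return StopFilter(LowerCaseFilter(LetterTokenizer(reader)),
                              self.stopWords)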

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerTest.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerTest.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerTest.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerTest.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,54 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from unittest import TestCase
+from lucene import StopAnalyzer
+from lia.analysis.AnalyzerUtils import AnalyzerUtils
+
+
+class StopAnalyzerTest(TestCase):
+
+    def setUp(self):
+
+        self.stopAnalyzer = StopAnalyzer()
+
+    def testHoles(self):
+        
+        expected = ["one", "enough"]
+
+        AnalyzerUtils.assertTokensEqual(self,
+                                        self.tokensFrom("one is not enough"),
+                                        expected)
+        AnalyzerUtils.assertTokensEqual(self,
+                                        self.tokensFrom("one is enough"),
+                                        expected)
+        AnalyzerUtils.assertTokensEqual(self,
+                                        self.tokensFrom("one enough"),
+                                        expected)
+        AnalyzerUtils.assertTokensEqual(self,
+                                        self.tokensFrom("one but not enough"),
+                                        expected)
+
+    def tokensFrom(self, text):
+
+        return AnalyzerUtils.tokensFromAnalysis(self.stopAnalyzer, text)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerTest.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/StopAnalyzerTest.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# stopanalyzer package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/stopanalyzer/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/MockSynonymEngine.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/MockSynonymEngine.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/MockSynonymEngine.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/MockSynonymEngine.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,35 @@
+# ====================================================================
+# Copyright (c) 2004-2005 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+
+class MockSynonymEngine(object):
+
+    synonyms = { "quick": ["fast", "speedy"],
+                 "jumps": ["leaps", "hops"],
+                 "over": ["above"],
+                 "lazy": ["apathetic", "sluggish"],
+                 "dogs": ["canines", "pooches"] }
+
+    def getSynonyms(self, s):
+
+        return self.synonyms.get(s, None)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/MockSynonymEngine.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/MockSynonymEngine.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzer.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzer.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzer.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzer.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,46 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lucene import \
+     LowerCaseFilter, StopFilter, \
+     StandardAnalyzer, StandardTokenizer, StandardFilter, PythonAnalyzer
+
+from lia.analysis.synonym.SynonymFilter import SynonymFilter
+
+#
+# An Analyzer extension
+#
+
+class SynonymAnalyzer(PythonAnalyzer):
+
+    def __init__(self, engine):
+
+        super(SynonymAnalyzer, self).__init__()
+        self.engine = engine
+
+    def tokenStream(self, fieldName, reader):
+
+        tokenStream = LowerCaseFilter(StandardFilter(StandardTokenizer(reader)))
+        tokenStream = StopFilter(tokenStream, StandardAnalyzer.STOP_WORDS)
+        
+        return SynonymFilter(tokenStream, self.engine)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzer.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzer.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerTest.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerTest.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerTest.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerTest.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,101 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from unittest import TestCase
+
+from lucene import \
+     StandardAnalyzer, RAMDirectory, IndexWriter, Term, Document, Field, \
+     IndexSearcher, TermQuery, PhraseQuery, QueryParser
+
+from lia.analysis.AnalyzerUtils import AnalyzerUtils
+from lia.analysis.synonym.SynonymAnalyzer import SynonymAnalyzer
+from lia.analysis.synonym.MockSynonymEngine import MockSynonymEngine
+
+
+class SynonymAnalyzerTest(TestCase):
+
+    synonymAnalyzer = SynonymAnalyzer(MockSynonymEngine())
+
+    def setUp(self):
+
+        self.directory = RAMDirectory()
+        writer = IndexWriter(self.directory, self.synonymAnalyzer, True)
+
+        doc = Document()
+        doc.add(Field("content",
+                      "The quick brown fox jumps over the lazy dogs",
+                      Field.Store.YES, Field.Index.TOKENIZED))
+        writer.addDocument(doc)
+        writer.close()
+
+        self.searcher = IndexSearcher(self.directory)
+
+    def tearDown(self):
+
+        self.searcher.close()
+
+    def testJumps(self):
+
+        tokens = AnalyzerUtils.tokensFromAnalysis(self.synonymAnalyzer, "jumps")
+        AnalyzerUtils.assertTokensEqual(self, tokens,
+                                        ["jumps", "hops", "leaps"])
+
+        # ensure synonyms are in the same position as the original
+        self.assertEqual(1, tokens[0].getPositionIncrement(), "jumps")
+        self.assertEqual(0, tokens[1].getPositionIncrement(), "hops")
+        self.assertEqual(0, tokens[2].getPositionIncrement(), "leaps")
+
+    def testSearchByAPI(self):
+
+        tq = TermQuery(Term("content", "hops"))
+        hits = self.searcher.search(tq)
+        self.assertEqual(1, len(hits))
+
+        pq = PhraseQuery()
+        pq.add(Term("content", "fox"))
+        pq.add(Term("content", "hops"))
+        hits = self.searcher.search(pq)
+        self.assertEquals(1, len(hits))
+
+    def testWithQueryParser(self):
+
+        query = QueryParser("content",
+                            self.synonymAnalyzer).parse('"fox jumps"')
+        hits = self.searcher.search(query)
+        # in Lucene 1.9, position increments are no longer ignored
+        self.assertEqual(1, len(hits), "!!!! what?!")
+
+        query = QueryParser("content", StandardAnalyzer()).parse('"fox jumps"')
+        hits = self.searcher.search(query)
+        self.assertEqual(1, len(hits), "*whew*")
+
+    def main(cls):
+
+        query = QueryParser("content", cls.synonymAnalyzer).parse('"fox jumps"')
+        print "\"fox jumps\" parses to ", query.toString("content")
+
+        print "From AnalyzerUtils.tokensFromAnalysis: "
+        AnalyzerUtils.displayTokens(cls.synonymAnalyzer, "\"fox jumps\"")
+        print ''
+        
+    main = classmethod(main)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerTest.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerTest.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerViewer.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerViewer.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerViewer.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerViewer.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,42 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+
+from lia.analysis.AnalyzerUtils import AnalyzerUtils
+from lia.analysis.synonym.WordNetSynonymEngine import WordNetSynonymEngine
+from lia.analysis.synonym.SynonymAnalyzer import SynonymAnalyzer
+
+
+class SynonymAnalyzerViewer(object):
+
+    def main(cls, argv):
+
+        engine = WordNetSynonymEngine(argv[1])
+
+        text = "The quick brown fox jumps over the lazy dogs"
+        AnalyzerUtils.displayTokensWithPositions(SynonymAnalyzer(engine), text)
+
+        text = "\"Oh, we get both kinds - country AND western!\" - B.B."
+        AnalyzerUtils.displayTokensWithPositions(SynonymAnalyzer(engine), text)
+
+    main = classmethod(main)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerViewer.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymAnalyzerViewer.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymFilter.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymFilter.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymFilter.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymFilter.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,64 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lucene import Token, PythonTokenFilter
+
+#
+# A TokenFilter extension
+#
+
+class SynonymFilter(PythonTokenFilter):
+
+    TOKEN_TYPE_SYNONYM = "SYNONYM"
+    
+    def __init__(self, tokenStream, engine):
+
+        super(SynonymFilter, self).__init__(tokenStream)
+
+        self.synonymStack = []
+        self.input = tokenStream
+        self.engine = engine
+
+    def next(self):
+
+        if len(self.synonymStack) > 0:
+            return self.synonymStack.pop()
+
+        # at the end of the input this raises StopIteration, which is
+        # cleared so that null is returned to the java side
+        token = self.input.next()
+        self.addAliasesToStack(token)
+
+        return token
+
+    def addAliasesToStack(self, token):
+
+        synonyms = self.engine.getSynonyms(token.termText())
+
+        if synonyms is None:
+            return
+
+        for synonym in synonyms:
+            synToken = Token(synonym, token.startOffset(), token.endOffset(),
+                             self.TOKEN_TYPE_SYNONYM)
+            synToken.setPositionIncrement(0)
+            self.synonymStack.append(synToken)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymFilter.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/SynonymFilter.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain
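
Each synonym pushed onto the stack above is emitted with a position increment of 0, so it occupies the same token position as the word it aliases; that is why the PhraseQuery for "fox" followed by "hops" in SynonymAnalyzerTest matches a document indexed as "...fox jumps...". A small sketch that prints the stacked positions, reusing the SynonymAnalyzer, MockSynonymEngine and AnalyzerUtils helpers added in this revision (it assumes the JVM has already been initialized by whatever drives the samples):

    from lia.analysis.AnalyzerUtils import AnalyzerUtils
    from lia.analysis.synonym.SynonymAnalyzer import SynonymAnalyzer
    from lia.analysis.synonym.MockSynonymEngine import MockSynonymEngine

    analyzer = SynonymAnalyzer(MockSynonymEngine())

    # "jumps", "hops" and "leaps" are all reported at the same position
    AnalyzerUtils.displayTokensWithPositions(
        analyzer, "The quick brown fox jumps over the lazy dogs")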

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/WordNetSynonymEngine.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/WordNetSynonymEngine.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/WordNetSynonymEngine.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/WordNetSynonymEngine.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,44 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lucene import \
+     Document, Term, IndexSearcher, TermQuery, FSDirectory, RAMDirectory, Hit
+
+
+class WordNetSynonymEngine(object):
+
+    def __init__(self, indexDir):
+
+        self.directory = RAMDirectory(indexDir)
+        self.searcher = IndexSearcher(self.directory)
+
+    def getSynonyms(self, word):
+
+        synList = []
+
+        for hit in self.searcher.search(TermQuery(Term("word", word))):
+            doc = Hit.cast_(hit).getDocument()
+            for value in doc.getValues("syn"):
+                synList.append(value)
+
+        return synList

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/WordNetSynonymEngine.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/WordNetSynonymEngine.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# synonym package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/analysis/synonym/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/LiaTestCase.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/common/LiaTestCase.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/common/LiaTestCase.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/common/LiaTestCase.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,70 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+import os
+
+from unittest import TestCase
+from lucene import FSDirectory, Document, System, SimpleDateFormat, Hit
+
+
+class LiaTestCase(TestCase):
+
+    def __init__(self, *args):
+
+        super(LiaTestCase, self).__init__(*args)
+        self.indexDir = System.getProperty("index.dir")
+
+    def setUp(self):
+
+        self.directory = FSDirectory.getDirectory(self.indexDir, False)
+
+    def tearDown(self):
+
+        self.directory.close()
+
+    #
+    # For troubleshooting
+    #
+    def dumpHits(self, hits):
+
+        if not hits:
+            print "No hits"
+        else:
+            for hit in hits:
+                hit = Hit.cast_(hit)
+                print "%s: %s" %(hit.getScore(),
+                                 hit.getDocument().get('title'))
+
+    def assertHitsIncludeTitle(self, hits, title):
+
+        for hit in hits:
+            doc = Hit.cast_(hit).getDocument()
+            if title == doc.get("title"):
+                self.assert_(True)
+                return
+
+        self.fail("title '%s' not found" %(title))
+
+    def parseDate(self, s):
+
+        return SimpleDateFormat("yyyy-MM-dd").parse(s)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/LiaTestCase.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/LiaTestCase.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/TestDataDocumentHandler.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/common/TestDataDocumentHandler.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/common/TestDataDocumentHandler.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/common/TestDataDocumentHandler.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,112 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+import os
+
+from lucene import \
+    Document, Field, IndexWriter, StandardAnalyzer, DateField, \
+    SimpleDateFormat
+
+# date culled from LuceneInAction.zip archive from Manning site
+samplesModified = SimpleDateFormat('yyyy-MM-dd').parse('2004-12-02')
+
+
+class TestDataDocumentHandler(object):
+
+    def createIndex(cls, dataDir, indexDir, useCompound):
+
+        writer = IndexWriter(indexDir, StandardAnalyzer(), True)
+        writer.setUseCompoundFile(useCompound)
+
+        for dir, dirnames, filenames in os.walk(dataDir):
+            for filename in filenames:
+                if filename.endswith('.properties'):
+                    cls.indexFile(writer, os.path.join(dir, filename), dataDir)
+
+        writer.optimize()
+        writer.close()
+
+    def indexFile(cls, writer, path, baseDir):
+        
+        input = file(path)
+        props = {}
+        while True:
+            line = input.readline().strip()
+            if not line:
+                break
+            name, value = line.split('=', 1)
+            props[name] = value.decode('unicode-escape')
+        input.close()
+
+        doc = Document()
+
+        # category comes from relative path below the base directory
+        category = os.path.dirname(path)[len(baseDir):]
+        if os.path.sep != '/':
+            category = category.replace(os.path.sep, '/')
+
+        isbn = props['isbn']
+        title = props['title']
+        author = props['author']
+        url = props['url']
+        subject = props['subject']
+        pubmonth = props['pubmonth']
+
+        print title.encode('utf-8')
+        print author.encode('utf-8')
+        print subject.encode('utf-8')
+        print category.encode('utf-8')
+        print "---------"
+
+        doc.add(Field("isbn", isbn,
+                      Field.Store.YES, Field.Index.UN_TOKENIZED))
+        doc.add(Field("category", category,
+                      Field.Store.YES, Field.Index.UN_TOKENIZED))
+        doc.add(Field("title", title,
+                      Field.Store.YES, Field.Index.TOKENIZED))
+
+        # split multiple authors into unique field instances
+        authors = author.split(',')
+        for a in authors:
+            doc.add(Field("author", a,
+                          Field.Store.YES, Field.Index.UN_TOKENIZED))
+
+        doc.add(Field("url", url,
+                      Field.Store.YES, Field.Index.NO))
+        doc.add(Field("subject", subject,
+                      Field.Store.NO, Field.Index.TOKENIZED,
+                      Field.TermVector.YES))
+        doc.add(Field("pubmonth", pubmonth,
+                      Field.Store.YES, Field.Index.UN_TOKENIZED))
+        doc.add(Field("contents", ' '.join([title, subject, author]),
+                      Field.Store.NO, Field.Index.TOKENIZED))
+
+        doc.add(Field("path", path,
+                      Field.Store.YES, Field.Index.UN_TOKENIZED))
+        doc.add(Field("modified", DateField.dateToString(samplesModified),
+                      Field.Store.YES, Field.Index.UN_TOKENIZED))
+
+        writer.addDocument(doc)
+
+    createIndex = classmethod(createIndex)
+    indexFile = classmethod(indexFile)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/TestDataDocumentHandler.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/TestDataDocumentHandler.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/common/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/common/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/common/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# common package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/common/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# extsearch package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/MockSpecialsAccessor.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/MockSpecialsAccessor.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/MockSpecialsAccessor.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/MockSpecialsAccessor.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,32 @@
+# ====================================================================
+# Copyright (c) 2004-2005 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+class MockSpecialsAccessor(object):
+
+    def __init__(self, isbns):
+
+        self._isbns = isbns
+
+    def isbns(self):
+
+        return self._isbns

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/MockSpecialsAccessor.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/MockSpecialsAccessor.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilter.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilter.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilter.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilter.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,54 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lucene import IndexReader, Term, BitSet, PythonFilter, JArray
+
+#
+# A Filter extension. JArray buffers work around the lack of support for
+# returning values through TermDocs.read()'s array arguments.
+#
+class SpecialsFilter(PythonFilter):
+
+    def __init__(self, accessor):
+        
+        super(SpecialsFilter, self).__init__()
+        self.accessor = accessor
+
+    def bits(self, reader):
+
+        bits = BitSet(reader.maxDoc())
+        isbns = self.accessor.isbns()
+
+        for isbn in isbns:
+            if isbn is not None:
+                termDocs = reader.termDocs(Term("isbn", isbn))
+                docs = JArray(int)(1)
+                freq = JArray(int)(1)
+                if termDocs.read(docs, freq) == 1:
+                    bits.set(docs[0])
+
+        return bits
+
+    def __str__(self):
+
+        return "SpecialsFilter"

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilter.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilter.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain
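
The JArray buffers in bits() stand in for the int[] output parameters of TermDocs.read(int[] docs, int[] freqs) on the Java side: the call fills the one-slot arrays and returns the number of entries read, after which the document id is picked up by indexing into the buffer. A minimal sketch of just that pattern, assuming termDocs was obtained from reader.termDocs(...) as in the filter above:

    from lucene import JArray

    docs = JArray(int)(1)     # one-slot buffer for the matching document id
    freqs = JArray(int)(1)    # one-slot buffer for the term frequency
    if termDocs.read(docs, freqs) == 1:
        docId = docs[0]       # the single document containing this isbn term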

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilterTest.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilterTest.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilterTest.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilterTest.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,70 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lia.common.LiaTestCase import LiaTestCase
+from lia.extsearch.filters.MockSpecialsAccessor import MockSpecialsAccessor
+from lia.extsearch.filters.SpecialsFilter import SpecialsFilter
+
+from lucene import \
+     WildcardQuery, FilteredQuery, TermQuery, BooleanQuery, RangeQuery, \
+     IndexSearcher, Term, BooleanClause
+
+
+class SpecialsFilterTest(LiaTestCase):
+
+    def setUp(self):
+
+        super(SpecialsFilterTest, self).setUp()
+
+        self.allBooks = RangeQuery(Term("pubmonth", "190001"),
+                                   Term("pubmonth", "200512"), True)
+        self.searcher = IndexSearcher(self.directory)
+
+    def testCustomFilter(self):
+
+        isbns = ["0060812451", "0465026567"]
+        accessor = MockSpecialsAccessor(isbns)
+        
+        filter = SpecialsFilter(accessor)
+        hits = self.searcher.search(self.allBooks, filter)
+        self.assertEquals(len(isbns), len(hits), "the specials")
+
+    def testFilteredQuery(self):
+        
+        isbns = ["0854402624"]  # Steiner
+
+        accessor = MockSpecialsAccessor(isbns)
+        filter = SpecialsFilter(accessor)
+
+        educationBooks = WildcardQuery(Term("category", "*education*"))
+        edBooksOnSpecial = FilteredQuery(educationBooks, filter)
+
+        logoBooks = TermQuery(Term("subject", "logo"))
+
+        logoOrEdBooks = BooleanQuery()
+        logoOrEdBooks.add(logoBooks, BooleanClause.Occur.SHOULD)
+        logoOrEdBooks.add(edBooksOnSpecial, BooleanClause.Occur.SHOULD)
+
+        hits = self.searcher.search(logoOrEdBooks)
+        print logoOrEdBooks
+        self.assertEqual(2, len(hits), "Papert and Steiner")

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilterTest.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilterTest.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# filters package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/BookLinkCollector.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/BookLinkCollector.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/BookLinkCollector.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/BookLinkCollector.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,46 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lucene import Document, IndexSearcher, PythonHitCollector
+
+#
+# A HitCollector extension
+#
+
+class BookLinkCollector(PythonHitCollector):
+
+    def __init__(self, searcher):
+        super(BookLinkCollector, self).__init__()
+        self.searcher = searcher
+        self.documents = {}
+
+    def collect(self, id, score):
+
+        doc = self.searcher.doc(id)
+        self.documents[doc["url"]] = doc["title"]
+
+        print "%s: %s" %(doc['title'], score)
+
+    def getLinks(self):
+
+        return self.documents.copy()

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/BookLinkCollector.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/BookLinkCollector.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/HitCollectorTest.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/HitCollectorTest.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/HitCollectorTest.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/HitCollectorTest.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,46 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lia.common.LiaTestCase import LiaTestCase
+from lia.extsearch.hitcollector.BookLinkCollector import BookLinkCollector
+
+from lucene import IndexSearcher, TermQuery, Term
+
+class HitCollectorTest(LiaTestCase):
+
+    def testCollecting(self):
+
+        query = TermQuery(Term("contents", "junit"))
+        searcher = IndexSearcher(self.directory)
+
+        collector = BookLinkCollector(searcher)
+        searcher.search(query, collector)
+
+        links = collector.getLinks()
+        self.assertEqual("Java Development with Ant",
+                         links["http://www.manning.com/antbook"])
+
+        hits = searcher.search(query)
+        self.dumpHits(hits)
+
+        searcher.close()

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/HitCollectorTest.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/HitCollectorTest.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# hitcollector package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/hitcollector/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/AdvancedQueryParserTest.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/AdvancedQueryParserTest.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/AdvancedQueryParserTest.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/AdvancedQueryParserTest.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,111 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from unittest import TestCase
+
+from lucene import \
+     WhitespaceAnalyzer, IndexSearcher, RAMDirectory, \
+     Document, Field, IndexWriter, TermQuery, SpanNearQuery
+
+from lia.extsearch.queryparser.NumberUtils import NumberUtils
+from lia.extsearch.queryparser.CustomQueryParser import \
+    MultiFieldCustomQueryParser, CustomQueryParser
+
+
+class AdvancedQueryParserTest(TestCase):
+
+    def setUp(self):
+
+        self.analyzer = WhitespaceAnalyzer()
+        self.directory = RAMDirectory()
+
+        writer = IndexWriter(self.directory, self.analyzer, True)
+
+        for i in xrange(1, 501):
+            doc = Document()
+            doc.add(Field("id", NumberUtils.pad(i),
+                          Field.Store.YES, Field.Index.UN_TOKENIZED))
+            writer.addDocument(doc)
+
+        writer.close()
+
+    def testCustomQueryParser(self):
+
+        parser = CustomQueryParser("field", self.analyzer)
+
+        try:
+            parser.parse("a?t")
+        except:
+            pass  # expected: the parser rejects wildcard queries
+        else:
+            self.fail("Wildcard queries should not be allowed")
+
+        try:
+            parser.parse("xunit~")
+        except:
+            pass  # expected: the parser rejects fuzzy queries
+        else:
+            self.fail("Fuzzy queries should not be allowed")
+
+    def testCustomMultiFieldQueryParser(self):
+
+        parser = MultiFieldCustomQueryParser(["field"], self.analyzer)
+
+        try:
+            parser.parse("a?t")
+        except:
+            pass  # expected: the parser rejects wildcard queries
+        else:
+            self.fail("Wildcard queries should not be allowed")
+
+        try:
+            parser.parse("xunit~")
+        except:
+            pass  # expected: the parser rejects fuzzy queries
+        else:
+            self.fail("Fuzzy queries should not be allowed")
+
+    def testIdRangeQuery(self):
+
+        parser = CustomQueryParser("field", self.analyzer)
+
+        query = parser.parse("id:[37 TO 346]")
+        self.assertEqual("id:[0000000037 TO 0000000346]",
+                         query.toString("field"), "padded")
+
+        searcher = IndexSearcher(self.directory)
+        hits = searcher.search(query)
+        self.assertEqual(310, hits.length())
+
+        print parser.parse("special:[term TO *]")
+        print parser.parse("special:[* TO term]")
+
+    def testPhraseQuery(self):
+
+        parser = CustomQueryParser("field", self.analyzer)
+
+        query = parser.parse("singleTerm")
+        self.assert_(TermQuery.instance_(query), "TermQuery")
+
+        query = parser.parse("\"a phrase\"")
+        self.assert_(SpanNearQuery.instance_(query), "SpanNearQuery")

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/AdvancedQueryParserTest.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/AdvancedQueryParserTest.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain
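
The numbers in testIdRangeQuery above can be checked by hand: the inclusive range id:[37 TO 346] covers 346 - 37 + 1 = 310 of the 500 indexed documents, and the bounds are rewritten through NumberUtils.pad so the assertion sees the ten-digit forms. A quick sketch, plain Python with no index involved:

    from lia.extsearch.queryparser.NumberUtils import NumberUtils

    print 346 - 37 + 1                         # 310, the expected hit count
    print NumberUtils.pad(37), NumberUtils.pad(346)
    # 0000000037 0000000346 -- the padded bounds asserted in the test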

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/CustomQueryParser.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/CustomQueryParser.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/CustomQueryParser.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/CustomQueryParser.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,167 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from lucene import \
+    PythonQueryParser, PythonMultiFieldQueryParser, \
+    PhraseQuery, RangeQuery, SpanNearQuery, SpanTermQuery, \
+    Term
+
+from lia.extsearch.queryparser.NumberUtils import NumberUtils
+
+#
+# A QueryParser extension
+#
+
+class CustomQueryParser(PythonQueryParser):
+
+    def __init__(self, field, analyzer):
+        super(CustomQueryParser, self).__init__(field, analyzer)
+
+    def getFuzzyQuery(self, field, termText, minSimilarity):
+        raise AssertionError, "Fuzzy queries not allowed"
+
+    def getWildcardQuery(self, field, termText):
+        raise AssertionError, "Wildcard queries not allowed"
+
+    #
+    # Special handling for the "id" field, pads each part
+    # to match how it was indexed.
+    #
+    def getRangeQuery(self, field, part1, part2, inclusive):
+
+        if field == "id":
+
+            num1 = int(part1)
+            num2 = int(part2)
+
+            return RangeQuery(Term(field, NumberUtils.pad(num1)),
+                              Term(field, NumberUtils.pad(num2)),
+                              inclusive)
+
+        if field == "special":
+            print part1, "->", part2
+
+            if part1 == '*':
+                t1 = None
+            else:
+                t1 = Term("field", part1)
+
+            if part2 == '*':
+                t2 = None
+            else:
+                t2 = Term("field", part2)
+
+            return RangeQuery(t1, t2, inclusive)
+
+        return super(CustomQueryParser,
+                     self).getRangeQuery(field, part1, part2, inclusive)
+
+    #
+    # Replace PhraseQuery with SpanNearQuery to force in-order
+    # phrase matching rather than reverse.
+    #
+    def getFieldQuery(self, field, queryText, slop=None):
+
+        if slop is None:
+            return super(CustomQueryParser,
+                         self).getFieldQuery(field, queryText)
+
+        # let QueryParser's implementation do the analysis
+        orig = super(CustomQueryParser,
+                     self).getFieldQuery(field, queryText, slop)
+
+        if not PhraseQuery.instance_(orig):
+            return orig
+
+        pq = PhraseQuery.cast_(orig)
+        clauses = [SpanTermQuery(term) for term in pq.getTerms()]
+
+        return SpanNearQuery(clauses, slop, True)
+
+
+
+class MultiFieldCustomQueryParser(PythonMultiFieldQueryParser):
+
+    def __init__(self, fields, analyzer):
+        super(MultiFieldCustomQueryParser, self).__init__(fields, analyzer)
+
+    def getFuzzyQuery(self, super, field, termText, minSimilarity):
+        raise AssertionError, "Fuzzy queries not allowed"
+
+    def getWildcardQuery(self, super, field, termText):
+        raise AssertionError, "Wildcard queries not allowed"
+
+    #
+    # Special handling for the "id" field, pads each part
+    # to match how it was indexed.
+    #
+    def getRangeQuery(self, field, part1, part2, inclusive):
+
+        if field == "id":
+
+            num1 = int(part1)
+            num2 = int(part2)
+
+            return RangeQuery(Term(field, NumberUtils.pad(num1)),
+                              Term(field, NumberUtils.pad(num2)),
+                              inclusive)
+
+        if field == "special":
+            print part1, "->", part2
+
+            if part1 == '*':
+                t1 = None
+            else:
+                t1 = Term("field", part1)
+
+            if part2 == '*':
+                t2 = None
+            else:
+                t2 = Term("field", part2)
+
+            return RangeQuery(t1, t2, inclusive)
+
+        return super(MultiFieldCustomQueryParser,
+                     self).getRangeQuery(field, part1, part2, inclusive)
+
+    #
+    # Replace PhraseQuery with SpanNearQuery to force in-order
+    # phrase matching rather than reverse.
+    #
+    def getFieldQuery(self, field, queryText, slop=None):
+
+        if slop is None:
+            return super(MultiFieldCustomQueryParser,
+                         self).getFieldQuery(field, queryText)
+
+        # let QueryParser's implementation do the analysis
+        orig = super(MultiFieldCustomQueryParser,
+                     self).getFieldQuery(field, queryText, slop)
+
+        if not PhraseQuery.instance_(orig):
+            return orig
+
+        pq = PhraseQuery.cast_(orig)
+        clauses = [SpanTermQuery(term) for term in pq.getTerms()]
+
+        return SpanNearQuery(clauses, slop, True)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/CustomQueryParser.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/CustomQueryParser.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain
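
The getFieldQuery override above is the interesting piece: a sloppy PhraseQuery can match its terms in either order, while a SpanNearQuery built with inOrder=True cannot, which is why phrase queries are rebuilt as span queries. A small sketch of the same transformation done by hand -- the field name and terms here are made up for illustration, and the printed forms are approximate:

    from lucene import PhraseQuery, SpanTermQuery, SpanNearQuery, Term

    pq = PhraseQuery()
    pq.add(Term("f", "quick"))
    pq.add(Term("f", "brown"))
    pq.setSlop(2)

    # Mirrors CustomQueryParser.getFieldQuery: one SpanTermQuery per term,
    # combined into an in-order SpanNearQuery with the same slop.
    clauses = [SpanTermQuery(term) for term in pq.getTerms()]
    snq = SpanNearQuery(clauses, pq.getSlop(), True)

    print pq      # roughly: f:"quick brown"~2
    print snq     # roughly: spanNear([f:quick, f:brown], 2, true)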

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/NumberUtils.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/NumberUtils.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/NumberUtils.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/NumberUtils.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,29 @@
+# ====================================================================
+# Copyright (c) 2004-2005 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+class NumberUtils(object):
+
+    def pad(cls, n):
+        return "%0.10d" % n
+
+    pad = classmethod(pad)

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/NumberUtils.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/NumberUtils.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain
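
A one-line check of what the padding buys: unpadded numbers compare as strings, so "100" sorts before "37"; padding every id to ten digits makes string order agree with numeric order, which is what lets the range queries over the "id" field work.

    from lia.extsearch.queryparser.NumberUtils import NumberUtils

    print sorted(["37", "100", "346"])
    # ['100', '346', '37'] -- lexicographic order, not numeric
    print sorted([NumberUtils.pad(n) for n in (37, 100, 346)])
    # ['0000000037', '0000000100', '0000000346'] -- numeric order restored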

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# queryparser package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/queryparser/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceComparatorSource.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceComparatorSource.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceComparatorSource.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceComparatorSource.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,96 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from math import sqrt
+from lucene import SortField, Term, IndexReader, \
+    PythonSortComparatorSource, PythonScoreDocComparator, Double
+
+#
+# A SortComparatorSource implementation
+#
+
+class DistanceComparatorSource(PythonSortComparatorSource):
+
+    def __init__(self, x, y):
+        super(DistanceComparatorSource, self).__init__()
+        self.x = x
+        self.y = y
+
+    def newComparator(self, reader, fieldName):
+
+        #
+        # A ScoreDocComparator implementation
+        # 
+        class DistanceScoreDocLookupComparator(PythonScoreDocComparator):
+
+            def __init__(self, reader, fieldName, x, y):
+                super(DistanceScoreDocLookupComparator, self).__init__()
+                enumerator = reader.terms(Term(fieldName, ""))
+                self.distances = distances = [0.0] * reader.numDocs()
+
+                if reader.numDocs() > 0:
+                    termDocs = reader.termDocs()
+                    try:
+                        while True:
+                            term = enumerator.term()
+                            if term is None:
+                                raise RuntimeError, "no terms in field %s" %(fieldName)
+                            if term.field() != fieldName:
+                                break
+                            
+                            termDocs.seek(enumerator)
+                            while termDocs.next():
+                                xy = term.text().split(',')
+                                deltax = int(xy[0]) - x
+                                deltay = int(xy[1]) - y
+
+                                distances[termDocs.doc()] = sqrt(deltax ** 2 +
+                                                                 deltay ** 2)
+            
+                            if not enumerator.next():
+                                break
+                    finally:
+                        termDocs.close()
+
+            def compare(self, i, j):
+
+                if self.distances[i.doc] < self.distances[j.doc]:
+                    return -1
+                if self.distances[i.doc] > self.distances[j.doc]:
+                    return 1
+                return 0
+
+            def sortValue(self, i):
+
+                return Double(self.distances[i.doc])
+
+            def sortType(self):
+
+                return SortField.FLOAT
+
+        return DistanceScoreDocLookupComparator(reader, fieldName,
+                                                self.x, self.y)
+
+    def __str__(self):
+
+        return "Distance from ("+x+","+y+")"

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceComparatorSource.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceComparatorSource.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain
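
Per document, the comparator's work boils down to parsing the stored "x,y" term text and taking a straight-line distance. A plain-Python sketch of just that step (the helper name is made up for illustration):

    from math import sqrt

    def distance_from(term_text, x, y):
        # Same parsing as in newComparator: "x,y" -> ints -> Euclidean distance
        tx, ty = [int(v) for v in term_text.split(',')]
        return sqrt((tx - x) ** 2 + (ty - y) ** 2)

    print distance_from("9,6", 10, 10)   # sqrt(17), roughly 4.12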

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceSortingTest.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceSortingTest.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceSortingTest.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceSortingTest.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1,95 @@
+# ====================================================================
+# Copyright (c) 2004-2007 Open Source Applications Foundation.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions: 
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software. 
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+# ====================================================================
+#
+
+from math import sqrt
+from unittest import TestCase
+
+from lucene import \
+     WhitespaceAnalyzer, IndexSearcher, Term, TermQuery, RAMDirectory, \
+     Document, Field, IndexWriter, Sort, SortField, FieldDoc, Double
+
+from lia.extsearch.sorting.DistanceComparatorSource import \
+     DistanceComparatorSource
+
+
+class DistanceSortingTest(TestCase):
+
+    def setUp(self):
+
+        self.directory = RAMDirectory()
+        writer = IndexWriter(self.directory, WhitespaceAnalyzer(), True)
+
+        self.addPoint(writer, "El Charro", "restaurant", 1, 2)
+        self.addPoint(writer, "Cafe Poca Cosa", "restaurant", 5, 9)
+        self.addPoint(writer, "Los Betos", "restaurant", 9, 6)
+        self.addPoint(writer, "Nico's Taco Shop", "restaurant", 3, 8)
+
+        writer.close()
+
+        self.searcher = IndexSearcher(self.directory)
+        self.query = TermQuery(Term("type", "restaurant"))
+
+    def addPoint(self, writer, name, type, x, y):
+
+        doc = Document()
+        doc.add(Field("name", name, Field.Store.YES, Field.Index.UN_TOKENIZED))
+        doc.add(Field("type", type, Field.Store.YES, Field.Index.UN_TOKENIZED))
+        doc.add(Field("location", "%d,%d" %(x, y),
+                      Field.Store.YES, Field.Index.UN_TOKENIZED))
+        writer.addDocument(doc)
+
+    def testNearestRestaurantToHome(self):
+
+        sort = Sort(SortField("location", DistanceComparatorSource(0, 0)))
+
+        hits = self.searcher.search(self.query, sort)
+        self.assertEqual("El Charro", hits.doc(0).get("name"), "closest")
+        self.assertEqual("Los Betos", hits.doc(3).get("name"), "furthest")
+
+    def testNearestRestaurantToWork(self):
+
+        sort = Sort(SortField("location", DistanceComparatorSource(10, 10)))
+
+        docs = self.searcher.search(self.query, None, 3, sort)
+        self.assertEqual(4, docs.totalHits)
+        self.assertEqual(3, len(docs.scoreDocs))
+
+        fieldDoc = FieldDoc.cast_(docs.scoreDocs[0])
+        distance = Double.cast_(fieldDoc.fields[0]).doubleValue()
+        self.assertEqual(sqrt(17), distance,
+                     "(10,10) -> (9,6) = sqrt(17)")
+
+        document = self.searcher.doc(fieldDoc.doc)
+        self.assertEqual("Los Betos", document["name"])
+
+        self.dumpDocs(sort, docs)
+
+    def dumpDocs(self, sort, docs):
+
+        print "Sorted by:", sort
+
+        for scoreDoc in docs.scoreDocs:
+            fieldDoc = FieldDoc.cast_(scoreDoc)
+            distance = Double.cast_(fieldDoc.fields[0]).doubleValue()
+            doc = self.searcher.doc(fieldDoc.doc)
+            print "  %(name)s @ (%(location)s) ->" %doc, distance

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceSortingTest.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/DistanceSortingTest.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# sorting package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/extsearch/sorting/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/__init__.py
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/__init__.py?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/__init__.py (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/__init__.py Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+# handlingtypes package

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/__init__.py
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/__init__.py
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/HTML.html
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/HTML.html?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/HTML.html (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/HTML.html Thu Jan  8 19:28:33 2009
@@ -0,0 +1,11 @@
+<html>
+  <head>
+    <title>
+      Laptop power supplies are available in First Class only
+    </title>
+  </head>
+  <body>
+    <h1>Code, Write, Fly</h1>
+    This chapter is being written 11,000 meters above Newfoundland.
+  </body>
+</html>

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/HTML.html
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/HTML.html
------------------------------------------------------------------------------
    svn:mime-type = text/html

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/MSWord.doc
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/MSWord.doc?rev=732916&view=auto
==============================================================================
Binary file - no diff available.

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/MSWord.doc
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PDF.pdf
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PDF.pdf?rev=732916&view=auto
==============================================================================
Binary file - no diff available.

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PDF.pdf
------------------------------------------------------------------------------
    svn:mime-type = application/pdf

Added: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PlainText.txt
URL: http://svn.apache.org/viewvc/lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PlainText.txt?rev=732916&view=auto
==============================================================================
--- lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PlainText.txt (added)
+++ lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PlainText.txt Thu Jan  8 19:28:33 2009
@@ -0,0 +1 @@
+This is the content of the Plain Text document
\ No newline at end of file

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PlainText.txt
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/pylucene/trunk/samples/LuceneInAction/lia/handlingtypes/data/PlainText.txt
------------------------------------------------------------------------------
    svn:mime-type = text/plain


