lucene-dev mailing list archives

From "JAYABAALAN V (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality
Date Sat, 09 Oct 2010 08:00:30 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919469#action_12919469 ]

JAYABAALAN V edited comment on SOLR-2010 at 10/9/10 4:00 AM:
-------------------------------------------------------------

Thanks for your direction.

Based on your input, I tried this on the trunk, using the downloaded
SOLR-2010_shardRecombineCollations_999521.patch.

I looked at this path:
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/common/org/apache/solr/common/params/SpellingParams.java

But there is a problem with SpellingParams.java in this version: it does not look correctly
updated. Mainly, this concerns the three final String values "maxCollations", "maxCollationTries",
and "collateExtendedResults" that are implemented, and the file looks like Solr v1.3 in its history.

Please let me know the path of the updated version for downloading.

      was (Author: vjayabaalan):
    Thanks for your direction.

Based on your input, I tried this on the trunk, using the downloaded
SOLR-2010_shardRecombineCollations_999521.patch.

I looked at this path:
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/common/org/apache/solr/common/params/SpellingParams.java

But there is no problem with SpellingParams.java in this version. It does not look updated.
Mainly, this concerns the three final String values "maxCollations", "maxCollationTries",
and "collateExtendedResults" that are implemented, and the file looks like Solr v1.3 in its history.

Please let me know the path of the updated version for downloading.
  
> Improvements to SpellCheckComponent Collate functionality
> ---------------------------------------------------------
>
>                 Key: SOLR-2010
>                 URL: https://issues.apache.org/jira/browse/SOLR-2010
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java, spellchecker
>    Affects Versions: 1.4.1
>         Environment: Tested against trunk revision 966633
>            Reporter: James Dyer
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: SOLR-2010.patch, SOLR-2010.patch, SOLR-2010.patch, SOLR-2010.patch,
SOLR-2010.txt, SOLR-2010_141.patch, SOLR-2010_shardRecombineCollations_993538.patch, SOLR-2010_shardRecombineCollations_999521.patch,
SOLR-2010_shardSearchHandler_993538.patch, SOLR-2010_shardSearchHandler_999521.patch
>
>
> Improvements to SpellCheckComponent Collate functionality
> Our project requires a better Spell Check Collator.  I'm contributing this as a patch
to get suggestions for improvements and in case there is a broader need for these features.
> 1. Only return collations that are guaranteed to result in hits if re-queried (applying
original fq params also).  This is especially helpful when there is more than one correction
per query.  The 1.4 behavior does not verify that a particular combination will actually return
hits.
> 2. Provide the option to get multiple collation suggestions
> 3. Provide extended collation results including the # of hits re-querying will return
and a breakdown of each misspelled word and its correction.
> This patch is similar to what is described in SOLR-507 item #1.  Also, this patch provides
a viable workaround for the problem discussed in SOLR-1074.  A dictionary could be created
that combines the terms from the multiple fields.  The collator then would prune out any spurious
suggestions this would cause.
> This patch adds the following spellcheck parameters:
> 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try before
giving up.  Lower values ensure better performance.  Higher values may be necessary to find
a collation that can return results.  Default is 0, which maintains backwards-compatible behavior
(do not check collations).
> 2. spellcheck.maxCollations - maximum # of collations to return.  Default is 1, which
maintains backwards-compatible behavior.
> 3. spellcheck.collateExtendedResult - if true, returns an expanded response format detailing
collations found.  Default is false, which maintains backwards-compatible behavior.  When
true, output is like this (in context):
> <lst name="spellcheck">
> 	<lst name="suggestions">
> 		<lst name="hopq">
> 			<int name="numFound">94</int>
> 			<int name="startOffset">7</int>
> 			<int name="endOffset">11</int>
> 			<arr name="suggestion">
> 				<str>hope</str>
> 				<str>how</str>
> 				<str>hope</str>
> 				<str>chops</str>
> 				<str>hoped</str>
> 				etc
> 			</arr>
> 		</lst>
> 		<lst name="faill">
> 			<int name="numFound">100</int>
> 			<int name="startOffset">16</int>
> 			<int name="endOffset">21</int>
> 			<arr name="suggestion">
> 				<str>fall</str>
> 				<str>fails</str>
> 				<str>fail</str>
> 				<str>fill</str>
> 				<str>faith</str>
> 				<str>all</str>
> 				etc
> 			</arr>
> 		</lst>
> 		<lst name="collation">
> 			<str name="collationQuery">Title:(how AND fails)</str>
> 			<int name="hits">2</int>
> 			<lst name="misspellingsAndCorrections">
> 				<str name="hopq">how</str>
> 				<str name="faill">fails</str>
> 			</lst>
> 		</lst>
> 		<lst name="collation">
> 			<str name="collationQuery">Title:(hope AND faith)</str>
> 			<int name="hits">2</int>
> 			<lst name="misspellingsAndCorrections">
> 				<str name="hopq">hope</str>
> 				<str name="faill">faith</str>
> 			</lst>
> 		</lst>
> 		<lst name="collation">
> 			<str name="collationQuery">Title:(chops AND all)</str>
> 			<int name="hits">1</int>
> 			<lst name="misspellingsAndCorrections">
> 				<str name="hopq">chops</str>
> 				<str name="faill">all</str>
> 			</lst>
> 		</lst>
> 	</lst>
> </lst>
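
To illustrate how a client might read the extended collation format shown above, here is a minimal sketch in Python using only the standard library. It is not part of the patch. The embedded XML is a trimmed, well-formed copy of the sample response, and the function name parse_collations is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response above (suggestion lists shortened,
# one of the three collations left out to keep the example small).
SAMPLE = """
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="hopq">
      <int name="numFound">94</int>
      <arr name="suggestion"><str>hope</str><str>how</str></arr>
    </lst>
    <lst name="collation">
      <str name="collationQuery">Title:(how AND fails)</str>
      <int name="hits">2</int>
      <lst name="misspellingsAndCorrections">
        <str name="hopq">how</str>
        <str name="faill">fails</str>
      </lst>
    </lst>
    <lst name="collation">
      <str name="collationQuery">Title:(chops AND all)</str>
      <int name="hits">1</int>
      <lst name="misspellingsAndCorrections">
        <str name="hopq">chops</str>
        <str name="faill">all</str>
      </lst>
    </lst>
  </lst>
</lst>
"""


def parse_collations(xml_text):
    """Return one dict per <lst name="collation"> entry (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    collations = []
    # Collation entries sit next to the per-word suggestion entries.
    for node in root.findall("./lst[@name='suggestions']/lst[@name='collation']"):
        corrections = {
            s.get("name"): s.text
            for s in node.findall("./lst[@name='misspellingsAndCorrections']/str")
        }
        collations.append({
            "query": node.findtext("./str[@name='collationQuery']"),
            "hits": int(node.findtext("./int[@name='hits']")),
            "corrections": corrections,
        })
    return collations


for c in parse_collations(SAMPLE):
    print(c["query"], c["hits"], c["corrections"])
```

Because the collation entries share a parent with the per-word suggestion entries, the sketch selects them by their name attribute rather than by position.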
> In addition, SOLRJ is updated to include SpellCheckResponse.getCollatedResults(), which
will return the expanded Collation format.  getCollatedResult(), which returns a single String,
is retained for backwards-compatibility.  Other APIs were not changed but will still work
provided that spellcheck.collateExtendedResult is false.
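
For clients that do not go through SolrJ, the three spellcheck parameters described above can presumably be sent as ordinary request parameters. The sketch below only builds the request URL. The host, port, and /select handler path are assumptions, as is a handler with the spellcheck component configured. The query string "Title:(hopq AND faill)" is inferred from the offsets in the sample response, and build_spellcheck_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_spellcheck_url(base_url, q, max_tries=10, max_collations=3, extended=True):
    """Build a request URL carrying the collation parameters (hypothetical helper)."""
    params = {
        "q": q,
        "spellcheck": "true",
        "spellcheck.collate": "true",
        # The three parameters added by this patch:
        "spellcheck.maxCollationTries": max_tries,
        "spellcheck.maxCollations": max_collations,
        "spellcheck.collateExtendedResult": "true" if extended else "false",
    }
    return base_url + "?" + urlencode(params)


# Assumed local Solr instance; adjust to the actual deployment.
url = build_spellcheck_url("http://localhost:8983/solr/select",
                           "Title:(hopq AND faill)")
print(url)
```

Leaving spellcheck.maxCollationTries at 0 and spellcheck.maxCollations at 1 would, per the description above, keep the backwards-compatible behavior.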
> This likely will not return valid results if using Shards.  Rather, a more robust interaction
with the index would be necessary than what exists in SpellCheckCollator.collate().
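
The idea behind maxCollationTries and maxCollations can be sketched roughly as follows. This is a simplified illustration, not the code in the patch. The count_hits callable is a hypothetical stand-in for re-querying the index, and the candidate ordering (a plain cartesian product), the naive string replacement, and the handling of max_tries=0 are assumptions made for the example.

```python
from itertools import islice, product


def collate(query, suggestions, count_hits, max_tries=0, max_collations=1):
    """Return up to max_collations (query, hits) pairs.

    suggestions maps each misspelled word to an ordered list of corrections.
    With max_tries == 0 candidates are not checked against the index
    (assumed here to mirror the backwards-compatible default) and hits is None.
    """
    words = list(suggestions)
    combos = product(*(suggestions[w] for w in words))
    verify = max_tries > 0
    # Without verification, only as many candidates as will be returned are built.
    limit = max_tries if verify else max_collations
    results = []
    for combo in islice(combos, limit):
        candidate = query
        for wrong, right in zip(words, combo):
            # Naive replacement; a real collator would use the token offsets.
            candidate = candidate.replace(wrong, right, 1)
        hits = count_hits(candidate) if verify else None
        if not verify or hits > 0:
            results.append((candidate, hits))
            if len(results) == max_collations:
                break
    return results


# Toy "index": hit counts taken from the sample response, everything else 0.
fake_hits = {"Title:(how AND fails)": 2, "Title:(hope AND faith)": 2}
suggestions = {"hopq": ["hope", "how"], "faill": ["fall", "fails", "faith"]}
found = collate("Title:(hopq AND faill)", suggestions,
                lambda q: fake_hits.get(q, 0),
                max_tries=6, max_collations=2)
print(found)
```

A low max_tries bounds the number of re-queries, which is the performance trade-off the parameter description mentions: with max_tries=2 the toy example above finds nothing, because the first two combinations return no hits.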

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

