lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-2800) optimize RemoveDuplicatesTokenFilterFactory
Date Sat, 08 Sep 2012 02:59:07 GMT

     [ https://issues.apache.org/jira/browse/SOLR-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated SOLR-2800:
------------------------------

    Issue Type: Improvement  (was: Bug)
       Summary: optimize RemoveDuplicatesTokenFilterFactory  (was: RemoveDuplicatesTokenFilterFactory can not remove the duplicated term)
    
> optimize RemoveDuplicatesTokenFilterFactory
> -------------------------------------------
>
>                 Key: SOLR-2800
>                 URL: https://issues.apache.org/jira/browse/SOLR-2800
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>    Affects Versions: 3.4
>         Environment: Windows
>            Reporter: Han Hui Wen 
>            Assignee: Robert Muir
>              Labels: RemoveDuplicatesTokenFilterFactory, Solr
>
> RemoveDuplicatesTokenFilterFactory cannot remove duplicated terms.
> The current code in http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_3_4/solr/core/src/java/org/apache/solr/analysis/RemoveDuplicatesTokenFilter.java?view=markup is:
> @Override
> public boolean incrementToken() throws IOException {
>   while (input.incrementToken()) {
>     final char term[] = termAttribute.buffer();
>     final int length = termAttribute.length();
>     final int posIncrement = posIncAttribute.getPositionIncrement();
>
>     if (posIncrement > 0) {
>       previous.clear();
>     }
>
>     boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
>
>     // clone the term, and add to the set of seen terms.
>     char saved[] = new char[length];
>     System.arraycopy(term, 0, saved, 0, length);
>     previous.add(saved);
>
>     if (!duplicate) {
>       return true;
>     }
>   }
>   return false;
> }
> It should be like the following:
> @Override
> public boolean incrementToken() throws IOException {
>   while (input.incrementToken()) {
>     final char term[] = termAttribute.buffer();
>     final int length = termAttribute.length();
>     final int posIncrement = posIncAttribute.getPositionIncrement();
>
>     if (posIncrement > 0) {
>       previous.clear();
>     }
>
>     boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
>
>     if (duplicate) {
>       // skip this token; returning false here would signal end of stream
>       continue;
>     }
>
>     // clone the term, and add to the set of seen terms only when it is new.
>     char saved[] = new char[length];
>     System.arraycopy(term, 0, saved, 0, length);
>     previous.add(saved);
>     return true;
>   }
>   // input is exhausted
>   return false;
> }
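
The proposal above stores a term in the set of seen terms only when it is not a duplicate. A minimal, self-contained sketch of that idea follows. It uses plain Java collections instead of Lucene's attribute and set classes, so DedupSketch, Token and removeDuplicates are hypothetical names for illustration and are not part of Solr or Lucene. Because incrementToken() in a Lucene TokenStream returns false only at the end of the stream, the sketch skips a duplicate by moving on to the next token rather than stopping.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the filter logic; this is not the Lucene API.
final class DedupSketch {

    /** A token reduced to the two attributes the filter inspects. */
    static final class Token {
        final String term;
        final int posIncrement;

        Token(String term, int posIncrement) {
            this.term = term;
            this.posIncrement = posIncrement;
        }
    }

    /** Returns the terms that survive, dropping repeats at the same position. */
    static List<String> removeDuplicates(List<Token> input) {
        Set<String> previous = new HashSet<>(); // terms seen at the current position
        List<String> output = new ArrayList<>();

        for (Token token : input) {
            // A positive increment starts a new position, so forget earlier terms.
            if (token.posIncrement > 0) {
                previous.clear();
            }

            boolean duplicate =
                token.posIncrement == 0 && previous.contains(token.term);
            if (duplicate) {
                continue; // skip the token and keep reading
            }

            // Only new terms are stored, which avoids a copy for each duplicate.
            previous.add(token.term);
            output.add(token.term);
        }
        return output;
    }
}
```

A term is dropped only when it repeats within one position; the same term at a later position is kept because the set is cleared whenever the position increment is greater than zero.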

