lucene-dev mailing list archives

From "Antony Bowesman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1150) The token types of the standard tokenizer is not accessible
Date Tue, 15 Apr 2008 07:35:12 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588953#action_12588953 ]

Antony Bowesman commented on LUCENE-1150:
-----------------------------------------

The original tokenImage String array from 2.2 is still not available in this patch; it is
still in the Impl. Its entries are the values returned from Token.type(), so should they not
be visible as well as the static ints?


> The token types of the standard tokenizer is not accessible
> -----------------------------------------------------------
>
>                 Key: LUCENE-1150
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1150
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>    Affects Versions: 2.3
>            Reporter: Nicolas Lalevée
>            Assignee: Michael McCandless
>             Fix For: 2.3.2, 2.4
>
>         Attachments: LUCENE-1150.patch, LUCENE-1150.take2.patch
>
>
> Because StandardTokenizerImpl is not public, these token types are not accessible:
> {code:java}
> public static final int ALPHANUM          = 0;
> public static final int APOSTROPHE        = 1;
> public static final int ACRONYM           = 2;
> public static final int COMPANY           = 3;
> public static final int EMAIL             = 4;
> public static final int HOST              = 5;
> public static final int NUM               = 6;
> public static final int CJ                = 7;
> /**
>  * @deprecated this solves a bug where HOSTs that end with '.' are identified
>  *             as ACRONYMs. It is deprecated and will be removed in the next
>  *             release.
>  */
> public static final int ACRONYM_DEP       = 8;
> public static final String [] TOKEN_TYPES = new String [] {
>     "<ALPHANUM>",
>     "<APOSTROPHE>",
>     "<ACRONYM>",
>     "<COMPANY>",
>     "<EMAIL>",
>     "<HOST>",
>     "<NUM>",
>     "<CJ>",
>     "<ACRONYM_DEP>"
> };
> {code}
> So no custom TokenFilter can be based on the token type. In fact, even the StandardFilter
> cannot be written outside the org.apache.lucene.analysis.standard package.
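
To illustrate why both the int constants and the type strings need to be visible, here is a
minimal, self-contained sketch. It does not use Lucene: TokenTypeSketch, TypedToken, typeIndex
and dropType are hypothetical names invented for this example, and only the constant values and
the type strings are copied from the issue description above. It shows the kind of type-based
decision a custom filter would want to make if the types were public.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a public holder of the token types; NOT a Lucene class.
final class TokenTypeSketch {
    // A few of the int constants, with the values listed in the issue description.
    static final int ALPHANUM = 0;
    static final int EMAIL = 4;
    static final int HOST = 5;

    // The type strings, in the same order as the int constants.
    static final String[] TOKEN_TYPES = {
        "<ALPHANUM>", "<APOSTROPHE>", "<ACRONYM>", "<COMPANY>", "<EMAIL>",
        "<HOST>", "<NUM>", "<CJ>", "<ACRONYM_DEP>"
    };

    // Minimal stand-in for a token: its text plus the type string a tokenizer reports.
    static final class TypedToken {
        final String term;
        final String type;

        TypedToken(String term, String type) {
            this.term = term;
            this.type = type;
        }
    }

    // Map a type string back to its int constant, or -1 if the string is unknown.
    static int typeIndex(String type) {
        return Arrays.asList(TOKEN_TYPES).indexOf(type);
    }

    // Return the terms of all tokens whose type is not the unwanted one.
    static List<String> dropType(List<TypedToken> tokens, int unwanted) {
        List<String> kept = new ArrayList<>();
        for (TypedToken t : tokens) {
            if (typeIndex(t.type) != unwanted) {
                kept.add(t.term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TypedToken> tokens = Arrays.asList(
            new TypedToken("contact", TOKEN_TYPES[ALPHANUM]),
            new TypedToken("dev@example.org", TOKEN_TYPES[EMAIL]),
            new TypedToken("example.org", TOKEN_TYPES[HOST]));
        // Drop e-mail tokens and print the remaining terms.
        System.out.println(dropType(tokens, EMAIL));
    }
}
```

Without public access to the constants or the strings, code outside the package would have to
hard-code literals such as "<EMAIL>" to make the same decision.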

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

