lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Commented: (SOLR-1057) PathTokenizerFactory
Date Wed, 02 Feb 2011 16:15:33 GMT


Robert Muir commented on SOLR-1057:

I'm a little confused about the use of the tokenizer (I have no problems technically; it's
maybe a naming issue?)

Is this intended for tokenizing file pathnames, as its name would suggest? In that case I think
the path components should have positions, e.g. /foo/bar/whatever.txt is foo(1), bar(1), whatever.txt(1)?

It seems that this one is instead intended for representing hierarchies, as it creates synonyms
of /foo, /foo/bar, /foo/bar/whatever.txt... with position increments of zero.

I guess I'm just being picky about naming, but I think this hierarchical case is more specific
than 'tokenizing file pathnames', and maybe a name like HierarchyTokenizer (this one probably
isn't the best either!) would better represent what it does?
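
To make the two readings concrete, here is a minimal sketch in plain Java, not the code from the attached patch. The class and method names (PathTokenSketch, componentTokens, hierarchyTokens) are hypothetical, and each token is written in the term(increment) notation used above. It assumes that the first hierarchy token gets an increment of 1 and that prefixes carry no trailing slash; the patch itself may differ on both points.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this does not use or reproduce any Lucene/Solr API.
class PathTokenSketch {

    // "File pathname" reading: one token per path component,
    // each advancing the position by one.
    static List<String> componentTokens(String path) {
        List<String> tokens = new ArrayList<>();
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                tokens.add(part + "(1)");
            }
        }
        return tokens;
    }

    // "Hierarchy" reading: one token per ancestor prefix. Every token after
    // the first has a position increment of zero, so all prefixes are stacked
    // at the same position, like synonyms.
    // Assumption: the first token uses increment 1.
    static List<String> hierarchyTokens(String path) {
        List<String> tokens = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue;
            }
            prefix.append('/').append(part);
            int increment = tokens.isEmpty() ? 1 : 0;
            tokens.add(prefix + "(" + increment + ")");
        }
        return tokens;
    }

    public static void main(String[] args) {
        String path = "/foo/bar/whatever.txt";
        System.out.println(componentTokens(path));
        System.out.println(hierarchyTokens(path));
    }
}
```

For /foo/bar/whatever.txt the first method yields foo(1), bar(1), whatever.txt(1), while the second yields /foo(1), /foo/bar(0), /foo/bar/whatever.txt(0). Only the first form keeps the components at distinct positions, which is what phrase-style queries over path components would rely on.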

> PathTokenizerFactory
> --------------------
>                 Key: SOLR-1057
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>         Attachments: SOLR-1057-PathTokenizerFactory.patch, SOLR-1057-PathTokenizerFactory.patch,
> This is a Tokenizer that splits the input string into a series of paths.  For example:
> {panel}
>  /aaa/bbb/ccc
> {panel}
> becomes:
> {panel}
>  /aaa/
>  /aaa/bbb/
>  /aaa/bbb/ccc
> {panel}
