lucene-dev mailing list archives

From "Dawid Weiss (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3206) FST package API refactoring
Date Thu, 16 Jun 2011 11:01:48 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050347#comment-13050347 ]

Dawid Weiss commented on LUCENE-3206:
-------------------------------------

This is my take on the revamped FST API. My changes mostly aim at making the code a bit
clearer (especially wrt. loops), but also at detaching the "algebra" of a transition's output
from the actual output. This should allow us to create an output algebra that works directly
on mutable integers, for example (to save on autoboxing). I also just like the way it reads
after the changes:
{code}
      FST<Integer> fst = FSTBuilder.fst(FST.ArcLabel.BYTE2, PositiveInt.class)
        .add("abc", 10)
        .add("abc", 5)
        .add("def", 0, 3), 2)
        .build();
{code}
or a loop over all arcs of a state:
{code}
      Arc<Integer> arc = fst.getRoot();
      for (Arc<Integer> tmp = arc.copy(); tmp.hasNext(); tmp.next()) {
        int label = tmp.getLabel();     // transition label here.
        Integer output = tmp.getOutput(); // FSAs have a constant empty output.
      }
{code}

I definitely didn't consider all the use cases that FSTs currently serve (in particular
the "stop" bit indicating non-accepted input sequences that are also dead ends), but I think
these could be integrated... I think :)

Arcs now also store the pointer to the FST object, which may seem like overhead, but I
doubt it really will be (it's a single pointer and we buffer arcs whenever we can; a larger
waste is having an object on each arc's output, even when it could be a primitive type or a
reused buffer).
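
To make the "output algebra" idea concrete, here is a minimal sketch of an algebra that works on mutable integers and writes results into a caller-supplied target, so no boxing happens. All names in it (MutableInt, OutputAlgebra, MutableIntSum, OutputAlgebraSketch) are hypothetical and are not taken from the attached patch. The min/subtract/add semantics are an assumption modelled on how positive integer outputs are usually combined in an FST.

```java
/** Hypothetical mutable int holder (an assumption, not part of the patch). */
final class MutableInt {
  int value;

  MutableInt(int value) {
    this.value = value;
  }
}

/**
 * Hypothetical algebra over outputs of type T. Every operation writes its
 * result into a caller-supplied target, so no new object is allocated.
 */
interface OutputAlgebra<T> {
  /** Part of the output shared by a and b. */
  void common(T a, T b, T target);

  /** What remains of a once b has been taken out of it. */
  void subtract(T a, T b, T target);

  /** Combined output of a followed by b. */
  void add(T a, T b, T target);

  /** True if a is the empty ("no output") element. */
  boolean isEmpty(T a);
}

/** Sum-based algebra over non-negative ints; zero is the empty output. */
final class MutableIntSum implements OutputAlgebra<MutableInt> {
  @Override
  public void common(MutableInt a, MutableInt b, MutableInt target) {
    target.value = Math.min(a.value, b.value);
  }

  @Override
  public void subtract(MutableInt a, MutableInt b, MutableInt target) {
    // Assumes b.value <= a.value, which holds when b came from common(a, x).
    target.value = a.value - b.value;
  }

  @Override
  public void add(MutableInt a, MutableInt b, MutableInt target) {
    target.value = a.value + b.value;
  }

  @Override
  public boolean isEmpty(MutableInt a) {
    return a.value == 0;
  }
}

final class OutputAlgebraSketch {
  public static void main(String[] args) {
    OutputAlgebra<MutableInt> algebra = new MutableIntSum();
    MutableInt first = new MutableInt(10);
    MutableInt second = new MutableInt(5);
    MutableInt shared = new MutableInt(0);
    MutableInt rest = new MutableInt(0);

    // Push the shared part towards the root, keep the remainder further down.
    algebra.common(first, second, shared);
    algebra.subtract(first, shared, rest);
    System.out.println("shared=" + shared.value + ", rest=" + rest.value);
  }
}
```

The point of the target parameter is that a builder could keep a handful of reusable MutableInt instances instead of creating an Integer per arc; whether the patch takes this route is not stated above.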




> FST package API refactoring
> ---------------------------
>
>                 Key: LUCENE-3206
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3206
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.2
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 3.3, 4.0
>
>         Attachments: LUCENE-3206.patch
>
>
> The current API is still marked @experimental, so I think there's still time to fiddle
> with it. I've been using the current API for some time and I do have some ideas for improvement.
> This is a placeholder for these -- I'll post a patch once I have a working proof of concept.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org