lucene-java-user mailing list archives

From José Tomás Atria <jtat...@gmail.com>
Subject Re: Persistence/Serialization of Automaton
Date Thu, 24 Mar 2016 17:20:46 GMT
Hi Mike,

Thanks for your reply. I had assumed, as you say, that an automaton is
just a couple of int arrays, so I looked at the code for
Automaton.copy( Automaton other ), and that is in fact what it copies
from the other Automaton:
int[] states
int[] transitions

But I got confused, because the copying code also refers to what look
like state variables in the source object:
int nextState
int nextTransition

So I'm not sure whether it's possible to, for example, reconstruct an
automaton merely from the states and transitions int[] arrays, or whether
I also need to pay attention to the nextState and nextTransition values.
I have no idea what they are, whether they are immutable, etc. I have been
using factory methods to construct all of my automata from strings, so I
don't understand what these variables mean, or whether they are relevant to
the automaton's _definition_ as opposed to its construction or execution.
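
For what it's worth, here is a minimal sketch of the approach Mike
describes: write the state count, the accept flags, and the transitions
with their min/max labels to a byte stream, then read them back.
AutomatonCodec, Snapshot and Edge are hypothetical names, not Lucene
classes. The assumption is that a snapshot would be filled through
Automaton's public accessors (getNumStates(), isAccept(int), and
initTransition()/getNextTransition() with a Transition object) and turned
back into an automaton with createState(), setAccept(), addTransition()
and finishState(), so the internal arrays and counters would not need to
be touched at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; models only the data an automaton carries. */
public class AutomatonCodec {

    /** One transition: source -> dest on any label in [min, max]. */
    public static final class Edge {
        public final int source, dest, min, max;

        public Edge(int source, int dest, int min, int max) {
            this.source = source;
            this.dest = dest;
            this.min = min;
            this.max = max;
        }
    }

    /** Plain snapshot: one accept flag per state plus all transitions. */
    public static final class Snapshot {
        public final boolean[] accept;
        public final List<Edge> edges = new ArrayList<>();

        public Snapshot(int numStates) {
            this.accept = new boolean[numStates];
        }
    }

    /** Layout: numStates, accept flags, numEdges, then 4 ints per edge. */
    public static byte[] write(Snapshot s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(s.accept.length);
            for (boolean a : s.accept) {
                out.writeBoolean(a);
            }
            out.writeInt(s.edges.size());
            for (Edge e : s.edges) {
                out.writeInt(e.source);
                out.writeInt(e.dest);
                out.writeInt(e.min);
                out.writeInt(e.max);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Reads back exactly what write() produced, in the same order. */
    public static Snapshot read(byte[] data) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            Snapshot s = new Snapshot(in.readInt());
            for (int i = 0; i < s.accept.length; i++) {
                s.accept[i] = in.readBoolean();
            }
            int numEdges = in.readInt();
            for (int i = 0; i < numEdges; i++) {
                // Java evaluates arguments left to right, matching write().
                s.edges.add(new Edge(in.readInt(), in.readInt(),
                                     in.readInt(), in.readInt()));
            }
            return s;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Round-tripping a two-state snapshot with one transition on 'a'..'z'
through write() and read() should give back the same accept flags and
the same transition.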

Thanks!
jta



On Thu, Mar 24, 2016 at 12:54 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Lucene no longer has Serializable on its classes: the
> cross-java-version implications are too difficult.  So we expect/rely
> on the user layer above Lucene to handle any serialization needs.
>
> That said, serializing an automaton should be quite simple since the
> data structure is just int node IDs, marked as accept nodes or not,
> with connecting transitions that have min/max labels.  You could write
> that to your own byte stream and re-build the automaton on
> deserializing.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Mar 24, 2016 at 12:08 PM, Erick Erickson
> <erickerickson@gmail.com> wrote:
> > I'm really out of my league here, but some of the suggester stuff
> > builds an image on disk and some of the implementations use FSTs,
> > which are at least in the ballpark.
> >
> > What I'm saying here is that the code may already be in place, or at
> > least a place to start.
> >
> > And I have to ask, "why do you want to do this in the first place?".
> > What is the problem you're trying to solve anyway?
> >
> > Best,
> > Erick
> >
> > On Thu, Mar 24, 2016 at 6:57 AM, McKinley, James T
> > <james.mckinley@cengage.com> wrote:
> >> Here's an archive link from this mailing list regarding serializing
> >> queries; I guess this would work for Automaton objects as well.
> >>
> >> http://mail-archives.apache.org/mod_mbox/lucene-java-user/201603.mbox/browser
> >>
> >> Hope it helps.
> >>
> >> Jim
> >> ________________________________________
> >> From: José Tomás Atria <jtatria@gmail.com>
> >> Sent: 23 March 2016 19:09
> >> To: java-user@lucene.apache.org
> >> Subject: Persistence/Serialization of Automaton
> >>
> >> Hello!
> >>
> >> Is it possible to serialize Lucene's Automata? I see that the javadoc for
> >> the original BRICS package indicates that instances of Automaton implement
> >> Serializable, but this is not the case with the Automaton class in Lucene 5+.
> >>
> >> I assume it is possible, considering that an FSA is basically just a set of
> >> states and transitions, but how would I go about (1) extracting that data
> >> from an instance of Automaton and (2) recreating the original automaton
> >> given a set of transitions and states as it would be possible to obtain
> >> them from a live instance?
> >>
> >> Alternatively, maybe there is some other place where this is implemented?
> >> How can I persist Lucene's automata?
> >>
> >> thanks,
> >> jta
> >>
> >> --
> >> entia non sunt multiplicanda praeter necessitatem
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>


-- 
entia non sunt multiplicanda praeter necessitatem
