lucene-java-user mailing list archives

From Cristian Lorenzetto <cristian.lorenze...@gmail.com>
Subject Re: Binary Automaton
Date Sat, 30 Sep 2017 14:33:50 GMT
*to @Uwe Schindler*

Thanks, it is very interesting :)

*to @Dawid*

Preface: I don't know in detail how automata are implemented inside
Lucene, but (considering that the automaton is built on the fly when the
index already exists) I imagine that the automaton scans the terms/tokens
stored in the Lucene index to find the document references (solution 1).
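
To make solution 1 concrete, here is a minimal sketch in plain Java. It is
not Lucene's implementation: `ByteDfa`, `TermEntry`, `magicPrefixDfa` and
`search` are hypothetical names invented for illustration, and the "term
dictionary" is just an in-memory list. It only shows an automaton whose
transition labels are raw bytes being run over every indexed term to
collect the matching document ids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ByteDfaSketch {

    /** Deterministic automaton whose transition labels are byte values 0..255. */
    static final class ByteDfa {
        private final int[][] next;     // next[state][label] = target state, -1 = none
        private final boolean[] accept; // accept[state] = true for accepting states

        ByteDfa(int states) {
            next = new int[states][256];
            for (int[] row : next) Arrays.fill(row, -1);
            accept = new boolean[states];
        }

        void addTransition(int from, int label, int to) { next[from][label & 0xFF] = to; }

        void setAccept(int state) { accept[state] = true; }

        boolean matches(byte[] input) {
            int state = 0; // state 0 is the start state
            for (byte b : input) {
                state = next[state][b & 0xFF]; // mask: Java bytes are signed
                if (state < 0) return false;   // no transition: reject
            }
            return accept[state];
        }
    }

    /** Hypothetical stand-in for one term dictionary entry: term bytes plus doc ids. */
    static final class TermEntry {
        final byte[] term;
        final int[] docIds;

        TermEntry(byte[] term, int... docIds) {
            this.term = term;
            this.docIds = docIds;
        }
    }

    /** Accepts any byte sequence that starts with the two bytes 0xCA 0xFE. */
    static ByteDfa magicPrefixDfa() {
        ByteDfa dfa = new ByteDfa(3);
        dfa.addTransition(0, 0xCA, 1);
        dfa.addTransition(1, 0xFE, 2);
        for (int label = 0; label < 256; label++) dfa.addTransition(2, label, 2);
        dfa.setAccept(2);
        return dfa;
    }

    /** Runs the automaton over every term and collects the doc ids of accepted terms. */
    static List<Integer> search(ByteDfa dfa, List<TermEntry> terms) {
        List<Integer> hits = new ArrayList<>();
        for (TermEntry entry : terms) {
            if (dfa.matches(entry.term)) {
                for (int docId : entry.docIds) hits.add(docId);
            }
        }
        return hits;
    }

    static List<TermEntry> sampleTerms() {
        List<TermEntry> terms = new ArrayList<>();
        terms.add(new TermEntry(new byte[] {(byte) 0xCA, (byte) 0xFE, 0x01}, 1, 4));
        terms.add(new TermEntry(new byte[] {(byte) 0xCA, 0x00}, 2));
        terms.add(new TermEntry(new byte[] {(byte) 0xCA, (byte) 0xFE}, 3));
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(search(magicPrefixDfa(), sampleTerms())); // prints [1, 4, 3]
    }
}
```

The sketch tests every term one by one for simplicity; a real index would
presumably exploit the sorted order of its terms to skip ranges the
automaton cannot accept.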
In my opinion there are two different generic ways of using automata:
1) Create an automaton that parses the tokens present in the Lucene
index, as described above.
2) Create a pattern-matching automaton (on binary data, or better on an
abstract stream, which would be more generic) and put its states directly
in an index. In this case you can retrieve very quickly the documents
matching a specific automaton built when the index was created (or a
sub-automaton representing a subset of the same states). The second
solution could perhaps be used to map a complex structure into a single
Lucene document field, so that nested, embedded information can then be
found. That way I would not need to use multiple Lucene documents (which
could create performance and scalability problems).
In many cases this solution could be faster than the current joins, for
example, and it could be useful in bioinformatics or in any case where
the data is not a basic ADT.
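
One possible reading of solution 2, sketched in plain Java under loud
assumptions: the automaton is fixed at index time, each document's field
bytes are run through it once, and a posting list is kept per automaton
state. A query for a sub-automaton then becomes an intersection of the
posting lists of the states it requires, with no rescanning of the data.
`StateIndexSketch` and all its methods are hypothetical names, not a
Lucene API, and the DNA-like alphabet is only an example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

class StateIndexSketch {
    private final int[][] next;          // next[state][label] = target state, -1 = none
    private final List<BitSet> postings; // postings.get(state) = docs that reached state

    StateIndexSketch(int states) {
        next = new int[states][256];
        for (int[] row : next) Arrays.fill(row, -1);
        postings = new ArrayList<>();
        for (int i = 0; i < states; i++) postings.add(new BitSet());
    }

    void addTransition(int from, int label, int to) { next[from][label & 0xFF] = to; }

    /** Index time: run the field through the automaton, recording every state reached. */
    void indexDocument(int docId, byte[] field) {
        int state = 0; // state 0 is the start state
        postings.get(state).set(docId);
        for (byte b : field) {
            state = next[state][b & 0xFF];
            if (state < 0) return; // no transition: stop recording for this document
            postings.get(state).set(docId);
        }
    }

    /** Query time: documents that reached all of the given states. */
    BitSet docsReachingAll(int... states) {
        BitSet result = null;
        for (int state : states) {
            if (result == null) result = (BitSet) postings.get(state).clone();
            else result.and(postings.get(state));
        }
        return result == null ? new BitSet() : result;
    }

    /** Tiny example automaton: 0 -G-> 1, 1 -A-> 2, 1 -C-> 3, with three documents. */
    static StateIndexSketch sample() {
        StateIndexSketch index = new StateIndexSketch(4);
        index.addTransition(0, 'G', 1);
        index.addTransition(1, 'A', 2);
        index.addTransition(1, 'C', 3);
        index.indexDocument(0, "GA".getBytes(StandardCharsets.US_ASCII));
        index.indexDocument(1, "GC".getBytes(StandardCharsets.US_ASCII));
        index.indexDocument(2, "T".getBytes(StandardCharsets.US_ASCII));
        return index;
    }

    public static void main(String[] args) {
        StateIndexSketch index = sample();
        System.out.println(index.docsReachingAll(1));    // prints {0, 1}
        System.out.println(index.docsReachingAll(1, 2)); // prints {0}
    }
}
```

The trade-off in this sketch is that only questions expressible in terms of
the states chosen at index time can be answered; a different automaton
would require reindexing.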

Cristian

2017-09-30 12:24 GMT+02:00 Dawid Weiss <dawid.weiss@gmail.com>:

> > Hi, is it possible to create an Automaton in Lucene that parses not a
> > string but a byte array?
>
> Can you state what problem you are trying to solve? This seems to be a
> question stripped of a more general context -- why do you need those
> byte-based automata?
>
> Dawid
