commons-dev mailing list archives

From James Carman <jcar...@carmanconsulting.com>
Subject Re: [lang] Longest common substring / Suffix Tree
Date Tue, 13 Mar 2012 23:36:10 GMT
It would have come in handy for me while doing bioinformatics work in the
past.  But, you're right, they have very cool tools out there.  I was
amazed at some of the stuff they can do.
On Mar 13, 2012 7:08 PM, "Thomas Neidhart" <thomas.neidhart@gmail.com>
wrote:

> On 03/13/2012 08:55 AM, Luc Maisonobe wrote:
> > On 13/03/2012 00:53, James Carman wrote:
> >> A lot of bioinformaticians would love us if we added this!
>
> I picked this topic up because I find it interesting myself, and I guess it
> would be a useful addition for many other people too. But from what I have
> seen so far, bioinformaticians wouldn't necessarily be impressed by
> that ;-). AFAIK they have pretty good tools, and there are special
> algorithms that compute suffix trees for really large strings on clusters
> or on disk, as they won't fit in memory anymore.
>
> > In the same spirit, I know an implementation of the Myers difference
> > algorithm that runs on any object implementing equals and also provides
> > an API for browsing the "edit script" resulting from the comparison.
> > This allows, for example, retrieving only the shared elements, or only
> > the ones in the first or the second sequence, or "running" the script,
> > or whatever.
> >
> > If you think this could be a good addition to [lang] or another
> > component ([graph] ?) I can ask for a grant for this.
>
> This would be a perfect companion for the longest common substring
> problem. The o.a.c.l.text package looks like a good fit for these things,
> IMHO.
>
> Thomas
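
For readers unfamiliar with the longest common substring problem discussed in this thread, here is a minimal Java sketch. It is not the suffix tree approach from the subject line and not code from any Commons component: it uses the simpler dynamic-programming formulation, which takes O(n*m) time, whereas a generalized suffix tree can solve the problem in linear time. The class and method names are hypothetical and chosen only for illustration.

```java
public class LongestCommonSubstring {

    /**
     * Returns one longest substring shared by a and b (the first one found
     * when scanning a from left to right), or "" if they share no character.
     * Hypothetical helper for illustration; not part of any Commons API.
     */
    static String longestCommonSubstring(String a, String b) {
        // prev[j] / curr[j]: length of the common suffix of a[0..i) and b[0..j)
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        int bestLen = 0;
        int bestEnd = 0; // exclusive end index of the best match in a

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    curr[j] = prev[j - 1] + 1;
                    if (curr[j] > bestLen) {
                        bestLen = curr[j];
                        bestEnd = i;
                    }
                } else {
                    curr[j] = 0;
                }
            }
            // Reuse the two rows instead of allocating a full n*m table.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return a.substring(bestEnd - bestLen, bestEnd);
    }
}
```

Keeping only two rows limits memory to O(m), but for the very large inputs mentioned above (genome-scale strings) the quadratic running time is the real limit, which is where suffix trees come in.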
