Date: Mon, 29 Apr 2013 19:38:16 +0000 (UTC)
From: "Thomas Neidhart (JIRA)"
To: issues@commons.apache.org
Reply-To: issues@commons.apache.org
Subject: [jira] [Commented] (COLLECTIONS-404) Adding an implementation of Eugene Myers difference algorithm

    [ https://issues.apache.org/jira/browse/COLLECTIONS-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644782#comment-13644782 ]

Thomas Neidhart commented on COLLECTIONS-404:
---------------------------------------------

Moved the package in r1477287.

Additionally, as a best practice in commons, made the object in EditCommand private and added a getter.

For the Commands, I am now unsure if the refactoring really makes sense. We could change the append methods in EditScript to be similar to the Visitor (e.g. appendInsertCommand, appendKeepCommand, ...)
and thus completely hide this implementation detail in the EditScript (which is a good thing in commons due to the strict API rules). OTOH, the current API is also good OO design, so I am inclined to keep it as is.

My original idea was to merge commands: the EditScript would check whether the last command is of the same kind as the current one and, if so, merge them, so each command would hold a list of T instead of a single T. This would save memory, as we would not need to instantiate a new command for every element in a run of equal commands (which can be an issue for large sequences). But the trade-off would be creating a List for each command, so the gain may not be as great as originally thought.

> Adding an implementation of Eugene Myers difference algorithm
> -------------------------------------------------------------
>
>                 Key: COLLECTIONS-404
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-404
>             Project: Commons Collections
>          Issue Type: Improvement
>          Components: Collection
>    Affects Versions: 3.2.1
>         Environment: all
>            Reporter: Luc Maisonobe
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: commons-collections-difference.patch, commons-collections-difference-v2.patch, comparator.zip, DiffTest.java
>
>
> The difference algorithm aims at comparing two sequences of objects and return an "edit script" which represents how one can transform the first sequence into the second sequence. The script describes the various insert object, delete object and keep object commands. The script is guaranteed to be the shortest possible in terms of number of commands.
> From the script, one can either extract longest common sub-sequences (i.e. how similar the sequences are) or on the contrary the needed changes (i.e. how different the sequences are).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
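
For illustration only, the command-merging idea from the comment above could look roughly like the sketch below. All names here (MergingEditScript, Run, Kind, appendInsert, ...) are hypothetical and are not the Commons Collections API; the sketch assumes the script exposes one append method per command kind and stores runs of equal commands rather than one command object per element.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a merging edit script; NOT the Commons Collections API. */
final class MergingEditScript<T> {

    /** The three command kinds named in the issue description. */
    enum Kind { INSERT, DELETE, KEEP }

    /** A run of consecutive commands of the same kind, holding all of their objects. */
    static final class Run<T> {
        private final Kind kind;
        private final List<T> objects = new ArrayList<>();

        Run(Kind kind) {
            this.kind = kind;
        }

        Kind getKind() {
            return kind;
        }

        List<T> getObjects() {
            return objects;
        }
    }

    private final List<Run<T>> runs = new ArrayList<>();

    // One append method per kind, so callers never construct command objects themselves.
    void appendInsert(T object) { append(Kind.INSERT, object); }
    void appendDelete(T object) { append(Kind.DELETE, object); }
    void appendKeep(T object)   { append(Kind.KEEP, object); }

    /** Adds the object to the last run if the kind matches, otherwise starts a new run. */
    private void append(Kind kind, T object) {
        Run<T> last = runs.isEmpty() ? null : runs.get(runs.size() - 1);
        if (last == null || last.getKind() != kind) {
            last = new Run<>(kind);
            runs.add(last);
        }
        last.getObjects().add(object);
    }

    List<Run<T>> getRuns() {
        return runs;
    }
}
```

With this layout, appending keep "a", keep "b", insert "c", keep "d" produces three runs instead of four command objects. As noted in the comment, every run still allocates its own list, so any saving depends on how long the runs of equal commands are.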