Package nzilbb.editpath
Class MinimumEditPath<T>
- java.lang.Object
-
- nzilbb.editpath.MinimumEditPath<T>
-
- Direct Known Subclasses:
MinimumEditPathString
public class MinimumEditPath<T> extends Object
Implementation of the Wagner-Fischer algorithm to determine the minimum edit path (or distance) between two sequences. While traditionally this is between two strings (sequences of characters), the implementation uses templates to support sequences of any type, and interfaces for comparators, to support different settings and methods for determining edit distance.A traditional Levenstein distance calculation can be achieved with MinimumEditPath<Character>.
e.g. The edit path between two lists of Integers can be determined using:
MinimumEditPath<Integer> mp = new MinimumEditPath<Integer>(); List<EditStep<Integer>> path = mp.minimumEditPath(vFrom, vTo) for (EditStep<Integer> step: path) { System.out.println("from " + step.getFrom() + " to " + step.getTo() + " : " + step.getOperation() + " distance " + step.getStepDistance()); }
The equality comparison can be customized, e.g.:
MinimumEditPath<String> mp = new MinimumEditPath<String>( new DefaultEditComparator<String>(new EqualsComparator<String>() { public int compare(String o1, String o2) { return o1.toLowerCase().compareTo(o2.toLowerCase()); } }));
- Author:
- Robert Fromont robert@fromont.net.nz
-
-
Constructor Summary
Constructors Constructor Description MinimumEditPath()
Default constructorMinimumEditPath(EditComparator<T> comparator)
Constructor
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description List<EditStep<T>>
collapse(List<EditStep<T>> path)
Collapses edit path so that subsequent delete/create steps are collapsed into a single change step, except those that would result in a change more than 3 times more costly than a delete and insert.List<EditStep<T>>
collapse(List<EditStep<T>> path, boolean ifChangeIsReasonable)
Collapses edit path so that subsequent delete/create steps are collapsed into a single change step.double
errorRate(List<EditStep<T>> path)
Return the error rate for the given path.EditComparator<T>
getComparator()
Getter forcomparator
: The comparator used when determining the distance between two sequence elements.double
minimumEditDistance(List<T> from, List<T> to)
Computes the minimum edit distance between two sequences.List<EditStep<T>>
minimumEditPath(List<T> from, List<T> to)
Computes the minimum path from one sequence to another.MinimumEditPath<T>
setComparator(EditComparator<T> Newcomparator)
Setter forcomparator
: The comparator used when determining the distance between two sequence elements.
-
-
-
Constructor Detail
-
MinimumEditPath
public MinimumEditPath()
Default constructor
-
MinimumEditPath
public MinimumEditPath(EditComparator<T> comparator)
Constructor- Parameters:
comparator
- Element comparator to use.
-
-
Method Detail
-
getComparator
public EditComparator<T> getComparator()
Getter forcomparator
: The comparator used when determining the distance between two sequence elements.- Returns:
- The comparator used when determining the distance between two sequence elements.
-
setComparator
public MinimumEditPath<T> setComparator(EditComparator<T> Newcomparator)
Setter forcomparator
: The comparator used when determining the distance between two sequence elements.- Parameters:
Newcomparator
- The comparator used when determining the distance between two sequence elements.
-
minimumEditPath
public List<EditStep<T>> minimumEditPath(List<T> from, List<T> to)
Computes the minimum path from one sequence to another.- Parameters:
from
- The source (original) sequence.to
- The destination (final) sequence.- Returns:
- The edit path between the two sequences that has the minimum edit distance.
-
minimumEditDistance
public double minimumEditDistance(List<T> from, List<T> to)
Computes the minimum edit distance between two sequences.- Parameters:
from
- The source (original) sequence.to
- The destination (final) sequence.- Returns:
- The minimum edit distance between the two sequences.
-
collapse
public List<EditStep<T>> collapse(List<EditStep<T>> path)
Collapses edit path so that subsequent delete/create steps are collapsed into a single change step, except those that would result in a change more than 3 times more costly than a delete and insert.- Parameters:
path
- The path to collapse- Returns:
- The given path, with possibly fewer steps.
-
collapse
public List<EditStep<T>> collapse(List<EditStep<T>> path, boolean ifChangeIsReasonable)
Collapses edit path so that subsequent delete/create steps are collapsed into a single change step.- Parameters:
path
- The path to collapseifChangeIsReasonable
- If true, then any collapsing that would result in a change more than 3 times more costly than a delete and insert will not be collapsed. If false, all insert/delete pairs will be collapsed regardless of cost.- Returns:
- The given path, with possibly fewer steps.
-
-