DP is Simple: Edit Distance

Problem Statement

Given two strings find the minimum number of edits that should be done to transform one word after another. The actions can be:
  • Replace
  • Create
  • Delete
For eg. If you've Game and Gate. You can perform one Replace operation to transform Game to Gate.

How to think about a solution ?

Lay down the three operations:
  • Game => Gate (Replace)
  • Hat => at (Delete)
  • Love => Loves (Insert)
You can say that at each character we need to make a decision at each point. Looks like a Tree problem, using recursion.

Here's a rough idea:

  • It'll be a recursive call.
  • At each point we need to choose: Insert, Delete and Replace.
  • Compute the length.
  • Return the minimum length in the recursion.
  • That's it!
Here is the code for it:

Since we're making 3 calls for each character. This can go to O(3^N) where N is the length of the string. Can we optimise it ?

The first thing to check here is that if we can find problems. Since we're going top to bottom for each decision we're computing the same thing down the tree.

Let's take a simple example

Transform ea -> eb

Let's create a matrix to represent operation at each character:
   e a
e 0 1
b 1 1

Here's how the matrix is created:

- [0, 0]:  e -> e
Takes no operation so the value is 0.

--------------------------------------------------------------------------

- [0, 1]: e -> ea
Takes one delete operation so the value is 1.

--------------------------------------------------------------------------

- [1, 0]: eb -> e
Takes one delete operation so the value is 1.

--------------------------------------------------------------------------

- [1, 1]: eb -> ea
In this step we'll use the precomputed values. Let's take them one by one:

if we take [0, 0], we'll have 1 operation.
  1. No operation for e -> e.
  2. Then replace for b -> a.
if we take [1, 0], we'll have 2 operations.
  1. Delete operation for ea -> e.
  2. Add operation for e -> eb.

if we take [1, 0], we'll have 2 operations.
  1. Delete operation for eb -> e.
  2. Add operation for e -> ea.
Since we're calculating the minimum distance, we'll take the one with minimum edits. So out of the three operations we'll choose first since it takes 1 operation.

Here's another example:

   g e s e k
g 0 1 2 3 4
e 1 0 1 2 3
e 2 1 2 1 2
k 3 2 3 2 1

Here's the DP implementation:

Comments