Dynamic Programming - Optimal Path in Matrix

We are given an nxn matrix. The elements in the matrix are random, positive integers. We want to go from the top left corner to the bottom right by moving only right and down (no left, up or diagonal moves). Our goal is to move using the minimum cost path.

Let our matrix be the following:

5   3   10  17  1
4   2   9   8   5
11  12  3   9   6
1   3   4   2   10
7   11  13  7   3

Approach

To solve this problem we will use a Dynamic Programming approach.

Remember, we can only make two moves. That helps a lot in the design of the algorithm. Because we know where the optimal solution will come from. To reach a node we will do so either from its left or its top. Naturally the optimal solution follows this pattern.

Generally, for an element X, the optimal way to get to it from the top left passes through either its left or its top. To reach its left and top, we need paths of costs a and b. We pick the smallest of the two and we have the optimal solution to reach X.

So, to get to the bottom right corner, we need to pick either the left or top cost. To compute those values we need to know their left and top and so on. The base case is the top left element, where we need Matrix(0,0) to get to (as we’re starting from there).

So, the formula to find the path cost to get to the element (i,j) is this:

Path(i, j) = min{ Path(i-1, j), Path(i, j-1) } + Matrix(i, j)

We pick the minimum of the left (i,j-1) and top (i-1,j) costs and we add the value of the element. That’s the cost of the path to get to (i,j).

As the matrix has length of n, the cost to get to the bottom right is Path(n, n).

Implementation

(Using Python)

First we need our matrix. You can receive the matrix as you wish as long as it is nxn; reading it from a file, reading it from the user’s input, or any way you wish. Here we will manually pass the values in the source code.

matrix = [[5, 3, 10, 17, 1],
          [4, 2, 9, 8, 5],
          [11, 12, 3, 9, 6],
          [1, 3, 4, 2, 10],
          [7, 11, 13, 7, 3]];

We also need to initialize the array that will hold the solutions. This array needs to be the same dimensions as matrix. As matrix is square, we only need to get one of its dimensions. We can write:

n = len(matrix);

Our array to hold the solutions will be called solutions and will be an nxn array. We initialize it like this (all its values start from 0):

solutions = [[0 for i in range(n)] for j in range(n)];

For our base case, the first element of the matrix (0,0) will have a solution of matrix[0][0] (as we are starting from there).

solutions[0][0] = matrix[0][0];

Now we need to take a step back and think about the problem at hand. Is there any way we can make it easier on ourselves? Remember our equation:

Path(i, j) = min{ Path(i-1, j), Path(i, j-1) } + Matrix(i, j)

Think of what will happen on the first collumn and the first row of the matrix. These border cases pose a problem, as the index in Path will get out of range. In the case of the first collumn, j = 0 and j-1 = -1, therefore Path(i,j-1) goes out of range. Likewise for the first row.

Even though this looks like an obstacle, we can actually use it to our advantage. Because we already know what the solution to these border cases is. We take out the out-of-range Path and we use the rest of the equation. For the first collumn, the solution therefore becomes:

Path(i, j) = Path(i-1, j) + Matrix(i, j)

And for the first row it is:

Path(i, j) = Path(i, j-1) + Matrix(i, j)

In simple terms, the solution of the first collumn elements is their value plus the solution of the above element. For the first row, the solution of an element is its value plus the solution of the left element. The base case in both is the first element of the array which has a solution of matrix[0][0]. In code:

for i in range(1, n):
    solutions[i][0] = solutions[i-1][0] + matrix[i][0];

for j in range(1, n):
    solutions[0][j] = solutions[0][j-1] + matrix[0][j];

Now we can work on the main body of the algorithm, the computation of the solution. We will use a recursive function for this. We will call this function CalculatePath. Keep in mind that the recursion will not reach the border cases (first row and first collumn) so we do not need to check range. The index for the row is i and the collumn is j. For example, i=3, j=5 means the 5th element on the 3rd row.

Let (i,j) be the current indexes.

The main idea behind Dynamic Programming is the use of previously computed solutions in the computation of the current solution. We will use this technique here too. In this implementation, previous solutions are stored in the array solutions. We have initialized this array at 0, so every value that is larger than that signifies a solution has been computed. So, every time we reach a solution that is larger than 0, we will return it as is. In code:

if(solutions[i][j] > 0):
    #We have already calculated solution for i,j; return it.
    return solutions[i][j];

If we have not computed a solution we can use, we need to compute it. We do it by following the general formula we came up with earlier. We need to find the optimal solution for the element on top and for the element on the left of the current element.

#Optimal solution for i-1,j (top)
topCost = CalculatePath(i-1, j) + matrix[i][j];

#Optimal solution for i,j-1 (left)
leftCost = CalculatePath(i, j-1) + matrix[i][j];

We pick the minimum of the two and we add to it the value of the current element, matrix[i][j]. That is the optimal solution for (i,j). We store it at solutions.

if(topCost < leftCost):
    solutions[i][j] = topCost;
    return topCost;
else:
    solutions[i][j] = leftCost;
    return leftCost;

Adding them all together, our recursive function is this:

def CalculatePath(i, j):
    if(s[i][j] > 0):
        #We have already calculated solution for i,j; return it.
        return s[i][j];

    #Optimal solution for i-1,j (top)
    topCost = CalculatePath(i-1, j) + matrix[i][j];

    #Optimal solution for i,j-1 (left)
    leftCost = CalculatePath(i, j-1) + matrix[i][j];

    #Store and return the optimal (minimum) solution
    if(topCost < leftCost):
        s[i][j] = topCost;
        return topCost;
    else:
        s[i][j] = leftCost;
        return leftCost;

To print the solution, we write:

print CalculatePath(n-1, n-1);

For the code you can go to my Github repository (I read the matrix from a text file).