## Interview Questions: Shuffling an Array

by Jesse Farmer on Tuesday, April 15, 2008

This is part of my interview question series. It's about shuffling arrays.

### The Question

You have an array A of size N. Write a routine that shuffles the array in-place. The only restrictions are that all possible permutations of A must be possible and equally likely.

This interview question serves as a test for basic algorithm construction. There's a canonical solution that's not too difficult to arrive at if you've never seen it before, so it's a good combination of "what do you know?" and "what can you do?"

### Workin' it out

I'm going to create my solution in Ruby because that's the language the company that asked me this question used.

The first solution most people arrive at is subtly wrong. Jeff Atwood made the mistake in his blog post. The algorithm, in words, goes like this: iterate through each item in the array, pick another element at random, and swap the two.

In Ruby the above algorithm would look like this.

```class Array
def shuffle_naive!
n = size
until n == 0
k = rand(size) #This is the line which proves our undoing
n = n -1
self[n], self[k] = self[k], self[n]
end
end
end```

This solution seems correct if not optimal, but there's a subtle problem: not all outcomes are equally likely.

The root cause of this is because this algorithm is drawing from a sample space of size NN, while the sample space of all permutations on an N-element array is only N!.

That is, for the naive shuffle, for each of the N steps in the iteration we make one of N decisions for a total of NN possible outcomes.

But NN > N! for all N > 1 and, more importantly, N! is not a divisor of NN. This means we're going to prefer at least one of the permutations more than the others, so the algorithm doesn't select among the possible permutations uniformly.

### KFC, KFY

The "best" solution is the Knuth-Fischer-Yates shuffle. Here it is in Ruby

```class Array
def shuffle!
n = size
until n == 0
k = rand(n) #You can see I'm doing rand(n) rather than rand(size)
n = n - 1
self[n], self[k] = self[k], self[n]
end
self
end
end```

This works because it's an iterative version of an essentially recursive algorithm. If we know how to shuffle an array of size N-1 then shuffling an array of size N is easy — first shuffle the sub-array consisting of the first N-1 elements and then randomly swap in the last element to any of the N slots.

There's a proper inductive proof in there if you're so inclined, but it's not particularly illuminating.