This is part of my interview question series. It's about shuffling arrays.

The Question

You have an array A of size N. Write a routine that shuffles the array in-place. The only restrictions are that all possible permutations of A must be possible and equally likely.

This interview question serves as a test for basic algorithm construction. There's a canonical solution that's not too difficult to arrive at if you've never seen it before, so it's a good combination of "what do you know?" and "what can you do?"

Workin' it out

I'm going to create my solution in Ruby because that's the language the company that asked me this question used.

The first solution most people arrive at is subtly wrong. Jeff Atwood made the mistake in his blog post. The algorithm, in words, goes like this: iterate through each item in the array, pick another element at random, and swap the two.

In Ruby the above algorithm would look like this.

class Array
  def shuffle_naive!
    n = size
    until n == 0
      k = rand(size) #This is the line which proves our undoing
      n = n -1
      self[n], self[k] = self[k], self[n]
    end
  end
end

This solution seems correct if not optimal, but there's a subtle problem: not all outcomes are equally likely.

The root cause of this is because this algorithm is drawing from a sample space of size NN, while the sample space of all permutations on an N-element array is only N!.

That is, for the naive shuffle, for each of the N steps in the iteration we make one of N decisions for a total of NN possible outcomes.

But NN > N! for all N > 1 and, more importantly, N! is not a divisor of NN. This means we're going to prefer at least one of the permutations more than the others, so the algorithm doesn't select among the possible permutations uniformly.

KFC, KFY

The "best" solution is the Knuth-Fischer-Yates shuffle. Here it is in Ruby

class Array
  def shuffle!
    n = size
    until n == 0
      k = rand(n) #You can see I'm doing rand(n) rather than rand(size)
      n = n - 1
      self[n], self[k] = self[k], self[n]
    end
    self
  end
end

This works because it's an iterative version of an essentially recursive algorithm. If we know how to shuffle an array of size N-1 then shuffling an array of size N is easy — first shuffle the sub-array consisting of the first N-1 elements and then randomly swap in the last element to any of the N slots.

There's a proper inductive proof in there if you're so inclined, but it's not particularly illuminating.

Good Questions, Bad Questions

My next article is going to be more about the interview process rather than specific questions. One key thing to understand in an interview is what information the interviewer is looking for in asking their question. Hint: it's not always the answer.

Among other things they want to suss out the limits of your knowledge, how you solve problems, how quickly you resort to help, and a whole assortment of other, behavioral things, that they get because you're right there (hopefully) engaging in a dialogue.

9 Comments

  1. Chris April 9th, 2008 / 12:50 pm

    I enjoy reading the posts on this blog. The posts related to programming are entertaining and help exercise my brain.

    >> My next article is going to be more about the interview process rather than specific questions

    Regarding the interview process, I’ve found this post very helpful.
    http://www.joelonsoftware.com/articles/GuerrillaInterviewing3.html
    Joel Spolsky has also written a book (”Smart and Gets Things Done”) which elaborates on the same ideas in the blog post.

  2. phil April 10th, 2008 / 3:38 am

    For yet another Ruby implementation of Array#shuffle see Random numbers in Ruby, http://snippets.dzone.com/posts/show/4697

    Nice post!

  3. cc April 10th, 2008 / 4:15 am

    Note: - I could be wrong here but I think you’re code example#1 has a typo, you define variable r and don’t use it, and use variable k and don’t define it

    and shouldn’t it be n= self.size in example#1 as well as example#2?

  4. Jesse April 10th, 2008 / 11:00 am

    cc,

    You’re right about the first point, but not the second. Both self.size and size work because we’re inside the Array instance.

  5. rue April 10th, 2008 / 3:06 pm

    Excellent note about the algorithm.

    However, since pragmatism trumps all, the “right” way to shuffle an Array in Ruby is this:

    class Array
    def shuffle!
    replace(array.sort_by { rand })
    end
    end

    As a bonus, it is a language/core lib bug and not your fault if there is a problem with the distribution ;)

  6. Jesse April 10th, 2008 / 7:39 pm

    rue,

    That’s not really true, though. In fact, that “solution” violates multiple requirements of the problems.

    One, it’s not in-place. Two, whether or not the distribution is uniform depends on the sorting algorithm that sort_by uses.

  7. phil April 11th, 2008 / 2:35 am
  8. Andy April 12th, 2008 / 8:14 am

    What if the source and destination are picked using a random number generator N times? Given that the generator is unifrom, aren’t all permutations possible?

  9. Hanna August 26th, 2008 / 1:08 pm

    Hi,
    Great effort after reading this I really good very nice point how to shuffle array keep it up
    and post more article also check Pre Developer

    Thanks

Leave a Reply