This is part of my interview question series. It's about shuffling arrays.
The Question
You have an array A of size N. Write a routine that shuffles the array in-place. The only restrictions are that all possible permutations of A must be possible and equally likely.
This interview question serves as a test for basic algorithm construction. There's a canonical solution that's not too difficult to arrive at if you've never seen it before, so it's a good combination of "what do you know?" and "what can you do?"
Workin' it out
I'm going to create my solution in Ruby because that's the language the company that asked me this question used.
The first solution most people arrive at is subtly wrong. Jeff Atwood made the mistake in his blog post. The algorithm, in words, goes like this: iterate through each item in the array, pick another element at random, and swap the two.
In Ruby the above algorithm would look like this.
def shuffle_naive!
n = size
until n == 0
k = rand(size) #This is the line which proves our undoing
n = n -1
self[n], self[k] = self[k], self[n]
end
end
end
This solution seems correct if not optimal, but there's a subtle problem: not all outcomes are equally likely.
The root cause of this is because this algorithm is drawing from a sample space of size NN, while the sample space of all permutations on an N-element array is only N!.
That is, for the naive shuffle, for each of the N steps in the iteration we make one of N decisions for a total of NN possible outcomes.
But NN > N! for all N > 1 and, more importantly, N! is not a divisor of NN. This means we're going to prefer at least one of the permutations more than the others, so the algorithm doesn't select among the possible permutations uniformly.
KFC, KFY
The "best" solution is the Knuth-Fischer-Yates shuffle. Here it is in Ruby
def shuffle!
n = size
until n == 0
k = rand(n) #You can see I'm doing rand(n) rather than rand(size)
n = n - 1
self[n], self[k] = self[k], self[n]
end
self
end
end
This works because it's an iterative version of an essentially recursive algorithm. If we know how to shuffle an array of size N-1 then shuffling an array of size N is easy — first shuffle the sub-array consisting of the first N-1 elements and then randomly swap in the last element to any of the N slots.
There's a proper inductive proof in there if you're so inclined, but it's not particularly illuminating.
Good Questions, Bad Questions
My next article is going to be more about the interview process rather than specific questions. One key thing to understand in an interview is what information the interviewer is looking for in asking their question. Hint: it's not always the answer.
Among other things they want to suss out the limits of your knowledge, how you solve problems, how quickly you resort to help, and a whole assortment of other, behavioral things, that they get because you're right there (hopefully) engaging in a dialogue.
I enjoy reading the posts on this blog. The posts related to programming are entertaining and help exercise my brain.
>> My next article is going to be more about the interview process rather than specific questions
Regarding the interview process, I’ve found this post very helpful.
http://www.joelonsoftware.com/articles/GuerrillaInterviewing3.html
Joel Spolsky has also written a book (”Smart and Gets Things Done”) which elaborates on the same ideas in the blog post.
For yet another Ruby implementation of Array#shuffle see Random numbers in Ruby, http://snippets.dzone.com/posts/show/4697
Nice post!
Note: - I could be wrong here but I think you’re code example#1 has a typo, you define variable r and don’t use it, and use variable k and don’t define it
and shouldn’t it be n= self.size in example#1 as well as example#2?
cc,
You’re right about the first point, but not the second. Both self.size and size work because we’re inside the Array instance.
Excellent note about the algorithm.
However, since pragmatism trumps all, the “right” way to shuffle an Array in Ruby is this:
class Array
def shuffle!
replace(array.sort_by { rand })
end
end
As a bonus, it is a language/core lib bug and not your fault if there is a problem with the distribution ;)
rue,
That’s not really true, though. In fact, that “solution” violates multiple requirements of the problems.
One, it’s not in-place. Two, whether or not the distribution is uniform depends on the sorting algorithm that sort_by uses.
rue, see here: http://eigenclass.org/hiki.rb?sort_by+rand+is+biased
What if the source and destination are picked using a random number generator N times? Given that the generator is unifrom, aren’t all permutations possible?
Hi,
Great effort after reading this I really good very nice point how to shuffle array keep it up
and post more article also check Pre Developer
Thanks