Lazy Larry's Random Algorithm

Probability Level 4

Lazy Larry was assigned to create a bubble sort algorithm, but he didn't have the time to do it properly. He designed the following algorithm instead:

For each iteration of the algorithm:

$\vphantom{1}$

Choose two distinct elements in the list uniformly at random.

Exchange the placements of those elements in the list.

Check to see if the list is now sorted. If it is, end the algorithm. Otherwise, do another iteration.

Lazy Larry tests his algorithm on the following list:

capybara, dingo, aardvark, bandicoot

What is the expected value of the number of iterations of Larry's algorithm to alphabetize the list (A to Z)?

The answer is 27.

2 solutions

Andy Hayes
Jul 19, 2017

There are $4!=24$ permutations of the list, and each is possible with Larry's algorithm. This is a lot to analyze, but fortunately, we can organize these permutations into a set of "states":

State P: Exactly two of the elements in the list are out of order. Example: bandicoot, aardvark, capybara, dingo
State Q: All four of the elements in the list are out of order, but they can be placed in order by switching two pairs. This is the state the list is in before Larry runs the algorithm: capybara, dingo, aardvark, bandicoot
State R: Exactly three of the elements in the list are out of order. Example: capybara, aardvark, bandicoot, dingo
State S: Exactly four of the elements are out of order, and the order cannot be restored by switching two pairs. Example: bandicoot, capybara, dingo, aardvark
State T: All four of the elements are in order: aardvark, bandicoot, capybara, dingo

Now we can analyze how we can move between states with each iteration of the algorithm. There are 6 possible switches with each iteration. Suppose we are in state P, like the example given above: bandicoot, aardvark, capybara, dingo

There is 1 switch that will bring us to state Q:

bandicoot, aardvark, capybara, dingo $\Rightarrow$ bandicoot, aardvark, dingo, capybara

There are 4 switches that will bring us to state R:

bandicoot, aardvark, capybara, dingo $\Rightarrow$ capybara, aardvark, bandicoot, dingo

bandicoot, aardvark, capybara, dingo $\Rightarrow$ dingo, aardvark, capybara, bandicoot

bandicoot, aardvark, capybara, dingo $\Rightarrow$ bandicoot, capybara, aardvark, dingo

bandicoot, aardvark, capybara, dingo $\Rightarrow$ bandicoot, dingo, capybara, aardvark

There is 1 switch that will bring us to state T:

bandicoot, aardvark, capybara, dingo $\Rightarrow$ aardvark, bandicoot, capybara, dingo

This gives us probabilities of $\frac{1}{6},$ $\frac{2}{3},$ and $\frac{1}{6}$ to go from state P to states Q, R, and T, respectively. The other states can be likewise analyzed. Below is a graph showing their relationships:

Let $X$ be the number of iterations until we reach state T. Then we have the expected values:

$\begin{aligned} p = \text{E}[X \mid P] \\ q = \text{E}[X \mid Q] \\ r = \text{E}[X \mid R] \\ s = \text{E}[X \mid S] \\ \end{aligned}$

We can write a system of equations using linearity of expectation .

$\begin{aligned} p &= 1+\frac{1}{6}q+\frac{2}{3}r \\ q &= 1+\frac{1}{3}p+\frac{2}{3}s \\ r &= 1+\frac{1}{2}p+\frac{1}{2}s \\ s &= 1+\frac{1}{3}q+\frac{2}{3}r \\ \end{aligned}$

Solving this system gives $q=27,$ and so the expected value of the number of iterations from the initial list is $\boxed{27}.$

How can we generalize this result?

Agnishom Chattopadhyay - 3 years, 10 months ago

Well the states of the Markov chain are the cycle types of $S_4$ (identity, $2$ -cycles, $3$ -cycles, $4$ -cycles, $2\times2$ -cycles). We determine the transition probabilities by considering what happens to a particular cycle type when composed with a transposition. There are $6$ possible transpositions, and we need to check all the options. For example, if we start with a $2$ -cycle:

there is one $2$ -cycle which will multiply it to yield the identity,
there is one $2$ -cycle which will multiply it to yield a $2\times2$ -cycle
all the other four $2$ -cycles yield a $3$ -cycle

These ideas give you the transition probabilities from Andy's State P.

We could play the same game with any other permutation group $S_n$ , but the options would get increasingly complicated. There is some ease to be gained from the fact that an odd permutation state can only transition to an even permutation state, and conversely.

Mark Hennings - 3 years, 10 months ago

Not understand

Waqar Ahmed - 3 years, 10 months ago

good problem

Jim Lawrence - 3 years, 10 months ago

Not understand

Waqar Ahmed - 3 years, 10 months ago

Heh. Nice "sorting" algorithm. A slightly easier version of the problem: instead of taking a random transposition from the group of permutations on four objects, just take a random member of the group, i.e. randomly shuffle the order of the list.
It might seem like this is a slower approach, but it turns out to be just a tad bit faster in expected stopping time. (And computing the expected stopping time is much simpler.)

Richard Desper - 3 years, 10 months ago

A lot easier. You want the expectation of the geometric distribution $\mathrm{Geo}(\tfrac{1}{24})$ , which is $24$ . There is no need to introduce Markov chains.

Mark Hennings - 3 years, 10 months ago

Exactly correct. No matter what the sequence is at any point there is exactly one permutation that will immediately sort it.

Richard Desper - 3 years, 10 months ago

%MATLAB CODE n = 1e5; operations = zeros(1,n); for i = 1:n disp(i) list = [3 4 1 2]; while any(diff(list) < 0) order = randperm(4); tmp = list(order(1)); list(order(1)) = list(order(2)); list(order(2)) = tmp; operations(i) = operations(i) + 1; end end mean(operations) == ~ 27

So the answer is probably 27

Greg Norgard - 3 years, 10 months ago

Larry's algorithm

Aomths Crokzowskmi - 3 years, 10 months ago

Piotr Idzik
Mar 24, 2020

Below you will find my code doing a brute-force Monte-Carlo simulation. The code uses multi-threading.

//Monte-Carlo simulation for https://brilliant.org/problems/lazy-larrys-random-algorithm/

#include <iostream>
#include <string>
#include <random>
#include <vector>
#include <chrono>
#include <thread>
#include <ctime>
#include <algorithm>

template <typename DataType, typename RandomEngineType>
std::uintmax_t randomSort(const DataType& inData, RandomEngineType& randomEngine)
{
    DataType curData = inData;
    std::uintmax_t curStep = 0;
    const std::size_t dataLen = inData.size();
    std::uniform_int_distribution<> posGenerator(0, dataLen-1);
    while(!std::is_sorted(curData.begin(), curData.end()))
    {
        const std::size_t posA = posGenerator(randomEngine);
        std::size_t posB = posGenerator(randomEngine);
        while(posA == posB)
        {
            posB = posGenerator(randomEngine);
        }
        const auto tmpVal = curData[posA];
        curData[posA] = curData[posB];
        curData[posB] = tmpVal;
        ++curStep;
    }
    return curStep;
}

template <typename DataType, typename RandomEngineType>
std::uintmax_t randomSortMcSimulation(const DataType& testData,
                                      const std::uintmax_t localMaxIter,
                                      const int inSeed)
{
    RandomEngineType randomEngine(inSeed);
    std::uintmax_t totalSum = 0;
    for(std::uintmax_t curIter = 0; curIter < localMaxIter; ++curIter)
    {
        totalSum += randomSort(testData, randomEngine);
    }
    return totalSum;
}

template<typename DataType, typename RandomEngineType>
class Worker
{
    private:
        DataType testData;
        std::uintmax_t localIter;
        int seed;
        std::uintmax_t totalSum;
    public:
        explicit Worker(const DataType& _testData,
                        const std::uintmax_t _localIter,
                        const int _seed) :
        testData(_testData),
        localIter(_localIter),
        seed(_seed),
        totalSum()
        {}
        void operator()()
        {
            this->totalSum = randomSortMcSimulation<DataType, RandomEngineType>(this->testData,
                                                                                this->localIter,
                                                                                this->seed);
        }
        std::uintmax_t GetTotalSum() const
        {
            return this->totalSum;
        }
};

template<typename UnsignedIntType>
UnsignedIntType gcd(const UnsignedIntType _a, UnsignedIntType _b)
{
    UnsignedIntType a, b;
    if(_a > _b)
    {
        a = _a;
        b = _b;
    }
    else
    {
        a = _b;
        b = _a;
    }
    while(b > 0)
    {
        const UnsignedIntType div = a%b;
        a = b;
        b = div;
    }
    return a;
}

template <typename DataType, typename RandomEngineType>
long double randomSortMcSimulation(const DataType& testData,
                                   const std::uintmax_t totalIter,
                                   const unsigned numberOfThreads)
{
    if(totalIter%std::uintmax_t(numberOfThreads) != 0)
    {
        throw std::invalid_argument("totalIter must be a multiple of numberOfThreads.");
    }
    const std::uintmax_t localIter = totalIter/std::uintmax_t(numberOfThreads);
    std::vector<std::thread> threadVector;
    threadVector.reserve(numberOfThreads);
    std::vector<Worker<DataType, RandomEngineType> > workerVector;
    workerVector.reserve(numberOfThreads);
    for(unsigned i = 0; i < numberOfThreads; ++i)
    {
        workerVector.push_back(Worker<DataType, RandomEngineType>(testData, localIter, i+time(NULL)));
        threadVector.push_back(std::thread(std::ref(workerVector[i])));
    }
    for(auto & th : threadVector)
    {
        th.join();
    }

    auto sumFun = [](const std::uintmax_t tmpSum, const Worker<DataType, RandomEngineType>& inWorker)->std::uintmax_t
    {
        return tmpSum+inWorker.GetTotalSum();
    };
    const std::uintmax_t totalSum = std::accumulate(workerVector.begin(), workerVector.end(), 0ULL, sumFun);
    const std::uintmax_t div = gcd(totalSum, totalIter);
    return static_cast<long double>(totalSum/div)/static_cast<long double>(totalIter/div);
}

int main()
{
    const unsigned numberOfThreads = std::thread::hardware_concurrency();
    const std::uintmax_t totalIter = std::uintmax_t(numberOfThreads)*5000000ULL;
    std::vector<std::string> testData{"capybara", "dingo", "aardvark", "bandicoot"};

    auto startTime = std::chrono::steady_clock::now();
    const auto res = randomSortMcSimulation<std::vector<std::string>, std::mt19937_64>(testData, totalIter, numberOfThreads);
    auto endTime = std::chrono::steady_clock::now();
    std::chrono::duration<double> runTime = endTime-startTime;
    std::cout<<"Runtime: "<<runTime.count()<<" [s]"<<std::endl;
    std::cout<<"Expected number of steps: "<<res<<std::endl;
    return 0;
}

Lazy Larry's Random Algorithm

The answer is 27.

2 solutions

0 pending reports