开发者

Shuffle list, ensuring that no item remains in same position

开发者 https://www.devze.com 2023-04-01 18:30 出处:网络
I want to shuffle a list of unique items, but not do an entirely random shuffle. I need to be sure that no element in the shuffled list is at the same position as in the original list. Thus, if the or

I want to shuffle a list of unique items, but not do an entirely random shuffle. I need to be sure that no element in the shuffled list is at the same position as in the original list. Thus, if the original list is (A, B, C, D, E), this result would be OK: (C, D, B, E, A), but this one would not: (C, E, A, D, B) because "D" is still the fourth item. The list will have at most seven items. Extreme efficiency is not a consideration. I think开发者_运维知识库 this modification to Fisher/Yates does the trick, but I can't prove it mathematically:

function shuffle(data) {
    for (var i = 0; i < data.length - 1; i++) {
        var j = i + 1 + Math.floor(Math.random() * (data.length - i - 1));

        var temp = data[j];
        data[j] = data[i];
        data[i] = temp;
    }
}


You are looking for a derangement of your entries.

First of all, your algorithm works in the sense that it outputs a random derangement, ie a permutation with no fixed point. However it has a enormous flaw (which you might not mind, but is worth keeping in mind): some derangements cannot be obtained with your algorithm. In other words, it gives probability zero to some possible derangements, so the resulting distribution is definitely not uniformly random.

One possible solution, as suggested in the comments, would be to use a rejection algorithm:

  • pick a permutation uniformly at random
  • if it hax no fixed points, return it
  • otherwise retry

Asymptotically, the probability of obtaining a derangement is close to 1/e = 0.3679 (as seen in the wikipedia article). Which means that to obtain a derangement you will need to generate an average of e = 2.718 permutations, which is quite costly.

A better way to do that would be to reject at each step of the algorithm. In pseudocode, something like this (assuming the original array contains i at position i, ie a[i]==i):

for (i = 1 to n-1) {
   do {
      j = rand(i, n)   // random integer from i to n inclusive
   } while a[j] != i   // rejection part
   swap a[i] a[j]
}

The main difference from your algorithm is that we allow j to be equal to i, but only if it does not produce a fixed point. It is slightly longer to execute (due to the rejection part), and demands that you be able to check if an entry is at its original place or not, but it has the advantage that it can produce every possible derangement (uniformly, for that matter).

I am guessing non-rejection algorithms should exist, but I would believe them to be less straight-forward.

Edit:

My algorithm is actually bad: you still have a chance of ending with the last point unshuffled, and the distribution is not random at all, see the marginal distributions of a simulation:

Shuffle list, ensuring that no item remains in same position

An algorithm that produces uniformly distributed derangements can be found here, with some context on the problem, thorough explanations and analysis.

Second Edit:

Actually your algorithm is known as Sattolo's algorithm, and is known to produce all cycles with equal probability. So any derangement which is not a cycle but a product of several disjoint cycles cannot be obtained with the algorithm. For example, with four elements, the permutation that exchanges 1 and 2, and 3 and 4 is a derangement but not a cycle.

If you don't mind obtaining only cycles, then Sattolo's algorithm is the way to go, it's actually much faster than any uniform derangement algorithm, since no rejection is needed.


As @FelixCQ has mentioned, the shuffles you are looking for are called derangements. Constructing uniformly randomly distributed derangements is not a trivial problem, but some results are known in the literature. The most obvious way to construct derangements is by the rejection method: you generate uniformly randomly distributed permutations using an algorithm like Fisher-Yates and then reject permutations with fixed points. The average running time of that procedure is e*n + o(n) where e is Euler's constant 2.71828... That would probably work in your case.

The other major approach for generating derangements is to use a recursive algorithm. However, unlike Fisher-Yates, we have two branches to the algorithm: the last item in the list can be swapped with another item (i.e., part of a two-cycle), or can be part of a larger cycle. So at each step, the recursive algorithm has to branch in order to generate all possible derangements. Furthermore, the decision of whether to take one branch or the other has to be made with the correct probabilities.

Let D(n) be the number of derangements of n items. At each stage, the number of branches taking the last item to two-cycles is (n-1)D(n-2), and the number of branches taking the last item to larger cycles is (n-1)D(n-1). This gives us a recursive way of calculating the number of derangements, namely D(n)=(n-1)(D(n-2)+D(n-1)), and gives us the probability of branching to a two-cycle at any stage, namely (n-1)D(n-2)/D(n-1).

Now we can construct derangements by deciding to which type of cycle the last element belongs, swapping the last element to one of the n-1 other positions, and repeating. It can be complicated to keep track of all the branching, however, so in 2008 some researchers developed a streamlined algorithm using those ideas. You can see a walkthrough at http://www.cs.upc.edu/~conrado/research/talks/analco08.pdf . The running time of the algorithm is proportional to 2n + O(log^2 n), a 36% improvement in speed over the rejection method.

I have implemented their algorithm in Java. Using longs works for n up to 22 or so. Using BigIntegers extends the algorithm to n=170 or so. Using BigIntegers and BigDecimals extends the algorithm to n=40000 or so (the limit depends on memory usage in the rest of the program).


    package io.github.edoolittle.combinatorics;

    import java.math.BigInteger;
    import java.math.BigDecimal;
    import java.math.MathContext;
    import java.util.Random;
    import java.util.HashMap;
    import java.util.TreeMap;

    public final class Derangements {

      // cache calculated values to speed up recursive algorithm
      private static HashMap<Integer,BigInteger> numberOfDerangementsMap 
        = new HashMap<Integer,BigInteger>();
      private static int greatestNCached = -1;

      // load numberOfDerangementsMap with initial values D(0)=1 and D(1)=0
      static {
        numberOfDerangementsMap.put(0,BigInteger.valueOf(1));
        numberOfDerangementsMap.put(1,BigInteger.valueOf(0));
        greatestNCached = 1;
      }

      private static Random rand = new Random();

      // private default constructor so class isn't accidentally instantiated
      private Derangements() { }

      public static BigInteger numberOfDerangements(int n)
        throws IllegalArgumentException {
        if (numberOfDerangementsMap.containsKey(n)) {
          return numberOfDerangementsMap.get(n);
        } else if (n>=2) {
          // pre-load the cache to avoid stack overflow (occurs near n=5000)
          for (int i=greatestNCached+1; i<n; i++) numberOfDerangements(i);
          greatestNCached = n-1;
          // recursion for derangements: D(n) = (n-1)*(D(n-1) + D(n-2))
          BigInteger Dn_1 = numberOfDerangements(n-1);
          BigInteger Dn_2 = numberOfDerangements(n-2);
          BigInteger Dn = (Dn_1.add(Dn_2)).multiply(BigInteger.valueOf(n-1));
          numberOfDerangementsMap.put(n,Dn);
          greatestNCached = n;
          return Dn;
        } else {
          throw new IllegalArgumentException("argument must be >= 0 but was " + n);
        }
      }

      public static int[] randomDerangement(int n)
        throws IllegalArgumentException {

        if (n<2)
          throw new IllegalArgumentException("argument must be >= 2 but was " + n);

        int[] result = new int[n];
        boolean[] mark = new boolean[n];

        for (int i=0; i<n; i++) {
          result[i] = i;
          mark[i] = false;
        }
        int unmarked = n;

        for (int i=n-1; i>=0; i--) {
          if (unmarked<2) break; // can't move anything else
          if (mark[i]) continue; // can't move item at i if marked

          // use the rejection method to generate random unmarked index j < i;
          // this could be replaced by more straightforward technique
          int j;
          while (mark[j=rand.nextInt(i)]);

          // swap two elements of the array
          int temp = result[i];
          result[i] = result[j];
          result[j] = temp;

          // mark position j as end of cycle with probability (u-1)D(u-2)/D(u)
          double probability 
        = (new BigDecimal(numberOfDerangements(unmarked-2))).
        multiply(new BigDecimal(unmarked-1)).
        divide(new BigDecimal(numberOfDerangements(unmarked)),
               MathContext.DECIMAL64).doubleValue();
          if (rand.nextDouble() < probability) {
        mark[j] = true;
        unmarked--;
          }

          // position i now becomes out of play so we could mark it
          //mark[i] = true;
          // but we don't need to because loop won't touch it from now on
          // however we do have to decrement unmarked
          unmarked--;
        }

        return result;
      }

      // unit tests
      public static void main(String[] args) {
        // test derangement numbers D(i)
        for (int i=0; i<100; i++) {
          System.out.println("D(" + i + ") = " + numberOfDerangements(i));
        }
        System.out.println();

        // test quantity (u-1)D_(u-2)/D_u for overflow, inaccuracy
        for (int u=2; u<100; u++) {
          double d = numberOfDerangements(u-2).doubleValue() * (u-1) /
        numberOfDerangements(u).doubleValue();
          System.out.println((u-1) + " * D(" + (u-2) + ") / D(" + u + ") = " + d);
        }

        System.out.println();

        // test derangements for correctness, uniform distribution
        int size = 5;
        long reps = 10000000;
        TreeMap<String,Integer> countMap = new TreeMap<String,Integer>();
        System.out.println("Derangement\tCount");
        System.out.println("-----------\t-----");
        for (long rep = 0; rep < reps; rep++) {
          int[] d = randomDerangement(size);
          String s = "";
          String sep = "";
          if (size > 10) sep = " ";
          for (int i=0; i<d.length; i++) {
        s += d[i] + sep;
          }

          if (countMap.containsKey(s)) {
        countMap.put(s,countMap.get(s)+1);
          } else {
        countMap.put(s,1);
          }
        }

        for (String key : countMap.keySet()) {
          System.out.println(key + "\t\t" + countMap.get(key));
        }

        System.out.println();

        // large random derangement
        int size1 = 1000;
        System.out.println("Random derangement of " + size1 + " elements:");
        int[] d1 = randomDerangement(size1);
        for (int i=0; i<d1.length; i++) {
          System.out.print(d1[i] + " ");
        }

        System.out.println();
        System.out.println();

        System.out.println("We start to run into memory issues around u=40000:");
        {
          // increase this number from 40000 to around 50000 to trigger
          // out of memory-type exceptions
          int u = 40003;
          BigDecimal d = (new BigDecimal(numberOfDerangements(u-2))).
        multiply(new BigDecimal(u-1)).
        divide(new BigDecimal(numberOfDerangements(u)),MathContext.DECIMAL64);
          System.out.println((u-1) + " * D(" + (u-2) + ") / D(" + u + ") = " + d);
        }

      }

    }


In C++:

template <class T> void shuffle(std::vector<T>&arr)
{
    int size = arr.size();

    for (auto i = 1; i < size; i++)
    {
        int n = rand() % (size - i) + i;
        std::swap(arr[i-1], arr[n]);
    }
}
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号