Solution to a problem posted here:

http://stackoverflow.com/questions/36455104/create-a-random-order-of-x-y-pairs-without-repeating-subsequent-xs#

Say I have a list of valid X = [1, 2, 3, 4, 5] and a list of valid Y = [1, 2, 3, 4, 5].

I need to generate all combinations of every element in X and every element in Y (in this case, 25) and get those combinations in random order.

This in itself would be simple, but there is an additional requirement: In this random order, there cannot be a repetition of the same x in succession. For example, this is okay:

[1, 3]
[2, 5]
[1, 2]
...
[1, 4]

This is not:

[1, 3]
[1, 2]  <== the "1" cannot repeat, because there was already one before
[2, 5]
...
[1, 4]

Now, the least efficient idea would be to simply randomize the full set as long as there are no more repetitions. My approach was a bit different, repeatedly creating a shuffled variant of X, and a list of all Y * X, then picking a random next one from that. So far, I've come up with this:

[...]

But I'm sure this can be done even more efficiently or in a more succinct way?

Also, my solution first goes through all X values before continuing with the full set again, which is not perfectly random. I can live with that for my particular application case.

My solution uses NumPy



In [1]:

    
import numpy as np

Here are some example values for x and y. I assume that there are no repeated values in x.



In [2]:

    
xs = np.arange(10, 14)
ys = np.arange(20, 25)
print(xs, ys)









    



[10 11 12 13] [20 21 22 23 24]

indices is the list of indices I'll choose from at random:



In [3]:

    
n = len(xs)
m = len(ys)
indices = np.arange(n)

Now I'll make an array to hold the values of y:



In [4]:

    
array = np.tile(ys, (n, 1))
print(array)









    



[[20 21 22 23 24]
 [20 21 22 23 24]
 [20 21 22 23 24]
 [20 21 22 23 24]]

And shuffle the rows independently



In [5]:

    
[np.random.shuffle(array[i]) for i in range(n)] 
print(array)









    



[[21 23 22 24 20]
 [22 23 20 21 24]
 [23 20 21 22 24]
 [20 24 21 22 23]]

I'll keep track of how many unused ys there are in each row



In [6]:

    
counts = np.full_like(xs, m)
print(counts)

Now I'll choose a row, using the counts as weights



In [7]:

    
weights = np.array(counts, dtype=float)
weights /= np.sum(weights)
print(weights)









    



[ 0.25  0.25  0.25  0.25]

i is the row I chose, which corresponds to a value of x.



In [8]:

    
i = np.random.choice(indices, p=weights)
print(i)

Now I decrement the counter associated with i, assemble a pair by choosing a value of x and a value of y.

I also clobber the array value I used, which is not necessary, but helps with visualization.



In [9]:

    
counts[i] -= 1
pair = xs[i], array[i, counts[i]]
array[i, counts[i]] = -1
print(pair)

We can check that the counts got decremented



In [10]:

    
print(counts)

And one of the values in array got used



In [11]:

    
print(array)









    



[[21 23 22 24 20]
 [22 23 20 21 24]
 [23 20 21 22 -1]
 [20 24 21 22 23]]

The next time through is almost the same, except that when we assemble the weights, we give zero weight to the index we just used.



In [12]:

    
weights = np.array(counts, dtype=float)
weights[i] = 0
weights /= np.sum(weights)
print(weights)









    



[ 0.33333333  0.33333333  0.          0.33333333]

Everything else is the same



In [13]:

    
i = np.random.choice(indices, p=weights)
counts[i] -= 1
pair = xs[i], array[i, counts[i]]
array[i, counts[i]] = -1
print(pair)



In [14]:

    
print(counts)



In [15]:

    
print(array)









    



[[21 23 22 24 20]
 [22 23 20 21 -1]
 [23 20 21 22 -1]
 [20 24 21 22 23]]

Now we can wrap all that up in a function, using a special value for i during the first iteration.



In [16]:

    
def generate_pairs(xs, ys):
    n = len(xs)
    m = len(ys)
    indices = np.arange(n)
    
    array = np.tile(ys, (n, 1))
    [np.random.shuffle(array[i]) for i in range(n)]
    
    counts = np.full_like(xs, m)
    i = -1

    for _ in range(n * m):
        weights = np.array(counts, dtype=float)
        if i != -1:
            weights[i] = 0
        weights /= np.sum(weights)

        i = np.random.choice(indices, p=weights)
        counts[i] -= 1
        pair = xs[i], array[i, counts[i]]
        array[i, counts[i]] = -1
        
        yield pair

And here's how it works:



In [17]:

    
for pairs in generate_pairs(xs, ys):
    print(pairs)









    



(13, 24)
(12, 24)
(11, 21)
(12, 22)
(10, 22)
(11, 24)
(10, 21)
(13, 20)
(10, 23)
(13, 21)
(12, 20)
(10, 24)
(11, 22)
(10, 20)
(13, 23)
(12, 23)
(13, 22)
(11, 20)
(12, 21)
(11, 23)

Inside the loop, we have to copy the weights, add them up, and choose a random index using the weights. These are all linear in n. So the overall complexity to generate all pairs is O(n^2 m)



In [ ]: