Alias抽样算法

Alias 算法抽样

简述

对于一个只有两种情况的情景抽样,很简单,例如A的概率是0.4,B的概率是0.6,那么只需要随机数小于0.4就是A,否则是B。

如果扩展多个情况,例如A,B,C的概率分别是0.2,0.3,0.5,一般情况O(1)的方法是生成一个数组[A,A,B,B,B,C,C,C,C,C,C],对数组进行随机抽取,这样比较耗费内存。

Alias也可以在O(1)的方法下取样,并且内存需要更小。
 
还是以上面的A,B,C为例子

  • 把A,B,C的概率 X 3 得到[0.6, 0.9, 1.5]
  • 把它们拆分拼成3组,例如,第一组 0.6的A+0.4的B,第二组0.5的B加0.5的C,第三组1.0的C。
  • 随机抽取三组中的一组后,就类似于只有两种情况,再随机抽取就能取到。

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
import random


def alias_setup(probs):
K = len(probs)
q = [0] * K
J = [0] * K

# Sort the data into the outcomes with probabilities
# that are larger and smaller than 1/K.
smaller = []
larger = []
for kk, prob in enumerate(probs):
q[kk] = K * prob
if q[kk] < 1.0:
smaller.append(kk)
else:
larger.append(kk)

# Loop though and create little binary mixtures that
# appropriately allocate the larger outcomes over the
# overall uniform mixture.
while len(smaller) > 0 and len(larger) > 0:
small = smaller.pop()
large = larger.pop()

J[small] = large
q[large] = q[large] - (1.0 - q[small])

if q[large] < 1.0:
smaller.append(large)
else:
larger.append(large)

return J, q


def alias_draw(J, q):
K = len(J)

# Draw from the overall uniform mixture.
kk = int(random.random() * K)

# Draw from the binary mixture, either keeping the
# small one, or choosing the associated larger one.
if random.random() < q[kk]:
return kk
else:
return J[kk]


K = 5
N = 1000

# Get a random probability vector.
probs = [random.random() for i in range(K)]

# Normalize the probability vector
probs = [i / sum(probs) for i in probs]

# Construct the table.
J, q = alias_setup(probs)

# Generate variates.
X = [0] * N
for nn in range(N):
X[nn] = alias_draw(J, q)
print(probs)
print(X)

参考

https://lips.cs.princeton.edu/the-alias-method-efficient-sampling-with-many-discrete-outcomes/