Python: how to randomly remove a given percentage of items from a list

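I have two lists of equal length: one is a data series and the other is just the time series. They represent simulated values measured over time.

I want to write a function that randomly removes a given percentage or fraction of the items from both lists. That is, if the fraction is 0.2, I want to randomly remove 20% of the items from both lists, and the same items (the same indices in each list) must be removed from both.

For example, let n = 0.2 (remove 20%):

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

After randomly removing 20%, they become

a_new = [0,1,3,4,5,6,8,9]
b_new = [0,1,9,16,25,36,64,81]

The relationship is not as simple as in this example, so I can't just do this on one list and then compute the second; they already exist as two lists. They must also stay in their original order.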


Thanks.

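If `a` and `b` are not very large, you can use `zip`. Alternatively, pick a random set of indices to drop and filter both lists with it, which keeps the original order. The index-based version: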

import random

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

frac = 0.2  # how much of a/b do you want to exclude

# generate a list of indices to exclude. Turn it into a set for O(1) lookup time
inds = set(random.sample(list(range(len(a))), int(frac*len(a))))

# use `enumerate` to get list indices as well as elements. 
# Filter by index, but take only the elements
new_a = [n for i,n in enumerate(a) if i not in inds]
new_b = [n for i,n in enumerate(b) if i not in inds]
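
The index-filtering idea above can be wrapped in a small function, which is what the question asks for. This is only a sketch: the name `drop_fraction` and the optional `seed` parameter are made up for illustration and do not come from the original answer.

```python
import random


def drop_fraction(a, b, frac, seed=None):
    """Drop the same randomly chosen positions from two equal-length lists.

    `drop_fraction` and `seed` are hypothetical names used for this sketch.
    """
    if len(a) != len(b):
        raise ValueError("both lists must have the same length")
    rng = random.Random(seed)  # private generator, so a seed gives repeatable results
    n_drop = int(frac * len(a))  # number of positions to remove (rounded down)
    drop = set(rng.sample(range(len(a)), n_drop))  # set gives O(1) membership tests
    new_a = [x for i, x in enumerate(a) if i not in drop]
    new_b = [y for i, y in enumerate(b) if i not in drop]
    return new_a, new_b


a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
new_a, new_b = drop_fraction(a, b, 0.2, seed=1)
print(new_a)
print(new_b)
```

Because the lists are filtered rather than resampled, the surviving items stay in their original order and remain paired by position.
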
The `zip`-based version:

import random

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

frac = 0.2  # how much of a/b do you want to exclude
ab = list(zip(a,b))  # a list of tuples where the first element is from `a` and the second is from `b`

new_ab = random.sample(ab, int(len(a)*(1-frac)))  # sample those tuples
new_a, new_b = zip(*new_ab)  # unzip the tuples to get `a` and `b` back
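Note that the `zip`-based version does not preserve the original order of `a` and `b`.

You can also work on the zipped sequence of `a` and `b`, take a random sample of indices (to keep the items in their original order), and then unzip again into `a_new` and `b_new`: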

import random


a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

frac = 0.2

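c = list(zip(a, b))  # list() is needed on Python 3, where zip returns an iterator
n_keep = len(c) - int(frac * len(c))  # number of pairs to keep
indices = random.sample(range(len(c)), n_keep)
a_new, b_new = zip(*(c[i] for i in sorted(indices)))  # sorted indices keep the original order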

print(a_new)
print(b_new)
This can print, for example:

(0, 2, 3, 5, 6, 7, 8, 9)
(0, 4, 9, 25, 36, 49, 64, 81)

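The simplest answer: remove the first or last x% of each list.
I want it to be a random sample, not just a block missing from the start or the end.
Removing the first or last block doesn't mean the indices are the same in both lists. Maybe something like using random.sample(pop, k) to get a percentage of one list, then using index() to work out the indices of the removed items and remove them from the other list.
It will. They are a data series and a time series.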
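
One caveat with the `index()` idea from the comment above: `list.index` returns the first position whose value matches, so it can point at the wrong item when the data series contains repeated values. A small illustration, using made-up sample data:

```python
data = [5, 7, 5, 9]   # made-up data series with a repeated value
times = [0, 1, 2, 3]  # matching time series

removed_position = 2  # suppose the item at position 2 was sampled for removal
removed_value = data[removed_position]

# index() finds the first 5, which sits at position 0, not position 2
found = data.index(removed_value)
print(found, times[found])
```

Working with positions directly, as the answers here do, avoids this mismatch.
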
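Another option is to decide independently for each position whether to keep it. This removes roughly `frac` of the items on average rather than an exact count:

import random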

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

frac = 0.2  # how much of a/b do you want to exclude

new_a, new_b = [], []

for i in range(len(a)):
    if random.random() > frac:  # keep this pair with probability 1 - frac
        new_a.append(a[i])
        new_b.append(b[i])
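
Because each position is kept or dropped independently, the loop above removes `frac` of the items only on average. The sketch below repeats it many times to show how the result length spreads; the helper name `bernoulli_drop` and the fixed seed are assumptions for illustration.

```python
import random
from collections import Counter


def bernoulli_drop(a, b, frac, rng):
    """Keep each (a[i], b[i]) pair with probability 1 - frac (hypothetical helper)."""
    new_a, new_b = [], []
    for x, y in zip(a, b):
        if rng.random() > frac:
            new_a.append(x)
            new_b.append(y)
    return new_a, new_b


a = list(range(10))
b = [x * x for x in a]
rng = random.Random(0)  # fixed seed only to make the run repeatable

# tally how many items survive in each of 1000 runs
lengths = Counter(len(bernoulli_drop(a, b, 0.2, rng)[0]) for _ in range(1000))
print(sorted(lengths.items()))
```

If the number of removed items has to be exact, a sampling or shuffling approach is the better fit.
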
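To remove an exact number of items instead, shuffle a keep/drop mask and filter both lists with it:

import random

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
n = 0.2  # fraction to remove

l = len(a)
n_drop = int(l * n)
n_keep = l - n_drop
ind = [1] * n_keep + [0] * n_drop  # 1 = keep, 0 = drop
random.shuffle(ind)
new_a = [ e for e, i in zip(a, ind) if i ]
new_b = [ e for e, i in zip(b, ind) if i ]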
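
The same keep/drop mask can also be applied with `itertools.compress` from the standard library, which yields the items whose mask entry is truthy. A minimal sketch, assuming the fraction to remove is called `frac`:

```python
import random
from itertools import compress

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
frac = 0.2  # fraction to remove

n_drop = int(len(a) * frac)
mask = [True] * (len(a) - n_drop) + [False] * n_drop  # True marks positions to keep
random.shuffle(mask)  # randomize which positions are dropped

new_a = list(compress(a, mask))
new_b = list(compress(b, mask))
print(new_a)
print(new_b)
```

Since one mask is shared by both calls, the two results stay aligned and in their original order.
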
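Another variant samples the indices to keep, without replacement, and sorts them so that the original order is preserved:

from random import sample

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

percentage = 0.3  # fraction to remove

# indices to keep: drawn without replacement, then sorted to preserve order
keep = sorted(sample(range(len(a)), len(a) - int(len(a) * percentage)))

c, d = [], []
for i in keep:
    c.append(a[i])
    d.append(b[i])

a, b = c, d

print(a)
print(b)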