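Python: finding a specific row of data in a DataFrame inside a while loop

I am trying to take a CSV and read it as a DataFrame. This DataFrame contains 4 rows of numbers. I want to pick one specific row of data from the DataFrame. In a while loop, I want to select a random row from the DataFrame and compare it with the row I picked. I want the loop to keep running until the random row is 100% equal to the row I picked earlier. Then I want the while loop to break, and I want it to report how many tries it took to match the random numbers.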

python pandas dataframe

Here is what I have so far.

This is a sample of the DataFrame:

    A  B  C  D
1   2  7  12 14
2   4  5  11 23
3   4  6  14 20
4   4  7  13 50
5   9  6  14 35

Here is an example of my attempt:

import time
import pandas as pd

then = time.time()

count = 0

df = pd.read_csv('Get_Numbers.csv')
df.columns = ['A', 'B', 'C', 'D']

while True:
    df_elements = df.sample(n=1)
    random_row = df_elements
    print(random_row)
    find_this_row = df['A','B','C','D' == '4','7','13,'50']
    print(find_this_row)
    if find_this_row != random_row:
        count += 1
    else:
        break

print("You found the correct numbers! And it only took " + str(count) + " tries to get there! Your numbers were: " + str(find_this_row))

now = time.time()

print("It took: ", now-then, " seconds")

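The code above gives an obvious error, but I have now tried so many different versions of finding the find_this_row numbers that I no longer know what to do, so I left this attempt in.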

What I would like to avoid is using a specific index for the row I am trying to find; I would rather find it using only its values.

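I am using df_elements = df.sample(n=1) to select a row at random. This was to avoid using random.choice, because I was not sure whether that would work, or which way is more time/memory efficient, but I am open to suggestions on that as well.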

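In my mind it seems simple: select a random row of data and, if it does not match the row I want, keep selecting random rows until it does. But I cannot seem to make it work.

Any help is greatly appreciated!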

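How about using values?

values returns the values of the row (as a NumPy array), so two rows can be compared easily. list1 == list2 compares the elements at corresponding positions and, for NumPy arrays, returns an array of True and False values. You can then check whether all of the returned values are True.

For a one-row DataFrame, values may return an np.ndarray with a 2D shape such as shape=(1, 2); use values[0] to get just the 1D array. Then compare the arrays and reduce the element-wise result with any() or all(), depending on whether one match or every match is required.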
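As a small illustration of the point above, the sketch below builds a hypothetical DataFrame from the sample data in the question (instead of reading Get_Numbers.csv) and shows what values returns and how the comparison reduces to a single boolean. The names target and candidate are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question (same sample values).
df = pd.DataFrame(
    {"A": [2, 4, 4, 4, 9], "B": [7, 5, 6, 7, 6],
     "C": [12, 11, 14, 13, 14], "D": [14, 23, 20, 50, 35]},
    index=[1, 2, 3, 4, 5],
)

target = df.loc[[4]].values      # one-row frame -> 2D array of shape (1, 4)
candidate = df.loc[[2]].values   # another one-row frame, also shape (1, 4)

print(target.shape)              # (1, 4)
print(target[0])                 # 1D array holding 4, 7, 13, 50

# Element-wise comparison: True only where the two rows agree.
elementwise = candidate[0] == target[0]
print(elementwise)               # True only in the first position (column A)

print(elementwise.all())                 # False: not every column matches
print((target[0] == target[0]).all())    # True: every column matches
```

Using all() here means a row counts as a match only when every column agrees, which is what the question asks for.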

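Here is a method that tests one row at a time. We check whether the values of the chosen row are equal to the values of the sampled DataFrame, and we require all of them to match.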
row = df.sample(1)

counter = 0
not_a_match = True

while not_a_match:
    not_a_match = ~(df.sample(n=1).values == row.values).all()
    counter+=1

print(f'It took {counter} tries and the numbers were\n{row}')
#It took 9 tries and the numbers were
#   A  B   C   D
#4  4  7  13  50


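If you want it to be a bit faster, you can pick a row and then sample the DataFrame many times with replacement. You can then check for the first time a sampled row equals the chosen row, which gives the number of "tries" the while loop would have taken, but in much less time. The loop guards against the case where no match is found in a batch, which can happen because the sampling is done with replacement.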
row = df.sample(1)

n = 0
none_match = True
k = 10  # Increase to check more matches at once.

while none_match:
    matches = (df.sample(n=len(df)*k, replace=True).values == row.values).all(1)
    none_match = ~matches.any()  # Determine if none still match
    n += k*len(df)*none_match  # Only increment if none match
n = n + matches.argmax() + 1

print(f'It took {n} tries and the numbers were\n{row}')
#It took 3 tries and the numbers were
#   A  B   C   D
#4  4  7  13  50
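On the question of whether df.sample or something like random.choice is cheaper, one possible variation (not benchmarked here) is to work out once which rows equal the target, then draw random row positions with NumPy and look them up in that mask. The sketch below assumes a hypothetical DataFrame built from the sample data and a hypothetical helper named tries_until_match; the fixed seed is only there to make runs repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV in the question (same sample values).
df = pd.DataFrame(
    {"A": [2, 4, 4, 4, 9], "B": [7, 5, 6, 7, 6],
     "C": [12, 11, 14, 13, 14], "D": [14, 23, 20, 50, 35]}
)


def tries_until_match(frame, target_values, rng, batch=1000):
    """Count random draws (with replacement) until a row equals target_values."""
    # Boolean mask: True where every column of a row equals the target.
    is_target = (frame.values == np.asarray(target_values)).all(axis=1)
    if not is_target.any():
        raise ValueError("target row does not occur in the frame")
    tries = 0
    while True:
        # Draw a batch of random row positions instead of sampling whole rows.
        positions = rng.integers(0, len(frame), size=batch)
        hits = is_target[positions]
        if hits.any():
            # Index of the first hit in this batch, converted to a 1-based try count.
            return tries + int(hits.argmax()) + 1
        tries += batch


rng = np.random.default_rng(0)
n_tries = tries_until_match(df, [4, 7, 13, 50], rng)
print(f"It took {n_tries} tries")
```

The result has the same meaning as the counter in the loops above (the number of the draw on which the target first appears), while the row comparison is done only once.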

First, some hints. This line does not work for me:
find_this_row = df['A','B','C','D' == '4','7','13,'50']

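For two reasons:

A closing ' is missing after '13.
df is a DataFrame(), so it does not support a key like the one below:

df['A','B','C','D'
Indexing with keys returns a DataFrame() (for a list of column names) or a Series() (for a single column name).
Since you need a whole row with several columns, do this: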
df2.iloc[4].values

array(['4', '7', '13', '50'], dtype=object)
Do the same for the sampled row:
df2.sample(n=1).values

The two rows then need to be compared element by element, across all columns:
df2.sample(n=1).values == df2.iloc[4].values

array([[True, False, False, False]])
Add .all(), like this:
(df2.sample(n=1).values == df2.iloc[4].values).all()

This returns True or False.
Putting it all together:
import time
import pandas as pd

then = time.time()
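count = 0
find_this_row = df2.iloc[4].values  # target row, looked up once before the loop
while True:
    random_row = df2.sample(n=1).values  # one random row, shape (1, n_columns)
    if not (random_row == find_this_row).all():
        count += 1  # no match: count the failed try
    else:
        break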

print("You found the correct numbers! And it only took " + str(count) + " tries to get there! Your numbers were: " + str(find_this_row))

now = time.time()

print("It took: ", now-then, " seconds")
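The question asks to locate the target row by its values rather than by a fixed position such as iloc[4]. A minimal sketch of one way to do that, assuming the data are numeric, the wanted row occurs exactly once, and using a hypothetical DataFrame built from the sample in the question: compare every row with the wanted values and keep the rows where all columns match.

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question (same sample values).
df = pd.DataFrame(
    {"A": [2, 4, 4, 4, 9], "B": [7, 5, 6, 7, 6],
     "C": [12, 11, 14, 13, 14], "D": [14, 23, 20, 50, 35]},
    index=[1, 2, 3, 4, 5],
)

wanted = [4, 7, 13, 50]  # values to search for (assumed numeric, not strings)

# Compare each row with `wanted` column by column, then require all columns to match.
row_matches = (df[["A", "B", "C", "D"]] == wanted).all(axis=1)
find_this_row = df[row_matches]
print(find_this_row)

# Random search against the value-based target instead of a hard-coded position.
count = 0
while not (df.sample(n=1).values == find_this_row.values).all():
    count += 1  # count each failed try
print(f"Matched after {count} failed tries")
```

If the CSV was read in as strings, as the dtype=object output above suggests, the entries of wanted would need to be strings as well.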

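Do you want to sample from the DataFrame with or without replacement? (That is, can the number of tries exceed the number of rows in the DataFrame?)

Yes, I want the number of tries to be able to exceed the number of rows in my DataFrame. I am reading through your answer below now. It definitely uses some things I am not familiar with (in my mind I am still very much a beginner), but it seems to get the result I want!

I am sorry, I was trying to solve a problem that did not exist. I have added a solution for your actual problem, and then how you might think about doing it with fewer loops if time is an issue.

Thanks! That got me up and running! Thank you very much!

Thank you for the answer. It was a toss-up between your answer and ALollz's answer. They both work for what I was trying to do!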