Python: how do I return multiple rows of data from a database?

python python-3.x pandas

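I'm running a SQL-like search over thousands of rows of an Excel file. What I have works, but I now need to return all matching values, and I'm having trouble figuring out how.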

| Col1 | Col2 | Col3 | Col4 | Col5 | Col6 |
|------|------|------|------|------|------|
|123467|123231|134521|Data00|Info00|Here00|
|976443|1224ff|14xec1|Data01|Info01|Here01|
|123467|12wf41|34qqa1|Data02|Info02|Here02|

I am using:

import pandas as pd

boolean = []
entry = "123467"  # kept as a string so the substring test below works
dictionary_df = pd.read_excel(my_xlsx)  # my_xlsx: path to the Excel file
for col in range(3):
    for row in dictionary_df[dictionary_df.columns[col]]:
        if entry in str(row):
            boolean.append(True)
        else:
            boolean.append(False)
    if True in boolean:
        is_long = pd.Series(boolean)
        # .values[0] keeps a single value only, which is the problem described below
        data1 = dictionary_df[is_long]["Col4"].values[0]
        data2 = dictionary_df[is_long]["Col5"].values[0]
        data3 = dictionary_df[is_long]["Col6"].values[0]

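So what I'm doing is searching all 3 columns and building a True/False list; if a match is found, it grabs the data I need from columns 4-6. This works, but it only returns a single value from columns 4, 5 and 6 rather than both matches. In this case, what I need is for data1 to be a list containing (Data00, Data02). I don't know how to include both (or more) results.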

For a vectorized solution, you can try:

entry = 123467
res = df.loc[
    df.iloc[:, :3].astype(str)
      .replace(f".*{entry}.*", str(entry), regex=True)
      .eq(str(entry)).any(axis=1),
    df.columns[3:]
]

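For the first 3 columns, I use a regex to replace everything that matches entry with just entry (so if a cell contains entry anywhere, the whole cell is replaced with just entry). Then I keep the rows in which at least one cell equals entry (you can do arr == x in a vectorized numpy way, but you cannot do x in arr), and finally I return only the last 3 columns of those rows.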
Output:

     Col4    Col5    Col6
0  Data00  Info00  Here00
2  Data02  Info02  Here02

To convert the output into the form you asked for:

data1, data2, data3 = res.T.values

Which outputs:

print(data1, data2, data3)
['Data00' 'Data02'] ['Info00' 'Info02'] ['Here00' 'Here02']
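
As a self-contained illustration of the filtering idea above, here is a minimal sketch on the question's sample table. It is not the answer's exact code: the table is built inline instead of being read from Excel, the mask is built with `str.contains` per column rather than the regex replace, and the name `mask` is only illustrative.

```python
import pandas as pd

# Sample table from the question, built inline (assumption: values read as strings).
df = pd.DataFrame({
    "Col1": ["123467", "976443", "123467"],
    "Col2": ["123231", "1224ff", "12wf41"],
    "Col3": ["134521", "14xec1", "34qqa1"],
    "Col4": ["Data00", "Data01", "Data02"],
    "Col5": ["Info00", "Info01", "Info02"],
    "Col6": ["Here00", "Here01", "Here02"],
})
entry = "123467"

# Substring test on each of the first three columns, then OR across each row.
mask = (
    df.iloc[:, :3]
    .astype(str)
    .apply(lambda s: s.str.contains(entry, regex=False))
    .any(axis=1)
)

# Keep the matching rows and only the remaining columns.
res = df.loc[mask, df.columns[3:]]

# One plain Python list per output column.
data1, data2, data3 = (res[c].tolist() for c in res.columns)
print(data1, data2, data3)
```

Every matching row is kept, so `data1` holds both `Data00` and `Data02` instead of a single value.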

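Another approach might be a query, but this is the first one I managed to get working. The result is a DataFrame; hopefully that meets your needs. I commented what I did: it basically collects all matching indices into a list and then keeps only the unique values.

I don't know how it behaves performance-wise, but this should work for you.

import pandas as pd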

data = {
    'Col1' : [123467, 976443, 123467,976443,976443],
    'Col2' : [123231, 122400, 120041, 3647677, 23485],
    'Col3' : [134521, 141001, 3456123, 123467, 12376],
    'Col4' : ['Data00', 'Data01','Data02', 'Dataxy', 'Datablah'],
    'Col5' : ['Info00', 'Info02','Info03', 'Infoxy', 'Infoxz'],
    'Col6' : ['Here00','Here02','Here03', 'Here04', 'Here05'] 
}

entry = 123467
dictionary_df = pd.DataFrame(data)
# Concatenate the index lists where Col1, Col2 or Col3 meets the condition.
indices = (
    dictionary_df[dictionary_df['Col1'] == entry].index.tolist()
    + dictionary_df[dictionary_df['Col2'] == entry].index.tolist()
    + dictionary_df[dictionary_df['Col3'] == entry].index.tolist()
)
# There might be duplicates, so take the set (indices are unique, so no information is lost).
# Recent pandas versions do not accept a set in .loc, so convert back to a sorted list.
result = sorted(set(indices))
# Now get the DataFrame of Col4-Col6 with only the rows at those indices.
result = dictionary_df[['Col4', 'Col5', 'Col6']].loc[result]
print(result)

Output:

     Col4    Col5    Col6
0  Data00  Info00  Here00
2  Data02  Info03  Here03
3  Dataxy  Infoxy  Here04

Add this to the code above:

result_aslist_example = result['Col4'].values.tolist()
print(result_aslist_example)

The output then also contains:

['Data00', 'Data02', 'Dataxy']
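
The index-collecting steps above can also be written as a single boolean mask. The following is a minimal sketch under the same exact-match assumption, reusing the sample data from this answer; `mask` and `col4_values` are illustrative names, not part of the original code.

```python
import pandas as pd

dictionary_df = pd.DataFrame({
    "Col1": [123467, 976443, 123467, 976443, 976443],
    "Col2": [123231, 122400, 120041, 3647677, 23485],
    "Col3": [134521, 141001, 3456123, 123467, 12376],
    "Col4": ["Data00", "Data01", "Data02", "Dataxy", "Datablah"],
    "Col5": ["Info00", "Info02", "Info03", "Infoxy", "Infoxz"],
    "Col6": ["Here00", "Here02", "Here03", "Here04", "Here05"],
})
entry = 123467

# True for rows where any of the three search columns equals entry exactly.
mask = dictionary_df[["Col1", "Col2", "Col3"]].eq(entry).any(axis=1)

# Select matching rows and the three data columns in one step.
result = dictionary_df.loc[mask, ["Col4", "Col5", "Col6"]]

col4_values = result["Col4"].tolist()
print(col4_values)
```

A row is selected at most once by the mask, so no de-duplication step is needed.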

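Comments:

I'm having a hard time understanding this... so I would replace most of my code with this, and res would be the result? I understand the first df.iloc line, but I don't understand what the replace or .eq lines are doing.

Yes. Essentially, you can't do a in df in a vectorized way to check whether each cell of df contains a. What I do instead is replace every cell that contains entry with just entry, so a cell with surrounding text (for example xyz + entry + abc, or 123 + entry) becomes just entry. Then you can do df == a and check .any(axis=1), which looks across the whole row to see whether at least one value is True. You should see a difference in performance, since it is a vectorized solution.

So this seems to be the way I want to go, but I need to get the data from col4, col5 and col6. I need the data as col4 = data00 or, if there are multiple values, col4 = (data00, data02, data55). I'm struggling to extract the information from the pd df.

This is on top of the entry filter on col1-3, right? Can you clarify this in more detail in your question?

Thank you very much. Your answer helped me a lot. I actually have many columns, and they are not in order. I'm searching for matches in 6 columns and then pulling data from 8 columns, including some of the columns I searched. But with your help and examples I was able to get what I needed! Thank you very much!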
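
For the case raised in the last comment, where the searched columns and the returned columns are not contiguous and may overlap, one option is to name both sets explicitly. This is a minimal sketch; the column names `id_a`, `id_b`, `note`, `value` and the sample values are hypothetical and stand in for the commenter's real columns.

```python
import pandas as pd

# Hypothetical data: column names and values are placeholders.
df = pd.DataFrame({
    "id_a": ["123467", "976443", "555000"],
    "note": ["x", "y", "z"],
    "id_b": ["000111", "ab123467cd", "999999"],
    "value": ["Data00", "Data01", "Data02"],
})
search_cols = ["id_a", "id_b"]   # columns to search, in any order
output_cols = ["id_b", "value"]  # columns to return; may overlap search_cols
entry = "123467"

# Row-wise OR of a substring test over the named search columns.
mask = (
    df[search_cols]
    .astype(str)
    .apply(lambda s: s.str.contains(entry, regex=False))
    .any(axis=1)
)

res = df.loc[mask, output_cols]

# One list of matches per requested column.
result = {col: res[col].tolist() for col in output_cols}
print(result)
```

Listing the columns by name avoids relying on positional slices such as `iloc[:, :3]` when the layout of the sheet changes.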