Python: How do I join DataFrames based on wildcards?

I have two DataFrames, df and df2, and I want to merge them, treating * as a wildcard.

import pandas as pd

# df holds the wildcard patterns; df2 holds the concrete values to match.
data = [["*","*",1],["AB*","B*",3],["B*","*",2]]
data2 = [["A","B","1"],["ABC","BC",4],["B","A",2]]
columns = ["Type1","Type2","Value"]
df = pd.DataFrame(data,columns=columns)
df2 = pd.DataFrame(data2,columns=columns)
print(df)
print(df2)
  Type1 Type2  Value
0     *     *      1
1   AB*    B*      3
2    B*     *      2
  Type1 Type2 Value
0     A     B     1
1   ABC    BC     4
2     B     A     2
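Row 1 of df2 (ABC, BC) should match rows 0 and 1 of df, whereas row 0 of df2 (A, B) should match only row 0 of df, the all-wildcard row. I would like to be able to write something like this: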

df2.merge(df,how='left',on=["Type1","Type2"])
But here nothing matches, because merge compares the key columns for exact equality.
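To make the failure visible, here is a small sketch of the same merge with `indicator=True`, which adds a `_merge` column showing where each row came from. The data is the sample above, with the Value column written as plain integers for simplicity.

```python
import pandas as pd

columns = ["Type1", "Type2", "Value"]
df = pd.DataFrame([["*", "*", 1], ["AB*", "B*", 3], ["B*", "*", 2]], columns=columns)
df2 = pd.DataFrame([["A", "B", 1], ["ABC", "BC", 4], ["B", "A", 2]], columns=columns)

# A plain merge compares keys literally, so "ABC" never equals "AB*" or "*".
merged = df2.merge(df, how="left", on=["Type1", "Type2"], indicator=True)
print(merged)
```

Every row comes back marked `left_only` with a missing `Value_y`, so the wildcard logic has to be supplied some other way.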

This is the result I would like to get:

data3 = [["A","B","1","1"],["ABC","BC",4,1],["ABC","BC",4,3],["B","A",2,1],["B","A",2,2]]
columns3 = ["Type1","Type2","Value_x","Value_y"]
results = pd.DataFrame(data3,columns=columns3)
print(results)
  Type1 Type2 Value_x Value_y
0     A     B       1       1
1   ABC    BC       4       1
2   ABC    BC       4       3
3     B     A       2       1
4     B     A       2       2

Note that df2 actually has more than 1 million rows, so for efficiency reasons I cannot loop over it.
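One possible way to avoid looping over df2, sketched under some assumptions: if the pattern table df is small, it is enough to loop over its rows and test each pattern against all of df2 at once with vectorized string matching. The helper names `wildcard_to_regex` and `wildcard_join` are hypothetical, `*` is assumed to mean "any run of characters, including none", and df2 rows that match no pattern are dropped (inner-join behaviour rather than a left join).

```python
import re
import pandas as pd

def wildcard_to_regex(pattern):
    # Escape regex metacharacters, then turn the escaped * back into ".*".
    return re.escape(pattern).replace(r"\*", ".*")

def wildcard_join(values, patterns, keys=("Type1", "Type2")):
    # Loop over the (assumed small) pattern table only; each test is vectorized.
    pieces = []
    for pat in patterns.itertuples(index=False):
        mask = pd.Series(True, index=values.index)
        for key in keys:
            regex = wildcard_to_regex(getattr(pat, key))
            mask &= values[key].str.fullmatch(regex, na=False)
        matched = values[mask].copy()
        matched["Value_y"] = pat.Value
        pieces.append(matched)
    out = pd.concat(pieces).rename(columns={"Value": "Value_x"})
    # A stable sort restores df2's row order and keeps pattern order within each row.
    return out.sort_index(kind="stable").reset_index(drop=True)

columns = ["Type1", "Type2", "Value"]
df = pd.DataFrame([["*", "*", 1], ["AB*", "B*", 3], ["B*", "*", 2]], columns=columns)
df2 = pd.DataFrame([["A", "B", 1], ["ABC", "BC", 4], ["B", "A", 2]], columns=columns)

result = wildcard_join(df2, df)
print(result)
```

The cost grows with the number of patterns times the size of df2, so this only makes sense while the pattern table stays small.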

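In the end I decided to use the code below. It loads both DataFrames into an SQLite database, performs the join there, and reads the result back into another DataFrame. It is not optimal, but it works.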

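import sqlite3

# Copy both DataFrames into an in-memory SQLite database.
conn = sqlite3.connect(':memory:')
df.to_sql('df', conn, index=False)
df2.to_sql('df2', conn, index=False)
# GLOB treats * as a wildcard (LIKE expects % instead); the pattern is the right-hand operand.
query = """
SELECT [df2].[Type1],
       [df2].[Type2],
       [df2].[Value] AS Value_x,
       [df].[Value]  AS Value_y
FROM   [df2]
       LEFT OUTER JOIN [df]
                    ON [df2].[Type1] GLOB [df].[Type1]
                   AND [df2].[Type2] GLOB [df].[Type2]
"""
# Read the joined rows back into a DataFrame.
df3 = pd.read_sql_query(query, conn)
conn.close()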
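When writing a join condition like this, it helps to remember that SQLite has two pattern operators with different wildcards, and that in both the pattern is the right-hand operand. A quick check with literal strings (not the asker's tables) illustrates this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
expressions = [
    "'ABC' GLOB 'AB*'",  # value on the left, * pattern on the right
    "'AB*' GLOB 'ABC'",  # operands swapped: 'ABC' is now the pattern
    "'ABC' LIKE 'AB*'",  # LIKE does not treat * as a wildcard
    "'ABC' LIKE 'AB%'",  # LIKE uses % for "any run of characters"
]
# Evaluate each expression; SQLite returns 1 for a match and 0 otherwise.
results = {expr: conn.execute(f"SELECT {expr}").fetchone()[0] for expr in expressions}
conn.close()

for expr, value in results.items():
    print(expr, "->", value)
```

So with `*` patterns stored in df, the concrete df2 value belongs on the left of GLOB and the df pattern on the right. Note that GLOB is case-sensitive, while LIKE is case-insensitive for ASCII letters by default.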

What is your expected result?
OK, I just did :)