Python dictionary mapping returns NaN

python python-3.x pandas

I have two datasets. I want to build a dictionary from two columns of Table A, then create a new column in Table B that behaves like Excel's VLOOKUP.

Table A

Date       Wk of Year   ...Other columns
2020-1-1       1
2020-1-2       1
2020-1-10      2
2020-1-11      2

Table B

Shop   Date        Sales   ...Other columns
A      2020-1-1    100
B      2020-1-1    100
C      2020-1-1    100
A      2020-1-10   100

Expected Result
Shop   Date        Sales   Wk of Year
A      2020-1-1    100         1
B      2020-1-1    100         1
C      2020-1-1    100         1
A      2020-1-10   100         2

Code: I create the dictionary from Table A

name = pd.to_datetime(Table A['date'])
wk =   Table A['Wk of Year']
dict= dict(zip(name,wk))

Table B['wk'] = pd.to_datetime(Table B ['Date'].map(dict)

Actual result:

Shop   Date        Sales   Wk of Year
A      2020-1-1    100         NaT
B      2020-1-1    100         NaT
C      2020-1-1    100         NaT
A      2020-1-10   100         NaT

Try the pandas merge function and pass the on argument, which names the column you want to join the datasets on:
Table_merged = pd.merge(Table_B, Table_A[['Date', 'Wk_of_Year']], on='Date')

It will create the dataset you expect:
         Date  Sales Shop  Wk_of_Year
0  2020-01-01    100    A           1
1  2020-01-01    100    B           1
2  2020-01-01    100    C           1
3  2020-01-10    100    D           2
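
For reference, here is a minimal, self-contained sketch of the merge approach. The frames table_a and table_b are hypothetical stand-ins built from the question's sample rows (with zero-padded dates), and how="left" is an assumption added here so that every row of Table B is kept even when a date has no match, which is closer to VLOOKUP behaviour than the default inner join.

```python
import pandas as pd

# Hypothetical stand-ins for the question's Table A and Table B.
table_a = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-10", "2020-01-11"],
    "Wk_of_Year": [1, 1, 2, 2],
})
table_b = pd.DataFrame({
    "Shop": ["A", "B", "C", "A"],
    "Date": ["2020-01-01", "2020-01-01", "2020-01-01", "2020-01-10"],
    "Sales": [100, 100, 100, 100],
})

# Give both key columns the same dtype before joining.
table_a["Date"] = pd.to_datetime(table_a["Date"])
table_b["Date"] = pd.to_datetime(table_b["Date"])

# Select only the lookup columns (note the double brackets) and join on Date.
# how="left" keeps every row of table_b; unmatched dates would get NaN.
table_merged = table_b.merge(
    table_a[["Date", "Wk_of_Year"]], on="Date", how="left"
)
print(table_merged)
```

If Table A could contain the same date more than once, a merge would duplicate the matching rows of Table B, so dropping duplicates from the lookup columns first may be worth considering.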

However, if you still want to use your own approach, use the pandas insert function:
# get_values() was removed in pandas 1.0; to_numpy() is the replacement
date_wk_dct = {key: value for key, value in Table_A[['Date', 'Wk_of_Year']].to_numpy()}
Table_B.insert(3, "Wk_of_Year", [date_wk_dct[v] for v in Table_B['Date']], True)

This inserts the new column into the existing Table_B dataset, with the same result:
         Date  Sales Shop  Wk_of_Year
0  2020-01-01    100    A           1
1  2020-01-01    100    B           1
2  2020-01-01    100    C           1
3  2020-01-10    100    D           2
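
The dictionary approach from the question can also be written with Series.map. The sketch below is one possible version, not the asker's exact code: table_a, table_b and week_by_date are hypothetical names, and it assumes both Date columns can be parsed by pd.to_datetime so that the dictionary keys and the looked-up values have the same type.

```python
import pandas as pd

# Hypothetical stand-ins for the question's Table A and Table B.
table_a = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-10", "2020-01-11"],
    "Wk_of_Year": [1, 1, 2, 2],
})
table_b = pd.DataFrame({
    "Shop": ["A", "B", "C", "A"],
    "Date": ["2020-01-01", "2020-01-01", "2020-01-01", "2020-01-10"],
    "Sales": [100, 100, 100, 100],
})

# Normalise both date columns so keys and lookups compare equal.
table_a["Date"] = pd.to_datetime(table_a["Date"])
table_b["Date"] = pd.to_datetime(table_b["Date"])

# Use a name other than `dict` so the built-in is not shadowed.
week_by_date = dict(zip(table_a["Date"], table_a["Wk_of_Year"]))

# Look each date up in the dictionary; dates without a key become NaN.
# The result is a week number, so it is not passed through pd.to_datetime.
table_b["Wk_of_Year"] = table_b["Date"].map(week_by_date)
print(table_b)
```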

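First, don't use dict as a name in dict = dict(zip(name, wk)), because you are trampling on the built-in name. Second, you can't just call pd.to_datetime on a column of integers; that is why you get NaT.

Why are you using pd.to_datetime? Can you revise your example so that it is valid code? A lot of the syntax is incorrect, and I can guarantee that pd.to_datetime will not return NaT but will throw an error instead, unless you specify errors='coerce', which I don't see. It is hard to offer a solution when the example doesn't work.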
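
To make the error-handling point concrete, the following sketch shows how pd.to_datetime treats values it cannot parse. The sample series are made up for illustration. It also shows one way a column of NaT could appear without any error: if the lookup produced only missing values (NaN), converting them yields NaT. Whether that is what happened in the question is an assumption, since the original code was not runnable.

```python
import pandas as pd

# Made-up sample containing one value that is not a date.
bad = pd.Series(["2020-01-01", "not a date"])

# Default behaviour (errors="raise"): an unparseable value raises an error.
try:
    pd.to_datetime(bad)
    raised = False
except ValueError:
    raised = True

# errors="coerce" converts unparseable values to NaT instead of raising.
coerced = pd.to_datetime(bad, errors="coerce")

# Missing values (NaN) are converted to NaT without any error.
missing = pd.to_datetime(pd.Series([float("nan"), float("nan")]))

print(raised)
print(coerced)
print(missing)
```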