Python: Replace values in a pandas column using another pandas df that holds the corresponding replacements
I have a df named inventory with a column containing part numbers (alphanumeric). Some of these part numbers have been superseded, and I have another df named replace_with with two columns, "oldPartnumbers" and "newPartnumbers".
For example:
inventory has values like:
* 123AAA
* 123BBB
* 123CCC
......
replace_with has values like:
**oldPartnumbers** ..... **newPartnumbers**
* 123AAA ............ 123ABC
* 123CCC ........... 123DEF
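So I need to replace the corresponding values in inventory with the new numbers. After the replacement, inventory would look like this: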
* 123ABC
* 123BBB
* 123DEF
Is there a simple way to do this in Python? Thanks.
Setup
Consider the dataframes inventory and replace_with
inventory = pd.DataFrame(dict(Partnumbers=['123AAA', '123BBB', '123CCC']))
replace_with = pd.DataFrame(dict(
    oldPartnumbers=['123AAA', '123BBB', '123CCC'],
    newPartnumbers=['123ABC', '123DEF', '123GHI']
))
Option 1
map
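A minimal sketch of the map approach, not necessarily the answerer's exact code. It assumes the old part numbers in replace_with are unique, and it adds a hypothetical fourth part (123DDD) that has no replacement, to show that unmatched values are kept:

```python
import pandas as pd

inventory = pd.DataFrame({'Partnumbers': ['123AAA', '123BBB', '123CCC', '123DDD']})
replace_with = pd.DataFrame({
    'oldPartnumbers': ['123AAA', '123BBB', '123CCC'],
    'newPartnumbers': ['123ABC', '123DEF', '123GHI'],
})

# Lookup Series: index = old part number, values = new part number.
# (Assumes each old part number appears only once.)
lookup = replace_with.set_index('oldPartnumbers')['newPartnumbers']

# map() yields NaN where a part number has no replacement,
# so fall back to the original value in those rows.
inventory['Partnumbers'] = (
    inventory['Partnumbers'].map(lookup).fillna(inventory['Partnumbers'])
)
print(inventory)
```

Without the fillna step, part numbers that were never superseded would end up as NaN.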
Option 2
replace
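A minimal sketch of the replace approach, again an illustration rather than the answerer's exact code. It builds a plain dict from the two columns of replace_with; the extra part 123DDD is a hypothetical value added to show that unmatched entries are left alone:

```python
import pandas as pd

inventory = pd.DataFrame({'Partnumbers': ['123AAA', '123BBB', '123CCC', '123DDD']})
replace_with = pd.DataFrame({
    'oldPartnumbers': ['123AAA', '123BBB', '123CCC'],
    'newPartnumbers': ['123ABC', '123DEF', '123GHI'],
})

# Build an {old: new} dict from the two columns.
mapping = dict(zip(replace_with['oldPartnumbers'], replace_with['newPartnumbers']))

# Series.replace() swaps values that match a dict key exactly
# and leaves every other value unchanged, so no fillna is needed.
inventory['Partnumbers'] = inventory['Partnumbers'].replace(mapping)
print(inventory)
```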
Suppose you have 2 dfs like the following:
import pandas as pd
df1 = pd.DataFrame([[1,3],[5,4],[6,7]], columns = ['PN','name'])
df2 = pd.DataFrame([[2,22],[3,33],[4,44],[5,55]], columns = ['oldname','newname'])
df1:
df2:
Run a left join between them:
temp = df1.merge(df2, how='left', left_on='name', right_on='oldname')
temp:
Then compute the new name column and replace it:
df1['name'] = temp.apply(lambda row: row['newname'] if pd.notnull(row['newname']) else row['name'], axis=1)
df1:
Or, as a one-liner:
df1['name'] = df1.merge(df2, how='left', left_on='name', right_on='oldname').apply(lambda row: row['newname'] if pd.notnull(row['newname']) else row['name'], axis=1)
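The row-wise apply can be slow on large frames. A possible vectorized variant of the same left-join idea is sketched below, swapping apply for fillna. It assumes df1 has the default integer index and that oldname has no duplicates, so the merged frame lines up row for row with df1:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 3], [5, 4], [6, 7]], columns=['PN', 'name'])
df2 = pd.DataFrame([[2, 22], [3, 33], [4, 44], [5, 55]], columns=['oldname', 'newname'])

# Left join keeps every row of df1; rows without a match get NaN in 'newname'.
temp = df1.merge(df2, how='left', left_on='name', right_on='oldname')

# Use 'newname' where it exists, otherwise keep the original 'name'.
# The NaN from the join turns 'newname' into floats, so cast back to int
# (only appropriate here because the example values are integers).
df1['name'] = temp['newname'].fillna(temp['name']).astype(int)
print(df1)
```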
This solution is relatively fast: it uses pandas data alignment and the numpy "copyto" function
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'partNumbers': ['123AAA', '123BBB', '123CCC', '123DDD']})
df2 = pd.DataFrame({'oldPartnumbers': ['123AAA', '123BBB', '123CCC'],
                    'newPartnumbers': ['123ABC', '123DEF', '123GHI']})
# assign index in each dataframe to original part number columns
# (faster than set_index method, but use set_index if original index must be preserved)
df1.index = df1.partNumbers
df2.index = df2.oldPartnumbers
# use pandas index data alignment
df1['updatedPartNumbers'] = df2.newPartnumbers
# use numpy to copy in old part num when a new part num is not found
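# (this writes in place through .values, so it assumes .values returns a writable view of the column)
np.copyto(df1.updatedPartNumbers.values,
          df1.partNumbers.values,
          where=pd.isnull(df1.updatedPartNumbers))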
# reset index
df1.reset_index(drop=True, inplace=True)
df1:
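If writing into the column's underlying array is a concern, the same alignment idea can be sketched without an in-place copy. This is an alternative illustration, not the answer's method: it uses reindex plus np.where, leaves the index of df1 untouched, and assumes the old part numbers in df2 are unique:

```python
import pandas as pd
import numpy as np

df1 = pd.DataFrame({'partNumbers': ['123AAA', '123BBB', '123CCC', '123DDD']})
df2 = pd.DataFrame({'oldPartnumbers': ['123AAA', '123BBB', '123CCC'],
                    'newPartnumbers': ['123ABC', '123DEF', '123GHI']})

# New part numbers indexed by the old part number.
new_by_old = df2.set_index('oldPartnumbers')['newPartnumbers']

# Align the lookup to df1's part numbers; missing entries become NaN.
aligned = new_by_old.reindex(df1['partNumbers']).to_numpy()

# Pick the old part number wherever no new one was found.
df1['updatedPartNumbers'] = np.where(pd.isna(aligned),
                                     df1['partNumbers'].to_numpy(),
                                     aligned)
print(df1)
```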
Is df['part_numbers'] = df['new_part_numbers'] enough?