Python merge returns ValueError: can not merge DataFrame with instance of type <class 'numpy.ndarray'>
I am trying to read in two files, a text file and a CSV file, parse them, and write out an Excel file with two worksheets. My merge of the two files returns a ValueError that I don't understand. Here is my code:
jobs = pd.read_csv(r'test01.txt', delim_whitespace=True, skiprows=[0, 2])
cmdb = pd.read_csv(r'cmdb.csv', encoding='ansi')
# new data frame with split value columns
new = jobs["Job_Name"].str.split("_", n=2, expand=True)
# making a separate App Code column from the new data frame
jobs["appCode"] = new[1]
# Get unique app codes and sort them
apps = new[1].unique()
apps.sort()
merged = pd.merge(apps,cmdb, left_on="appCode", right_on="application", how='inner')
This is the error I receive:
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 538, in __init__
'type {left}'.format(left=type(left)))
ValueError: can not merge DataFrame with instance of type <class 'numpy.ndarray'>
The input files I am working with look like this (the text file first, followed by the cmdb CSV file):
Job_Name Last Start Last End
________________________________________________________________ ____________________ ____________________
PRD_ABC_CAT_QUARTERLYJOBS ----- -----
PRD_UNC_DOG_QUARTERLYZZZZZZZZZZZREPORT ----- -----
PRD_ZYX_DOG_ABCDEF_WHATEVER_BATCH 09/15/2016 10:00:00 09/15/2016 10:00:00
PRD_ZYX_DOG_ABCDEF_WHATEVER_TRANSMIT_TW 09/13/2016 10:00:11 09/13/2016 10:00:12
PRD_ZYX_DOG_ABCDEF_WHATEVER_ARCHIVE 09/13/2016 10:00:13 09/13/2016 10:00:13
PRD_ZYX_DD_ABCDEF_AFFEE_BATCH 12/07/2016 20:00:02 12/07/2016 20:00:02
PRD_ZYX_DD_ABCDEF_AFFEE_BATCH2 ----- 12/07/2016 20:06:56
application,name,status
ABC,GOOFY,LIVE
DEF,PLUTO,UNKNOWN
RRR,DAISY,HOT
UNC,DONALD,DEAD
ZYX,ALVIN,SLEEPING
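The unique method returns:
an Index if the input is an Index,
a Categorical if the input has a Categorical dtype,
an ndarray if the input is a Series or an ndarray.
So apps (the result of new[1].unique()) is a NumPy ndarray, not a DataFrame, which is why pd.merge rejects it as its left argument.
Example usage: if jobs and cmdb are DataFrames like the samples in the code below, then appCode can be created on jobs, and jobs can either be merged with cmdb or used to filter it.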
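The return types listed above can be checked directly. This is only a minimal sketch: the codes Series is made-up sample data (not the asker's file), and it is created with dtype=object on the assumption that the real column holds plain Python strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample column; object dtype is assumed here.
codes = pd.Series(["ZYX", "ABC", "ZYX", "UNC"], dtype=object)

# unique() called on three different kinds of input
from_series = codes.unique()                          # Series input
from_index = pd.Index(codes).unique()                 # Index input
from_categorical = codes.astype("category").unique()  # categorical input

# Print the type of each result to see which container comes back
for label, result in [("Series", from_series),
                      ("Index", from_index),
                      ("Categorical", from_categorical)]:
    print(label, "->", type(result).__name__)
```

None of the three results is a DataFrame, so none of them can be passed to pd.merge as-is.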
Showing us (part of) your data would definitely help.
Great! Thanks so much.
import pandas as pd

# Sample jobs data; the integers are placeholders for the timestamps
jobs = pd.DataFrame({
    'Job_Name': ['PRD_ZYX_DOG_ABCDEF_WHATEVER_BATCH',
                 'PRD_ZYX_DOG_ABCDEF_WHATEVER_TRANSMIT_TW',
                 'PRD_ZYX_DOG_ABCDEF_WHATEVER_ARCHIVE',
                 'PRD_ZYX_DD_ABCDEF_AFFEE_BATCH'],
    'Last Start': [1, 2, 3, 4],
    'Last End': [10, 20, 30, 40]})
cmdb = pd.DataFrame({
    'application': ['ABC', 'DEF', 'RRR', 'UNC', 'ZYX'],
    'name': ['GOOFY', 'PLUTO', 'DAISY', 'DONALD', 'ALVIN'],
    'status': ['LIVE', 'UNKNOWN', 'HOT', 'DEAD', 'SLEEPING']})
# Create appCode from the second underscore-separated token of Job_Name
jobs["appCode"] = jobs["Job_Name"].str.split("_", n=2, expand=True)[1]
# If you want to merge jobs and cmdb, you can use
merged = pd.merge(jobs, cmdb, left_on="appCode", right_on="application", how='inner')
# Or, if you want to filter cmdb down to only the application values
# that appear in the appCode column of jobs, you can use
filtered = cmdb[cmdb['application'].isin(jobs['appCode'])]
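If the goal really is to merge only the sorted, unique app codes (as in the question's original line), one option is to keep them in a one-column DataFrame rather than calling unique(). The following is just a sketch under that assumption; the three job names and three cmdb rows are a shortened, illustrative subset of the data in the post.

```python
import pandas as pd

# Illustrative subset of the posted data
jobs = pd.DataFrame({"Job_Name": ["PRD_ZYX_DOG_ABCDEF_WHATEVER_BATCH",
                                  "PRD_ABC_CAT_QUARTERLYJOBS",
                                  "PRD_ZYX_DD_ABCDEF_AFFEE_BATCH"]})
cmdb = pd.DataFrame({"application": ["ABC", "DEF", "ZYX"],
                     "name": ["GOOFY", "PLUTO", "ALVIN"],
                     "status": ["LIVE", "UNKNOWN", "SLEEPING"]})

# Second underscore-separated token of Job_Name is the app code
jobs["appCode"] = jobs["Job_Name"].str.split("_", n=2, expand=True)[1]

# Unique, sorted app codes kept as a DataFrame (not an ndarray)
apps = (
    jobs[["appCode"]]
    .drop_duplicates()
    .sort_values("appCode")
    .reset_index(drop=True)
)

# Both arguments are DataFrames now, so pd.merge accepts them
merged = pd.merge(apps, cmdb, left_on="appCode", right_on="application", how="inner")
print(merged)
```

Wrapping the existing array with pd.DataFrame({"appCode": apps}) should work as well; drop_duplicates is used here only because it avoids leaving the DataFrame in the first place.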