AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas (Python)

python pandas dataframe

I am working with a JSON file and run this code on it to get the following DataFrame:

import pandas as pd

topics = df.set_index('username').popular_board_data.str.extractall(r'name":"([^,]*)')
total = df.set_index('username').popular_board_data.str.extractall(r'totalCount\":([^,}]*)')

data = []
for username in df.username.unique():
    for topic in zip(topics[0][username], total[0][username]):
        data.append([username, topic])

df_topic = pd.DataFrame(data, columns='username,topic'.split(','))

    username        topic
0     lukl    (Hardware", 80)
1     lukl    (Marketplace", 31)
2     lukl    (Atari 5200", 27)
3     lukl    (Atari 8-Bit Computers", 9)
4     lukl    (Modern Gaming", 3)

Now I need to split the information in the `topic` column into two separate columns.

This is the expected result:

    username        topic          _topic       _total
0     lukl    (Hardware", 80)      Hardware     80
1     lukl    (Marketplace", 31)   Marketplace  31
2     lukl    (Atari 5200", 27)    Atari 5200   27
3     lukl    (Atari 8", 9)        Atari 8      9
4     lukl    (Modern", 3)         Modern       3

I tried to do this with the following code:

df_top = df_topic.copy()
df_top['_topic'] = df_topic['topic'].str.split('(').str[1].str.split('",').str[0]
df_top['_total'] = df_topic['topic'].str.split('",').str[1].str.split(')').str[0]
df_top

But I get an error:

AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

I think the `topic` column holds tuples rather than strings, so you can simply use the `DataFrame` constructor:

df_topic[['_topic', '_total']]=pd.DataFrame(df_topic['topic'].values.tolist(), 
                                index=df_topic.index)
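To make the tuple explanation concrete, here is a minimal sketch on a small stand-in for `df_topic`. The sample rows and the names `cell_types` and `parts` are assumptions for illustration; the cleanup of the trailing quote and the integer conversion are optional extras, not part of the answer above.

```python
import pandas as pd

# Hypothetical stand-in for df_topic: each `topic` cell is a tuple of two strings,
# which pandas displays without quotes, e.g. (Hardware", 80)
df_topic = pd.DataFrame({
    "username": ["lukl", "lukl", "lukl"],
    "topic": [('Hardware"', "80"), ('Marketplace"', "31"), ('Atari 5200"', "27")],
})

# Check what the cells actually hold before reaching for string methods
cell_types = df_topic["topic"].map(type).unique().tolist()
print(cell_types)

# Unpack the tuples into two columns, keeping the original index
parts = pd.DataFrame(df_topic["topic"].tolist(), index=df_topic.index)

# Drop the stray trailing quote and make the count numeric
df_topic["_topic"] = parts[0].str.rstrip('"')
df_topic["_total"] = parts[1].astype(int)
print(df_topic)
```

Inspecting the element types first is a cheap way to tell whether string methods are appropriate for a column at all.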

A better solution would be to work from the data in your previous question:
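One way this could look, as a sketch rather than the answerer's actual code: parse the JSON once and build the final columns directly, so no tuples are ever stored. The layout of `popular_board_data` (a JSON list of objects with `name` and `totalCount` keys) is an assumption inferred from the regular expressions in the question, and the sample row is made up.

```python
import json

import pandas as pd

# Hypothetical input: one row per user, with the board data as a JSON string.
# The real structure of popular_board_data may differ.
df = pd.DataFrame({
    "username": ["lukl"],
    "popular_board_data": [
        '[{"name": "Hardware", "totalCount": 80},'
        ' {"name": "Atari 5200", "totalCount": 27}]'
    ],
})

# One output record per (user, board) pair, with name and count already separated
records = [
    {"username": user, "_topic": board["name"], "_total": board["totalCount"]}
    for user, raw in zip(df["username"], df["popular_board_data"])
    for board in json.loads(raw)
]

df_topic = pd.DataFrame(records)
print(df_topic)
```

Parsing with `json.loads` avoids the trailing quote that the `[^,]*` pattern captures, and the counts arrive as integers instead of strings.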

I treat `topic` as a string, converting it to a string first if it is not one:

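df = pd.DataFrame(data={"username": ['luk1', 'luk1', 'luk1'],
                        'topic': ['(Hardware, 80)', '(Marketplace, 31)', '(Atari 5200, 27)']})
# str(x) guards against non-string values; [1:] drops the leading "("
df['_topic'] = df['topic'].apply(lambda x: str(x).split(",")[0][1:])
# [:-1] drops the trailing ")"; strip() removes the space left after the comma
df['_total'] = df['topic'].apply(lambda x: str(x).split(",")[1][:-1].strip())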
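The same split can also be written with vectorized string methods instead of `apply`. The following is a sketch under the assumption that every value looks like `(name, number)` once converted to a string; `parts` is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "username": ["luk1", "luk1", "luk1"],
    "topic": ["(Hardware, 80)", "(Marketplace, 31)", "(Atari 5200, 27)"],
})

# Strip the surrounding parentheses, then split once on the LAST comma,
# so a comma inside the topic name would not break the split
parts = (
    df["topic"]
    .astype(str)
    .str.strip("()")
    .str.rsplit(",", n=1, expand=True)
)

df["_topic"] = parts[0].str.strip()
df["_total"] = parts[1].str.strip().astype(int)
print(df)
```

Splitting from the right with `n=1` relies on the count always being the last field.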

You can use the following regular expressions:

df['_topic'] = df['topic'].str.extract(r'([a-zA-Z]+)')
df['_total'] = df['topic'].str.extract(r'(\d+)')

  username                        topic       _topic _total
0     lukl              (Hardware", 80)     Hardware     80
1     lukl           (Marketplace", 31)  Marketplace     31
2     lukl            (Atari 5200", 27)        Atari   5200
3     lukl  (Atari 8-Bit Computers", 9)        Atari      8
4     lukl          (Modern Gaming", 3)       Modern      3
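In the output above, `[a-zA-Z]+` stops at the first space and `\d+` picks up the first run of digits, so "Atari 5200" is split incorrectly. A hedged alternative is a single pattern with two named groups that anchors on the parentheses and the final number. This sketch assumes the column holds strings shaped like the displayed values; if the cells are tuples, `.str.extract` will not apply and the values need unpacking or converting first.

```python
import pandas as pd

# Hypothetical string versions of the displayed rows
df = pd.DataFrame({
    "username": ["lukl", "lukl", "lukl"],
    "topic": ['(Hardware", 80)', '(Atari 5200", 27)', '(Atari 8-Bit Computers", 9)'],
})

# _topic: everything after "(" up to an optional quote and the comma
# _total: the digits directly before the closing ")"
pattern = r'^\((?P<_topic>.*?)"?,\s*(?P<_total>\d+)\)$'

extracted = df["topic"].str.extract(pattern)
df["_topic"] = extracted["_topic"]
df["_total"] = extracted["_total"].astype(int)
print(df)
```

Named groups become the column names of the frame returned by `str.extract`, which keeps the assignment readable.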

Could you add the output of `print(df.head())` to the question? It seems there should be a better solution here.

With the sample data, the `apply`-based code above gives:

  username              topic       _topic _total
0     luk1     (Hardware, 80)     Hardware     80
1     luk1  (Marketplace, 31)  Marketplace     31
2     luk1   (Atari 5200, 27)   Atari 5200     27
