Python: how to split the values in a cell for plotting

python pandas matplotlib

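I am having trouble plotting my current DataFrame. The cells of the DataFrame currently hold values like this:

 (test, 5)

"test" should go on the x-axis of the plot. The number 5 is the count of how many times "test" occurred, so the bar for "test" should have a height of 5.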

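My DataFrame looks like this (example):

The title of the first plot from the df above would be "company".

I want to draw each row as its own subplot, but I cannot figure out how to separate out the counts and plot them.

If I were plotting only the "company" row, I think I would want a DataFrame like this: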

           test     test2    test3    test4
company     5        20       500       2

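But the words are not the same in every row, so if I do this for all rows I end up with a lot of null values (which I assume would show up as empty bars in my plot). The DataFrame might look like this: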

             test  test2  test3  test4  notest notest2 notest3 notest4
company       5    20      500     2     NONE   NONE    NONE    NONE 
residental  NONE   NONE   NONE   NONE     89    220     50       32

import pandas as pd

data = [
    [('test',5), ('test2', 20), ('test3', 500), ('test4', 2), 'company'],
    [('notest',89), ('notest2', 220), ('notest', 50), ('notest4', 32), 'residental']]
names = ['one', 'two', 'three', 'four', 'type']
df = pd.DataFrame(data=data, columns=names)


Thanks.

See if this helps:

import pandas as pd

# Assumes df is the question's frame with 'type' set as the index
# (df = df.set_index('type')), so every cell is a (label, count) tuple.
dfs = []
# Iterate over all of the rows
for name, data in df.iterrows():
    # One column per tuple label, holding that tuple's count; skip missing cells
    row_df = pd.DataFrame({x[0]: [x[1]] for x in data if x is not None})
    row_df['type'] = name
    dfs.append(row_df)
# Concatenate all rows and set the type as the index
res_df = pd.concat(dfs).set_index('type')

Output:

            notest  notest2 notest4 test  test2 test3   test4
type                            
company      NaN      NaN     NaN     5    20     500     2
residental    50      220     32     NaN   NaN    NaN    NaN
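
The question also asks for one subplot per row, which the reshaping alone does not cover. A minimal sketch of that step follows, assuming the question's sample data and matplotlib's standard bar and subplots calls; the figure size and the layout of one subplot per row are illustrative choices, not part of the original answer.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data from the question, indexed by type.
data = [
    [('test', 5), ('test2', 20), ('test3', 500), ('test4', 2), 'company'],
    [('notest', 89), ('notest2', 220), ('notest', 50), ('notest4', 32), 'residental'],
]
df = pd.DataFrame(data, columns=['one', 'two', 'three', 'four', 'type']).set_index('type')

# Same reshaping idea as the answer: one column per label, one row per type.
res_df = pd.concat(
    [pd.DataFrame({x[0]: [x[1]] for x in row if x is not None}, index=[name])
     for name, row in df.iterrows()]
)

# One subplot per row; squeeze=False keeps axes two-dimensional even for one row.
fig, axes = plt.subplots(nrows=len(res_df), ncols=1,
                         figsize=(6, 3 * len(res_df)), squeeze=False)
for ax, (name, row) in zip(axes.ravel(), res_df.iterrows()):
    counts = row.dropna()  # drop labels that do not occur in this row
    ax.bar(counts.index, counts.values)
    ax.set_title(name)
fig.tight_layout()
```

Dropping the NaN entries per row avoids the empty bars the question worries about. Call plt.show() or fig.savefig(...) afterwards to view the result.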

I would format your data into arrays and use those.

Something like this:

import pandas as pd

data = [
    [('test', 5), ('test2', 20), ('test3', 500), ('test4', 2), 'company'],
    [('notest', 89), ('notest2', 220), ('notest', 50), ('notest4', 32), 'residental']]
names = ['one', 'two', 'three', 'four', 'type']
df = pd.DataFrame(data=data, columns=names)

# Index by type so each row holds only (label, count) tuples
df = df.set_index('type')
types = df.index.unique()
xnames = []   # one list of x-axis labels per type
yvalues = []  # one list of bar heights per type

for plot_type in types:
    # Split every tuple in the row into its label and its count
    xname = [values[0] for values in df.loc[plot_type].values]
    yvalue = [values[1] for values in df.loc[plot_type].values]

    xnames.append(xname)
    yvalues.append(yvalue)

Output:

xnames
[['test', 'test2', 'test3', 'test4'],
 ['notest', 'notest2', 'notest', 'notest4']]

yvalues
[[5, 20, 500, 2], [89, 220, 50, 32]]
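
These lists can be fed to matplotlib directly. The sketch below is one possible way to do it, not part of the original answer. It assumes the xnames and yvalues shown above and plots by position rather than by label, because the sample data repeats the label 'notest' and bars that share a categorical label would be drawn on top of each other.

```python
import matplotlib.pyplot as plt

# Values copied from the output above; types mirrors df.index.unique().
types = ['company', 'residental']
xnames = [['test', 'test2', 'test3', 'test4'],
          ['notest', 'notest2', 'notest', 'notest4']]
yvalues = [[5, 20, 500, 2], [89, 220, 50, 32]]

# One subplot per type; squeeze=False keeps axes two-dimensional.
fig, axes = plt.subplots(len(types), 1, figsize=(6, 3 * len(types)), squeeze=False)
for ax, title, labels, counts in zip(axes.ravel(), types, xnames, yvalues):
    positions = list(range(len(labels)))  # numeric x positions, one per bar
    ax.bar(positions, counts)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)            # show the words under the bars
    ax.set_title(title)
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view the result.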

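When I loaded your data, the tuples came in as strings, so I parsed them. I have edited the post and removed that step, so it should work for you now. It was a variable left over from the tuple parsing. I changed it to data in pd.DataFrame({x[0]: [x[1]] for x in data}).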
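
For readers whose cells really do arrive as strings, a parsing step along the following lines may help. This is only a sketch: the frame raw and the helper parse_cell are hypothetical, and it assumes each string is a valid Python literal such as "('test', 5)". A cell written as (test, 5) without quotes would not parse this way.

```python
import ast

import pandas as pd

# Hypothetical frame whose cells hold the text of a tuple instead of a tuple.
raw = pd.DataFrame(
    {'one': ["('test', 5)", "('notest', 89)"],
     'two': ["('test2', 20)", None]},
    index=['company', 'residental'],
)

def parse_cell(cell):
    """Turn "('test', 5)" into ('test', 5); return non-string cells unchanged."""
    return ast.literal_eval(cell) if isinstance(cell, str) else cell

# Apply the parser to every cell, column by column.
parsed = raw.apply(lambda col: col.map(parse_cell))
```

ast.literal_eval only evaluates literals, so it is a safer choice than eval for data read from a file.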

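You probably have None values in your source DataFrame (you did not include any in your test case). I added an "if x is not None" condition to the dict comprehension.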
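
A slightly broader filter is sketched below as an alternative, not as the answer's own code. It keeps a cell only if it is a tuple, which skips None and would also skip NaN. The small frame with a missing cell is a made-up example.

```python
import pandas as pd

# Hypothetical data with one missing cell in the 'residental' row.
df = pd.DataFrame(
    {'one': [('test', 5), ('notest', 89)],
     'two': [('test2', 20), None]},
    index=['company', 'residental'],
)

rows = {}
for name, cells in df.iterrows():
    # Keep only real (label, count) tuples; None and NaN are skipped.
    rows[name] = {cell[0]: cell[1] for cell in cells if isinstance(cell, tuple)}

# Outer keys become the index, labels become columns, gaps become NaN.
res_df = pd.DataFrame.from_dict(rows, orient='index')
```

Building a dict of dicts and calling DataFrame.from_dict once also avoids creating one small DataFrame per row.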

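That is only a warning. It tells you something worth reading, but it does not affect your result. Check the value of res_df. It works on your test case. If you want help with problems that show up only in your real data, you should upload more representative data.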