Python: stacking data with missing values in the header


I have data imported from a csv file. In reality there are more columns and cycles, but this is a representative snippet:

Export date 2020-10-10
              Record #3                           Record #2                           Record #1
              Cycle #5                            Cycle #4                            Cycle #3
time ( min.)  Parameter1  Something2  Whatever3   Parameter1  Something2  Whatever3   Parameter1  Something2  Whatever3
0             0.0390625   9.89619     0.853909    14.409      10.1961     0.859037    14.4676     10.0274     0.832598
1             0.0390625   9.53452     0.949844    14.4096     10.3034     1.224       14.4676     10.0323     1.20403
2             0.0390625   9.8956      1.47227     14.4097     10.6586     1.14486     14.4676     10.4936     1.12747
3             0.0390625   10.7829     1.44412     14.4097     10.9185     1.20247     14.5116     10.6892     1.12459

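The top of the data contains a row that is not needed in the table (Export date). I would like to stack the data so that there are Cycle and Record columns. The problem is that these values appear only above the first data column of each cycle. For example, Cycle 5 has three columns of data, then Cycle 4 has three columns of data, and so on.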

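The output should contain one block of rows per cycle, with Cycle and Record as ordinary columns.

I haven't gotten far: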

import pandas as pd

df = pd.read_csv('cycles.csv')
# Fill the names of cycles to the right
df.ffill(axis=1, inplace=True)

# Not sure this is needed, it might make it easier to melt/stack
df.iloc[0, 0] = "time ( min.)"
df.iloc[1, 0] = "time ( min.)"

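Thanks for your ideas and help.

There are a few problems you need to solve:

First, read all the information you need. You cannot do this unless you read each piece of information separately: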

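import pandas as pd
from io import StringIO

# Read the whole file once so the header rows can be parsed separately
with open('SO.csv') as f:
    string = f.read()
lines = string.split('\n')
# Line 1 holds the record numbers, line 2 the cycle numbers (one label per block of columns)
records = [i.split('#')[1].strip() for i in lines[1].split(',') if '#' in i]
cycles  = [i.split('#')[1].strip() for i in lines[2].split(',') if '#' in i]
# Line 3 holds the column names; rows with missing values are dropped
data = pd.read_csv(StringIO(string), sep=',', header=3).dropna(how='any')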

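Rename the columns so that they follow a pattern, build a loop to extract the rows for each record and cycle, and finally concatenate them all. This results in: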

   time ( min.)  Parameter1  Something2  Whatever3  Cycle  Record
0  0.0           0.039062    9.89619     0.853909   5      3
1  1.0           0.039062    9.53452     0.949844   5      3
2  2.0           0.039062    9.89560     1.472270   5      3
3  3.0           0.039062    10.78290    1.444120   5      3
0  0.0           14.409000   10.19610    0.859037   4      2
1  1.0           14.409600   10.30340    1.224000   4      2
2  2.0           14.409700   10.65860    1.144860   4      2
3  3.0           14.409700   10.91850    1.202470   4      2
0  0.0           14.467600   10.02740    0.832598   3      1
1  1.0           14.467600   10.03230    1.204030   3      1
2  2.0           14.467600   10.49360    1.127470   3      1
3  3.0           14.511600   10.68920    1.124590   3      1

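Breaking a problem into simple steps will help you not only here but in other situations too. Once you are clear about what you need to do, you can work through it one step at a time.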
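
The loop and concatenation steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the answer's original code: it assumes the export is comma-separated with the layout shown in the question, embeds that sample as a string (`raw`) instead of reading `SO.csv`, and picks each block by column position rather than by renamed labels. The names `raw`, `n_params`, `param_names`, `frames` and `stacked` are made up for the example.

```python
import pandas as pd
from io import StringIO

# Assumed comma-separated version of the snippet from the question.
raw = """Export date 2020-10-10,,,,,,,,,
,Record #3,,,Record #2,,,Record #1,,
,Cycle #5,,,Cycle #4,,,Cycle #3,,
time ( min.),Parameter1,Something2,Whatever3,Parameter1,Something2,Whatever3,Parameter1,Something2,Whatever3
0,0.0390625,9.89619,0.853909,14.409,10.1961,0.859037,14.4676,10.0274,0.832598
1,0.0390625,9.53452,0.949844,14.4096,10.3034,1.224,14.4676,10.0323,1.20403
2,0.0390625,9.8956,1.47227,14.4097,10.6586,1.14486,14.4676,10.4936,1.12747
3,0.0390625,10.7829,1.44412,14.4097,10.9185,1.20247,14.5116,10.6892,1.12459
"""

# Parse the two label rows separately, then read the table itself.
lines = raw.split('\n')
records = [c.split('#')[1].strip() for c in lines[1].split(',') if '#' in c]
cycles = [c.split('#')[1].strip() for c in lines[2].split(',') if '#' in c]
data = pd.read_csv(StringIO(raw), header=3).dropna(how='any')

time_col = data.columns[0]
# Number of data columns belonging to each cycle (3 in the sample).
n_params = (data.shape[1] - 1) // len(cycles)
# The first block keeps the plain names; later duplicates get renamed by pandas.
param_names = list(data.columns[1:1 + n_params])

frames = []
for rdx in range(len(cycles)):
    start = 1 + rdx * n_params
    # Time column plus the columns of this cycle, selected by position.
    block = data.iloc[:, [0, *range(start, start + n_params)]].copy()
    block.columns = [time_col, *param_names]
    # Assign the labels one column at a time.
    block['Cycle'] = cycles[rdx]
    block['Record'] = records[rdx]
    frames.append(block)

# Stack the per-cycle blocks on top of each other.
stacked = pd.concat(frames)
print(stacked)
```

Selecting by position avoids depending on how pandas renames the repeated column labels. `Cycle` and `Record` stay strings here; convert them with `astype(int)` if numbers are needed.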


What is your code so far?

I have added my code so far to the post (not much, I'm afraid…).

Thanks @User5! This is very helpful! When I run the suggested code, I get an error at the line df[['Cycle', 'Record']] = cycles[rdx], records[rdx]. The error says: KeyError: "None of [Index(['Cycle', 'Record'], dtype='object')] are in the [columns]".

Try doing it in two steps:

df['Cycle'] = cycles[rdx]

df['Record'] = records[rdx]

Thanks! That fixed it, your answer was really helpful!!
