Python pandas: separating two columns with a missing delimiter


I have data like the following:

00052600150.00942615
00052601000.01014910
00052601050.02709672
00052601100.11454732
00052601150.23151254
00052601200.36262522
00052601250.66432348
00052601301.07723763
00052601351.26019487
00052601401.20568581
The first 10 digits are the timestamp in the form YYMMDDhhmm, followed directly by a number.

It should read as 0005260010,0.00799872, where the first block is the timestamp and the second block is the measured value.

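I tried reading the data with pandas and converting it to str, but then I lose the leading zeros. Is there a way to separate the float from the digits in front of it?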


Regards

A regular expression with pandas can split your column without using delimiters.

import pandas as pd

# Sample data
df = pd.DataFrame({'A': [
    '00052600150.00942615',
    '00052601000.01014910',
    '00052601050.02709672',
    '00052601100.11454732',
    '00052601150.23151254',
    '00052601200.36262522',
    '00052601250.66432348',
    '00052601301.07723763',
    '00052601351.26019487',
    '00052601401.20568581',
]})
# Five two-digit groups (YY MM DD hh mm) followed by the reading
df3 = df['A'].str.extract(
    r'(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})(\d\.\d*)',
    expand=True)
df3.columns = ['Year', 'Month', 'Day', 'Hour', 'Minute', 'Reading']
print(df3)
Output:

  Year Month Day Hour Minute     Reading
0   00    05  26   00     15  0.00942615
1   00    05  26   01     00  0.01014910
2   00    05  26   01     05  0.02709672
3   00    05  26   01     10  0.11454732
4   00    05  26   01     15  0.23151254
5   00    05  26   01     20  0.36262522
6   00    05  26   01     25  0.66432348
7   00    05  26   01     30  1.07723763
8   00    05  26   01     35  1.26019487
9   00    05  26   01     40  1.20568581
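
The extracted pieces above are still strings. As a minimal sketch (not part of the original answer), the same regex idea can use two named groups and then convert the result to real datetime and float columns. The names `s`, `parts`, `timestamp` and `reading` are made up for illustration, and the code assumes that the two-digit year 00 means 2000.

```python
import pandas as pd

# Two sample strings copied from the question
s = pd.Series([
    '00052600150.00942615',
    '00052601301.07723763',
])

# Named groups become the column names of the extracted DataFrame:
# exactly 10 digits for the timestamp, the rest is the reading
parts = s.str.extract(r'^(?P<timestamp>\d{10})(?P<reading>\d+\.\d+)$')

# Assumption: YYMMDDhhmm with a two-digit year, so 00 is read as 2000
parts['timestamp'] = pd.to_datetime(parts['timestamp'], format='%y%m%d%H%M')
parts['reading'] = parts['reading'].astype(float)

print(parts)
```

Anchoring the pattern with `^` and `$` makes a malformed line produce NaN in `extract` instead of a silently shifted match, which may be easier to spot before the conversion step.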

You can read the column as str and split your values by position.

import pandas as pd

df = pd.read_csv('yourfile.csv', header=None, dtype='str', names=['col1'])
# The first 10 characters are YYMMDDhhmm, so parse them with an explicit format
df['time'] = pd.to_datetime(df.col1.str[:10], format='%y%m%d%H%M')
df['value'] = df.col1.str[10:].astype('float')
print(df)
Output:

                   col1                time     value
0  00052600150.00942615 2000-05-26 00:15:00  0.009426
1  00052601000.01014910 2000-05-26 01:00:00  0.010149
2  00052601050.02709672 2000-05-26 01:05:00  0.027097
3  00052601100.11454732 2000-05-26 01:10:00  0.114547
4  00052601150.23151254 2000-05-26 01:15:00  0.231513
5  00052601200.36262522 2000-05-26 01:20:00  0.362625
6  00052601250.66432348 2000-05-26 01:25:00  0.664323
7  00052601301.07723763 2000-05-26 01:30:00  1.077238
8  00052601351.26019487 2000-05-26 01:35:00  1.260195
9  00052601401.20568581 2000-05-26 01:40:00  1.205686

timestep: do you mean timestamp? If so, these are fixed length, so note…
Hi, no, it varies.
@Mainzelmann - please include the known variations in the sample data to avoid confusion.
The length of the time step (5 min, 10 min, etc.) varies, but it is always 10 digits. I'll add that to the question, sorry.
Oh dude, thanks. I was not reading the data with dtype=str; I reformatted it after reading and lost the number format. This way the split works without problems!
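
The leading zeros mentioned in the question disappear because pandas infers a float column by default. The following is a minimal sketch of that difference. It uses an in-memory string (`raw`, a hypothetical stand-in for the real file) so it runs without any file on disk.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the real data file
raw = "00052600150.00942615\n00052601000.01014910\n"

# Default parsing infers a float column, so the leading zeros are gone
as_float = pd.read_csv(io.StringIO(raw), header=None, names=['col1'])

# dtype=str keeps every character, so positions 0-9 are still the timestamp
as_text = pd.read_csv(io.StringIO(raw), header=None, names=['col1'], dtype=str)

print(as_float['col1'].dtype)
print(as_text['col1'].iloc[0])
print(as_text['col1'].str[:10].tolist())
```

Converting the float column back to str afterwards cannot restore the zeros, which is why the split by position only works when the column is read as text from the start.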