Python: how to split one column into two columns

python pandas

I have the following DataFrame:

data=read_csv('enero.csv')
data

           Fecha           DirViento  MagViento  
0   2011/07/01  00:00        318        6.6      
1   2011/07/01  00:15        342        5.5        
2   2011/07/01  00:30        329        6.6        
3   2011/07/01  00:45        279        7.5        
4   2011/07/01  01:00        318        6.0        
5   2011/07/01  01:15        329        7.1        
6   2011/07/01  01:30        300        4.7        
7   2011/07/01  01:45        291        3.1

How can I split the Fecha column into two columns, so that I get a DataFrame like the following?

      Fecha     Hora     DirViento  MagViento  
0   2011/07/01  00:00        318        6.6      
1   2011/07/01  00:15        342        5.5        
2   2011/07/01  00:30        329        6.6        
3   2011/07/01  00:45        279        7.5        
4   2011/07/01  01:00        318        6.0        
5   2011/07/01  01:15        329        7.1        
6   2011/07/01  01:30        300        4.7        
7   2011/07/01  01:45        291        3.1

I am using pandas to read the data.

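I am trying to compute daily means from a monthly data set in which a value is recorded every 15 minutes. To do this, I use pandas and group by the Date and Time columns (Fecha and Hora), which gives a DataFrame like this: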

 Fecha Hora
 2011/07/01 00:00 -4.4
            00:15 -1.7
            00:30 -3.4
 2011/07/02 00:00 -4.5
            00:15 -4.2
            00:30 -7.6
 2011/07/03 00:00 -6.3
            00:15 -13.7
            00:30 -0.3

Taking the mean of the groups then gives me the following:

grouped.mean()                                                                         

Fecha     DirRes
2011/07/01 -3 
2011/07/02 -5
2011/07/03 -6

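A very similar question has already been answered; I hope it helps. In this case, you can split the contents of Fecha on whitespace and build a list from the second part of each string. Then add that list to the DataFrame as a new column.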

import pandas as pd

# read_csv already returns a DataFrame, so no extra conversion is needed
df = pd.read_csv('test2.csv')

# Split each Fecha value on whitespace: the first part is the date,
# the second part is the time. split() with no argument tolerates
# one or more spaces between the two parts.
parts = [item.split() for item in df['Fecha']]
df['Fecha'] = [part[0] for part in parts]
df['Hora'] = [part[1] for part in parts]

# Move Hora so that it comes right after Fecha.
# I am not sure whether this is efficient or not,
# as I am also quite new to pandas.
col = df.columns.tolist()
col = col[-1:] + col[:-1]
col[0], col[1] = col[1], col[0]
df = df[col]

print(df)

I hope this solves your problem. This is the output:

        Fecha   Hora  DirViento  MagViento
0  2011/07/01  00:00        318        6.6
1  2011/07/01  00:15        342        5.5
2  2011/07/01  00:30        329        6.6
3  2011/07/01  00:45        279        7.5
4  2011/07/01  01:00        318        6.0
5  2011/07/01  01:15        329        7.1
6  2011/07/01  01:30        300        4.7
7  2011/07/01  01:45        291        3.1
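The same split can also be done without a Python loop. The following is only a sketch, not the answerer's code: it assumes every Fecha value is a string of the form "date time" separated by whitespace, and it uses a small in-memory DataFrame in place of the CSV file.

```python
import pandas as pd

# Small stand-in for the CSV data (assumed layout: "date time" in Fecha)
df = pd.DataFrame({
    "Fecha": ["2011/07/01 00:00", "2011/07/01 00:15", "2011/07/01  00:30"],
    "DirViento": [318, 342, 329],
    "MagViento": [6.6, 5.5, 6.6],
})

# Split on whitespace at most once; expand=True returns a DataFrame
# whose column 0 holds the date and column 1 holds the time
parts = df["Fecha"].str.split(n=1, expand=True)

df["Fecha"] = parts[0]
# insert() places Hora directly after Fecha, so no column reshuffling is needed
df.insert(1, "Hora", parts[1])

print(df)
```

Using `DataFrame.insert` puts the new column at the wanted position in one step, which avoids rebuilding the column list by hand.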

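This doesn't answer my question.

Wouldn't it be better to have Fecha as actual datetime objects, e.g. by passing parse_dates=['Fecha'] to read_csv?

I agree with @AndyHayden: you can pass the parse_dates argument to read_csv, which reads the strings and tries to parse them as datetimes:

data = read_csv('enero.csv', parse_dates=['Fecha'])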
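To illustrate how parsed dates help with the daily means from the question, here is a minimal sketch. It assumes a comma-separated file with the three columns shown in the question; the sample rows are made up for the example and read from an in-memory string instead of enero.csv.

```python
import io
import pandas as pd

# Made-up sample in the assumed layout of enero.csv
csv_text = """Fecha,DirViento,MagViento
2011/07/01 00:00,318,6.6
2011/07/01 00:15,342,5.5
2011/07/02 00:00,300,4.7
2011/07/02 00:15,291,3.1
"""

# parse_dates asks read_csv to convert Fecha to datetime64 values
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["Fecha"])

# The time of day can still be shown as a separate text column if wanted
data["Hora"] = data["Fecha"].dt.strftime("%H:%M")

# Group by the calendar date of each timestamp and average one column
daily_mean = data.groupby(data["Fecha"].dt.date)["MagViento"].mean()

print(daily_mean)
```

With a real datetime column, the date and time parts are available through the `.dt` accessor, so the string column never has to be split just to group by day.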
