Python: how to change a DataFrame column from float to integer (pandas)


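I am trying to edit an entire column cell by cell, so that a column whose cells hold an integer followed by a string ends up holding only the integer part.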

The actual column in the data frame:

0                           11212; xxxxxxxxxx xxxxxxxx   
1                           11212; xxxxxxxxxx xxxxxxxx   
2                           11212; xxxxxxxxxx xxxxxxxx   
3                           11212; xxxxxxxxxx xxxxxxxx     
8                  667788; xxxxxxx xxxxxxxxxxxxx xxxxxx   
9                  55555; xxxxxxx xxxxxxxxxxxxx xxxxxx   
10                 55555; xxxxxxx xxxxxxxxxxxxx xxxxxx   
11                 55555; xxxxxxx xxxxxxxxxxxxx xxxxxx   
12                 33333; xxxxxxx xxxxxxxxxxxxx xxxxxx   
13                 333; xxx xxxxx @ xxx xxx 2 xxxx   
14                 9991; xxxx; xxxxxx xxxxx xxxx @ 2 xxx   
18                       1635; vvvvvvvvvvvv vvvvvv 10   
19                       1635; vvvvvvvvvvvv vvvvvv 10   
20                       1635; vvvvvvvvvvvv vvvvvv 10   
21                       1635; vvvvvvvvvvvv vvvvvv 10     
32                       1712; Cxxxx xxxxxxxx; xxx 0   
33                       1712; Cxxxx xxxxxxxx; xxx 0   
34                       1712; Cxxxx xxxxxxxx; xxx 0   
35                       1712; Cxxxx xxxxxxxx; xxx 0

This is the code I am running:

    import pandas as pd
    import re

    # import excel file from Trello
    xlsx = pd.ExcelFile("/home/deon/Documents/Work_Stuff/Trello.xls")
    # create data frame from excel file on sheet 1
    df2 = pd.read_excel(xlsx,'Sheet1')
    df3 = pd.DataFrame(data=df2)

    # delete columns not relevant to us
    df3.drop(df3.columns[[0,5,10,11]],inplace=True,axis=1)
    df3.columns= "Date*", "Due date", "Week*", "Card", "Board", "List", "S", "E 1st"

    df3[:, 6] = df3.iloc[:,6].apply(lambda x: x.split(';')[0])
    print df2.head()

    # Also tried
    digits = df3.iloc[:, 4].apply(lambda x: re.findall('\d+', str(x)))
    df3.iloc[:, 4] = digits.str.get(0).astype(int)
    print df3.head()

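You have the general idea of splitting the string, but you are running into trouble with how you reference the data frame. Something more like this: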

Code:

df['number'] = df.raw_string.apply(lambda x: int(x.split(';')[0]))

Test code:

data = [x.strip() for x in """
             11212; xxxxxxxxxx xxxxxxxx
             11212; xxxxxxxxxx xxxxxxxx
             11212; xxxxxxxxxx xxxxxxxx
             11212; xxxxxxxxxx xxxxxxxx
    667788; xxxxxxx xxxxxxxxxxxxx xxxxxx
    55555; xxxxxxx xxxxxxxxxxxxx xxxxxx
    55555; xxxxxxx xxxxxxxxxxxxx xxxxxx
""".split('\n')[1:-1]]

import pandas as pd
df = pd.DataFrame(data=data, columns=['raw_string'])

df['number'] = df.raw_string.apply(lambda x: int(x.split(';')[0]))

print(df.head())

Results:

                             raw_string  number
0            11212; xxxxxxxxxx xxxxxxxx   11212
1            11212; xxxxxxxxxx xxxxxxxx   11212
2            11212; xxxxxxxxxx xxxxxxxx   11212
3            11212; xxxxxxxxxx xxxxxxxx   11212
4  667788; xxxxxxx xxxxxxxxxxxxx xxxxxx  667788
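
For what it is worth, the same split can also be written with the pandas string accessor instead of `apply`. This is only a sketch: the three sample rows are copied from the column in the question, and the `raw_string` and `number` column names follow the answer above rather than the asker's real sheet. Splitting once with `n=1` means rows that contain more than one semicolon still keep only the leading number.

```python
import pandas as pd

# Sample rows copied from the question; the column name is the answer's, not the real sheet's
df = pd.DataFrame({"raw_string": [
    "11212; xxxxxxxxxx xxxxxxxx",
    "667788; xxxxxxx xxxxxxxxxxxxx xxxxxx",
    "9991; xxxx; xxxxxx xxxxx xxxx @ 2 xxx",
]})

# Split once on the first ';', keep the left-hand piece, then cast to int
df["number"] = df["raw_string"].str.split(";", n=1).str[0].astype(int)

print(df["number"].tolist())
```

Like the `apply` version, this assumes every cell is a string that starts with digits; it will raise if a cell is empty or missing.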

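It looks to me like the first code block would give you what you want as a string; you would just need to convert it to int. What output do you get from it?

AttributeError: 'numpy.float64' object has no attribute 'split'. And for the second example I get: ValueError: Cannot convert NA to integer

Interesting. Can you tell me a bit more about how this data frame is constructed? What type is a single entry of that column? Could you call type() on one and post the result here?

'code'(0 11212;XXXXXXXXXXXXXXXX 1 11212;XXXXXXXXXXXXXXXX 2 11212;XXXXXXXXXXXXXXXX xxxxxxxx 3 11212;XXXXXXXXXXXXXXXX xxxxxxxx 8 667788;XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 9 5555;XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX1055555;XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX)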
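
Both errors quoted in the comments are consistent with some cells in the column not being strings at all, for example empty Excel cells, which pandas reads in as float NaN. The thread never confirms this, so treat the following as a sketch under that assumption. The column name `S` is taken from the question; the sample values and the `number` column are made up for illustration. It first counts the Python types present in the column (the check the commenter asks for), then extracts the leading digits and stores them in the nullable `Int64` dtype (available in pandas 0.24 and later) so that missing rows stay missing instead of raising.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real column: two strings and one empty cell (float NaN)
df = pd.DataFrame({
    "S": pd.Series(
        ["11212; xxxxxxxxxx xxxxxxxx", np.nan, "1635; vvvvvvvvvvvv vvvvvv 10"],
        dtype=object,
    )
})

# Count which Python types actually occur in the column
kinds = df["S"].map(lambda v: type(v).__name__).value_counts()
print(kinds)

# Cast to text, then capture the leading run of digits; rows with no match become missing
leading = df["S"].astype(str).str.extract(r"^\s*(\d+)", expand=False)

# Convert to numbers, then to the nullable integer dtype so missing rows do not raise
df["number"] = pd.to_numeric(leading, errors="coerce").astype("Int64")
print(df["number"])
```

If the type count shows only `str`, the missing-value handling is unnecessary and the plain `int(x.split(';')[0])` from the answer should be enough.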