Python pandas: how to avoid applying a change to the whole column

When a non-numeric value is encountered in the DataFrame, my code assigns None to every value of the Regular Price field. I only want to assign None to the cells that hold non-numeric values.
Thanks

First, it is not possible to return NaNs together with integers, because NaN is a float by design.
If the column has mixed types (numeric values together with strings), your solution works:
self.df['Regular Price'] = self.df['Regular Price'].apply(
    lambda x: int(round(x)) if isinstance(x, (int, float)) else None
)
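But if all the data are strings, you need pd.to_numeric with errors='coerce' to convert the non-numeric values to NaNs. The mixed-type case on sample data: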
import pandas as pd

# mixed types: strings together with an int and a float
df = pd.DataFrame({
    'Regular Price': ['a', 1, 2.3, 'a', 7],
    'B': list(range(5))
})
print(df)

   B Regular Price
0  0             a
1  1             1
2  2           2.3
3  3             a
4  4             7

# only real numbers are rounded; every other cell becomes None
df['Regular Price'] = df['Regular Price'].apply(
    lambda x: int(round(x)) if isinstance(x, (int, float)) else None
)
print(df)

   B  Regular Price
0  0            NaN
1  1            1.0
2  2            2.0
3  3            NaN
4  4            7.0
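Edit:
I also need to remove the floating point values and use int only.
It is possible to convert the NaNs to None and cast to int. The starting point is an all-strings column converted with to_numeric and errors='coerce':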
df = pd.DataFrame({
    'Regular Price': ['a', '1', '2.3', 'a', '7'],
    'B': list(range(5))
})
print(df)

   B Regular Price
0  0             a
1  1             1
2  2           2.3
3  3             a
4  4             7

df['Regular Price'] = pd.to_numeric(df['Regular Price'], errors='coerce').round()
print(df)

   B  Regular Price
0  0            NaN
1  1            1.0
2  2            2.0
3  3            NaN
4  4            7.0
Operating on a column that holds None can then fail with:
TypeError: unsupported operand type(s) for -: 'int' and 'NoneType'
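The step that turns the NaNs into None is not shown in the text above, so the following is only a minimal sketch of one way to do it, not necessarily the answer's own code. The names rounded, as_object and error_message are made up for illustration. It builds an object column of Python ints and None, then subtracts two neighbouring cells, which is what a difference-style operation has to do.

```python
import pandas as pd

df = pd.DataFrame({'Regular Price': ['a', '1', '2.3', 'a', '7']})

# non-numeric strings become NaN, the rest are rounded floats
rounded = pd.to_numeric(df['Regular Price'], errors='coerce').round()

# assumption: an object column holding Python ints and None
values = [int(v) if pd.notna(v) else None for v in rounded]
as_object = pd.Series(values, index=df.index, dtype=object)

# subtracting an int cell and a None cell is plain Python arithmetic
error_message = ''
try:
    as_object.iloc[1] - as_object.iloc[0]
except TypeError as exc:
    error_message = str(exc)

print(as_object.tolist())
print(error_message)
```

The column looks like integers, but it is an object column, so numeric operations fall back to element-by-element Python arithmetic and stop at the first None.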
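Edit:
First, it is possible to remove the rows with NaNs in the Regular Price column and then convert to int. Start from the column converted with to_numeric: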
df = pd.DataFrame({
    'Regular Price': ['a', '1', '2.3', 'a', '7'],
    'B': list(range(5))
})
print(df)

   B Regular Price
0  0             a
1  1             1
2  2           2.3
3  3             a
4  4             7

df['Regular Price'] = pd.to_numeric(df['Regular Price'], errors='coerce').round()
print(df)

   B  Regular Price
0  0            NaN
1  1            1.0
2  2            2.0
3  3            NaN
4  4            7.0
Then drop the NaN rows and cast to int. Process what you need, but do not change the index:
df1 = df.dropna(subset=['Regular Price']).copy()
df1['Regular Price'] = df1['Regular Price'].astype(int)
print(df1)

   B  Regular Price
1  1              1
2  2              2
4  4              7
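# e.g. some processing step
df1['Regular Price'] = df1['Regular Price'] * 100

Finally, add the NaN rows back to the Regular Price column with combine_first, as in the complete listing at the end of this page.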
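The index has to stay unchanged because the NaN rows are put back by index alignment. The sketch below uses combine_first, as the answer's final code does. The reset_index variant is an assumption added only to illustrate what goes wrong when the labels change; the names restored and shifted are made up.

```python
import pandas as pd

df = pd.DataFrame({
    'Regular Price': ['a', '1', '2.3', 'a', '7'],
    'B': list(range(5))
})
df['Regular Price'] = pd.to_numeric(df['Regular Price'], errors='coerce').round()

# keep only the numeric rows; their index labels stay 1, 2, 4
df1 = df.dropna(subset=['Regular Price']).copy()
df1['Regular Price'] = df1['Regular Price'].astype(int) * 100

# labels match, so processed values land back in rows 1, 2 and 4
restored = df1.combine_first(df)

# labels were renumbered to 0, 1, 2, so values land in the wrong rows
shifted = df1.reset_index(drop=True).combine_first(df)

print(restored)
print(shifted)
```

In restored, rows 0 and 3 are still NaN. In shifted, row 0 receives 100 and row 4 keeps its old unprocessed value, because the alignment is by label and not by position.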
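I don't know if I understood your question, but are you looking for the applymap method? Maybe that is what you are looking for.
@Ivan Can applymap work with a complete DataFrame, or can it be applied to one specific column?
@Raheel Khan That is the difference between apply and applymap. Take a look.
OK, would it work well if converted to object? It is a bit of a hack, so test it carefully.
Thanks. No, this is the first step. But there is a problem with integers together with NaNs. It is not natively supported, so there are only hack solutions. An idea: instead of None, is it possible to use some integer like -1 or -1000? It is best to test it.
Change None to '' in np.
I have an idea. Is it possible to filter out the NaNs first, then apply the solution only to the ints, and finally add the NaNs back?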
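The difference between apply and applymap discussed in the comments can be sketched as follows. This is an illustration under assumptions: the Sale Price column and the helper to_int_or_none are hypothetical. Recent pandas versions also offer DataFrame.map for element-wise calls, so the sketch uses it when it exists and falls back to applymap otherwise.

```python
import pandas as pd


def to_int_or_none(x):
    """Round real numbers to int; return None for anything else."""
    return int(round(x)) if isinstance(x, (int, float)) else None


df = pd.DataFrame({
    'Regular Price': ['a', 1, 2.3],
    'Sale Price': [4.6, 'b', 5],   # hypothetical second column
})

# Series.apply calls the function on each value of ONE column
one_column = df['Regular Price'].apply(to_int_or_none)

# the element-wise DataFrame method calls it on every cell of the frame
elementwise = df.map if hasattr(df, 'map') else df.applymap
whole_frame = elementwise(to_int_or_none)

print(one_column)
print(whole_frame)
```

In both cases only the non-numeric cells end up missing. The other cells in the same column are left alone.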
Timings from the answer's IPython session:

In [274]: %timeit df['Regular Price3'] = df['Regular Price'].diff()
1000 loops, best of 3: 301 µs per loop

In [272]: %timeit df['Regular Price2'] = df['Regular Price1'] * 1000
100 loops, best of 3: 4.48 ms per loop

In [273]: %timeit df['Regular Price3'] = df['Regular Price'] * 1000
1000 loops, best of 3: 469 µs per loop
Putting the steps of the last edit together:

df = pd.DataFrame({
    'Regular Price': ['a', '1', '2.3', 'a', '7'],
    'B': list(range(5))
})
print(df)

   B Regular Price
0  0             a
1  1             1
2  2           2.3
3  3             a
4  4             7

df['Regular Price'] = pd.to_numeric(df['Regular Price'], errors='coerce').round()
print(df)

   B  Regular Price
0  0            NaN
1  1            1.0
2  2            2.0
3  3            NaN
4  4            7.0

df1 = df.dropna(subset=['Regular Price']).copy()
df1['Regular Price'] = df1['Regular Price'].astype(int)
print(df1)

   B  Regular Price
1  1              1
2  2              2
4  4              7

# e.g. some processing step
df1['Regular Price'] = df1['Regular Price'] * 100

# add the NaN rows back by index alignment
df2 = df1.combine_first(df)
print(df2)

     B  Regular Price
0  0.0            NaN
1  1.0          100.0
2  2.0          200.0
3  3.0            NaN
4  4.0          700.0
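The discussion above treats integers together with missing values as not natively supported. Newer pandas versions provide a nullable integer dtype, 'Int64' (capital I), which may remove the need for the drop-and-recombine workaround. The sketch below is an alternative to the answer's method, not part of it, and assumes a pandas version where this dtype is available.

```python
import pandas as pd

df = pd.DataFrame({
    'Regular Price': ['a', '1', '2.3', 'a', '7'],
    'B': list(range(5))
})

# coerce to float with NaN, round, then cast to the nullable integer dtype
df['Regular Price'] = (
    pd.to_numeric(df['Regular Price'], errors='coerce')
    .round()
    .astype('Int64')
)

# arithmetic keeps the missing cells missing and the rest as integers
scaled = df['Regular Price'] * 100

print(df)
print(scaled)
```

The cast from float only works here because the values were rounded first, so every non-missing value is a whole number.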