Python: behavior of len() with arange()
Given this DataFrame, dff:
A B
0 0 a
1 1 a
2 2 b
3 3 b
4 4 b
5 5 b
6 6 c
7 7 c
I understand why len(dff) == 8.
However, I don't understand the result of the following:
dff['counts'] = np.arange(len(dff))
which gives:
A B counts
0 0 a 0
1 1 a 1
2 2 b 2
3 3 b 3
4 4 b 4
5 5 b 5
6 6 c 6
7 7 c 7
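Shouldn't dff['counts'] be 8 in every row? What is happening under the hood?

You seem to have misunderstood what np.arange does: here the length of the df is used to set the stop parameter.
From the docs: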
numpy.arange([start, ]stop, [step, ]dtype=None)
Return evenly spaced values within a given interval.
Values are generated within the half-open interval [start, stop) (in other words, the interval including start but excluding stop). For integer arguments the function is equivalent to the Python built-in range function, but returns an ndarray rather than a list.
When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.
Parameters:
start : number, optional
Start of interval. The interval includes this value. The default start value is 0.
stop : number
End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out.
step : number, optional
Spacing between values. For any output out, this is the distance between two adjacent values, out[i+1] - out[i]. The default step size is 1. If step is specified, start must also be given.
dtype : dtype
The type of the output array. If dtype is not given, infer the data type from the other input arguments.
Returns:
arange : ndarray
Array of evenly spaced values.
For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating point overflow, this rule may result in the last element of out being greater than stop.
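To make the difference concrete, here is a minimal sketch comparing the two values. It rebuilds dff from the table in the question; the constructor call is an assumption, since the original post does not show how dff was created.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's DataFrame.
dff = pd.DataFrame({"A": range(8), "B": list("aabbbbcc")})

n = len(dff)            # a single integer: the number of rows
values = np.arange(n)   # an array of n integers: 0, 1, ..., n - 1

print(n)        # 8
print(values)   # [0 1 2 3 4 5 6 7]

# Assigning the array gives each row its own element, in order.
dff["counts"] = values
```

So len(dff) only decides how many values arange produces; it is never written into the column itself.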
If you want to set every row to the same value, you can assign the scalar directly:
In [34]:
dff['counts'] = len(dff)
dff
Out[34]:
A B counts
0 0 a 8
1 1 a 8
2 2 b 8
3 3 b 8
4 4 b 8
5 5 b 8
6 6 c 8
7 7 c 8
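Thanks for the clarification. I guess I'm not used to a dataframe and a list being combined automatically when they have the same length.
If you pass a scalar, it sets all rows to the same value; if you pass an array-like object of the same length, it is assigned row by row; if you pass a Series/DataFrame, it is aligned along the index/columns. This is expected behavior. For example, if you try
dff['counts'] = np.arange(7)
the assignment raises a ValueError, because the length does not match the number of rows.
Could you give a quick example of "if it's a Series/DataFrame, it aligns along the index/columns"?
Basically, if you run the following you will see what I mean:
dff['c'] = pd.Series(np.arange(8), index=np.arange(3, 11))
Because the other Series' index starts at 3, its values are assigned at the matching index labels, and the first 3 rows end up as NaN.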
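The assignment rules discussed in these comments can be tried side by side. This is a minimal sketch, again assuming dff is built as shown; the column names scalar, array, and bad are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's DataFrame.
dff = pd.DataFrame({"A": range(8), "B": list("aabbbbcc")})

# 1. Scalar: the same value is written to every row.
dff["scalar"] = len(dff)

# 2. Array-like of matching length: assigned by position, row by row.
dff["array"] = np.arange(len(dff))

# 3. Array-like of the wrong length: pandas refuses the assignment.
try:
    dff["bad"] = np.arange(7)
    length_error = None
except ValueError as exc:
    length_error = exc

# 4. Series: aligned on index labels, not on position.
#    The Series is labelled 3..10, dff is labelled 0..7, so only labels
#    3..7 match; rows 0..2 get NaN and labels 8..10 are dropped.
dff["c"] = pd.Series(np.arange(8), index=np.arange(3, 11))

print(dff)
```

Because column c now contains NaN, pandas stores it as floats rather than integers.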