Python: replacing selected cells by index

python pandas dataframe

I want to replace selected cells using their index.

I found that a single cell can be changed with df.loc, but I want to change a large group of data.

The data I have is

        Colour
0       R
1       R
2       R
3       P
4       P
5       P
.
.
.
1000    Y
1001    Y
1002    Y

It is too large to change cell by cell.

The output I want is

        Colour
0       Red
1       Red
2       Red
3       Pink
4       Purple
5       Purple
.
.
.
1000    Yellow
1001    Yellow
1002    Yellow

I want to use index ranges to replace the cells, because identical cells (the same colour) are contiguous: "Red" is at index [:3], "Pink" at [3:4], "Purple" at [4:1000], and "Yellow" at [1000:].

Use the DataFrame's replace method:
>>> import pandas as pd
>>> df = pd.DataFrame(["R", "R", "B", "B", "Y", "Y"])
>>> df.replace({"R": "Red", "B": "Blue", "Y": "Yellow"})
        0
0     Red
1     Red
2    Blue
3    Blue
4  Yellow
5  Yellow

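Try this approach first; if it turns out to be too slow, we can look for another solution.
Edit: build the column from (start, end) index ranges instead:
import itertools
import pandas as pd

# (colour, (start, end)) pairs; end is exclusive
idx = [("Red", (0, 3)),
       ("Pink", (3, 4)),
       ("Purple", (4, 1000)),
       ("Yellow", (1000, 1003))]  # or len(df)

# Repeat each colour once per row in its range, then flatten into one Series
clr = pd.Series(list(itertools.chain.from_iterable([c] * (e - s) for c, (s, e) in idx)))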

>>> clr
0          Red
1          Red
2          Red
3         Pink
4       Purple
         ...
998     Purple
999     Purple
1000    Yellow
1001    Yellow
1002    Yellow
Length: 1003, dtype: object
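
The answer above stops at building clr. A minimal sketch of writing it back into the frame follows; the DataFrame here is a hypothetical stand-in for the asker's data (the real frame is not shown in full), and the column name Colour is taken from the question. Assigning the underlying array rather than the Series sidesteps index alignment, which matters if the frame's index is not 0..n-1.

```python
import itertools
import pandas as pd

# Hypothetical stand-in for the asker's frame: 1003 rows of single-letter codes.
df = pd.DataFrame({"Colour": ["R"] * 3 + ["P"] * 997 + ["Y"] * 3})

# (colour, (start, end)) pairs; end is exclusive.
idx = [("Red", (0, 3)),
       ("Pink", (3, 4)),
       ("Purple", (4, 1000)),
       ("Yellow", (1000, len(df)))]

clr = pd.Series(list(itertools.chain.from_iterable([c] * (e - s) for c, (s, e) in idx)))

# The ranges must account for every row, otherwise the assignment would fail.
assert len(clr) == len(df)

# Assign by position: to_numpy() drops clr's index so no label alignment happens.
df["Colour"] = clr.to_numpy()
```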

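If you need to set values by ranges, define the ranges as a dictionary of tuples, use a list comprehension to repeat each key once for every position in its range, then assign the result to a new column:
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame({'A': np.random.randint(10, size=15)})

# label -> (start, end), end exclusive; ranges must cover every row in order
d = {"Red": (0, 3),
     "Pink": (3, 4),
     "Purple": (4, 10),
     "Yellow": (10, len(df))}

r = [k for k, (s, e) in d.items() for _ in range(s, e)]
print(r)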
['Red', 'Red', 'Red', 
 'Pink', 
 'Purple', 'Purple', 'Purple', 'Purple', 'Purple', 'Purple',
 'Yellow', 'Yellow', 'Yellow', 'Yellow', 'Yellow']

df['new'] = r
print(df)
    A     new
0   2     Red
1   2     Red
2   6     Red
3   1    Pink
4   3  Purple
5   9  Purple
6   6  Purple
7   1  Purple
8   0  Purple
9   1  Purple
10  9  Yellow
11  0  Yellow
12  0  Yellow
13  9  Yellow
14  3  Yellow
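
The list comprehension above only works if the ranges are contiguous, in order, and cover every row; otherwise the assignment fails with a length mismatch or silently mislabels rows. As a rough sketch, a small checking helper could look like this. The name labels_from_ranges is made up for illustration and is not part of pandas.

```python
import pandas as pd

def labels_from_ranges(ranges, n_rows):
    """Expand {label: (start, end)} into one label per row, checking coverage.

    Hypothetical helper: end is exclusive, and ranges must be listed in order.
    """
    expected_start = 0
    out = []
    for label, (start, end) in ranges.items():
        if start != expected_start or end <= start:
            raise ValueError(f"range for {label!r} is not contiguous: {(start, end)}")
        out.extend([label] * (end - start))
        expected_start = end
    if expected_start != n_rows:
        raise ValueError(f"ranges cover {expected_start} rows, frame has {n_rows}")
    return out

df = pd.DataFrame({"A": range(15)})
d = {"Red": (0, 3), "Pink": (3, 4), "Purple": (4, 10), "Yellow": (10, len(df))}
df["new"] = labels_from_ranges(d, len(df))
```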

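For a simpler approach: since you know the indices, you can work out how many times you expect to see each value (3 Red, 1 Pink, 996 Purple, 3 Yellow).
You can use this information to construct a categorical array very quickly:
import numpy as np
import pandas as pd

# One integer code per label, repeated by the number of rows it covers
codes = np.repeat([0, 1, 2, 3], [3, 1, 996, 3])
labels = ["Red", "Pink", "Purple", "Yellow"]

out = pd.Categorical.from_codes(categories=labels, codes=codes)
print(out)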

['Red', 'Red', 'Red', 'Pink', 'Purple', ..., 'Purple', 'Purple', 'Yellow', 'Yellow', 'Yellow']
Length: 1003
Categories (4, object): ['Red', 'Pink', 'Purple', 'Yellow']
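
The repeat counts above were worked out by hand. As a sketch, they can also be derived from the range boundaries with np.diff and the result stored straight in the frame; the bounds list and the Colour column name below are assumptions based on the ranges given in the question.

```python
import numpy as np
import pandas as pd

labels = ["Red", "Pink", "Purple", "Yellow"]
# Start position of each colour, plus the total row count as the final boundary.
bounds = [0, 3, 4, 1000, 1003]

# Differences between consecutive boundaries give the run lengths.
counts = np.diff(bounds)
codes = np.repeat(np.arange(len(labels)), counts)

df = pd.DataFrame({"Colour": pd.Categorical.from_codes(codes, categories=labels)})
```

A categorical column stores each row as a small integer code plus one shared list of labels, which is why it tends to be compact for long runs of repeated values.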

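Unfortunately, I have the same letter for different colours... In that case, how can I change them? I have edited my question.
Unless there is a way to tell the "P" that means "Pink" apart from the "P" that means "Purple", there is no way to do it.
I only know the indices. I know that "Pink" is at index 3 and "Purple" is at indices 4 to 7.
Update your question with the list/dict of indices.
@Lyliie - then create a list or array by index position and reassign it.
Just explain how to specify the values using the start and end index values.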
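
For the last point, one way to set values from start and end positions is slice assignment with iloc, which overwrites the existing column in place rather than building a new list. This is only a sketch: the frame is a hypothetical stand-in for the asker's data, and the ranges follow those given in the question (end exclusive).

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame; "P" is ambiguous (Pink or Purple).
df = pd.DataFrame({"Colour": ["R"] * 3 + ["P"] * 997 + ["Y"] * 3})

ranges = {"Red": (0, 3),
          "Pink": (3, 4),
          "Purple": (4, 1000),
          "Yellow": (1000, len(df))}

# iloc needs an integer column position, so look it up once.
col = df.columns.get_loc("Colour")

# Overwrite each positional slice with its full colour name.
for label, (start, end) in ranges.items():
    df.iloc[start:end, col] = label
```

Because iloc works on positions rather than index labels, this does not depend on the letters already in the column, so the two meanings of "P" are no obstacle.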