Python: create a matrix with a label in each cell by interval

I have bins and data that I want to use to fill an observation matrix:

a = array([0.,  14.,  29.,  43.,  58.,  72.,  86., 101., 115., 130., 144.])
b = array([10, 26, 36, 48, 64, 71, 91, 105, 123, 133, 141])
The result I expect is:

   0-13 14-28 29-42 43-57 58-71 72-85 86-100 101-114 115-129 130-144
10  1     0     0     0     0     0     0       0       0       0    
26  0     1     0     0     0     0     0       0       0       0 
36  0     0     1     0     0     0     0       0       0       0 
48  0     0     0     1     0     0     0       0       0       0 
64  0     0     0     0     1     0     0       0       0       0 
71  0     0     0     0     1     0     0       0       0       0 
91  0     0     0     0     0     0     1       0       0       0 

cut + get_dummies

Here is one way:

import numpy as np
import pandas as pd

a = np.array([0.,  14.,  29.,  43.,  58.,  72.,  86., 101., 115., 130., 144.])
b = np.array([10, 26, 36, 48, 64, 71, 91, 105, 123, 133, 141])

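df = pd.DataFrame({'Values': b})

# Assign each value to its interval; a supplies the bin edges
df['Range'] = pd.cut(df['Values'], a)

# One 0/1 column per interval (dtype=int because pandas 2.0+ defaults to True/False)
dummies = pd.get_dummies(df['Range'], dtype=int)

# Put the indicator columns next to the original data
res = pd.concat([df, dummies], axis=1)

print(res)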
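Explanation

  • pd.cut: if no labels are provided, default labels derived from the bin ranges are used.
  • pd.get_dummies expands a series into "one-hot encoded" format.
  • pd.concat lets you join the original dataframe to the output of get_dummies.
  • Optionally, you can set Values as the index via res = res.set_index('Values').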
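
To make the first bullet concrete, here is a minimal sketch of which interval pd.cut assigns to values that sit exactly on a bin edge. The sample series is made up for illustration and is not data from the question. By default the intervals are closed on the right, so 14 lands in (0, 14] and 0 gets no bin at all. Passing right=False closes the intervals on the left instead, which is closer to labels such as 0-13 and 14-28, at the cost of leaving the top edge 144 without a bin.

```python
import numpy as np
import pandas as pd

a = np.array([0., 14., 29., 43., 58., 72., 86., 101., 115., 130., 144.])

# Hypothetical edge-case values, chosen to sit on the bin boundaries
sample = pd.Series([0, 10, 14, 144])

# Default: right-closed intervals (0, 14], (14, 29], ...
default_bins = pd.cut(sample, a)

# right=False: left-closed intervals [0, 14), [14, 29), ...
left_closed = pd.cut(sample, a, right=False)

print(pd.DataFrame({'value': sample,
                    'default': default_bins,
                    'right=False': left_closed}))
```

If the data can contain values equal to the first or last edge, it is worth deciding explicitly which side should be closed before relying on the labels.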
Result

print(res)

    Values       Range  (0, 14]  (14, 29]  (29, 43]  (43, 58]  (58, 72]  \
0       10     (0, 14]        1         0         0         0         0   
1       26    (14, 29]        0         1         0         0         0   
2       36    (29, 43]        0         0         1         0         0   
3       48    (43, 58]        0         0         0         1         0   
4       64    (58, 72]        0         0         0         0         1   
5       71    (58, 72]        0         0         0         0         1   
6       91   (86, 101]        0         0         0         0         0   
7      105  (101, 115]        0         0         0         0         0   
8      123  (115, 130]        0         0         0         0         0   
9      133  (130, 144]        0         0         0         0         0   
10     141  (130, 144]        0         0         0         0         0   

    (72, 86]  (86, 101]  (101, 115]  (115, 130]  (130, 144]  
0          0          0           0           0           0  
1          0          0           0           0           0  
2          0          0           0           0           0  
3          0          0           0           0           0  
4          0          0           0           0           0  
5          0          0           0           0           0  
6          0          1           0           0           0  
7          0          0           1           0           0  
8          0          0           0           1           0  
9          0          0           0           0           1  
10         0          0           0           0           1  

Use cut together with get_dummies, then set the index from the b array as the last step:

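# Build labels such as '0-13' from consecutive bin edges
labels = ['{}-{}'.format(i, j - 1) for i, j in zip(a[:-1].astype(int), a[1:].astype(int))]
# dtype=int keeps 0/1 output on pandas 2.0+, where the default is True/False
d = pd.get_dummies(pd.cut(b, a, labels=labels), dtype=int).set_index(b)
print(d)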
     0-13  14-28  29-42  43-57  58-71  72-85  86-100  101-114  115-129  \
10      1      0      0      0      0      0       0        0        0   
26      0      1      0      0      0      0       0        0        0   
36      0      0      1      0      0      0       0        0        0   
48      0      0      0      1      0      0       0        0        0   
64      0      0      0      0      1      0       0        0        0   
71      0      0      0      0      1      0       0        0        0   
91      0      0      0      0      0      0       1        0        0   
105     0      0      0      0      0      0       0        1        0   
123     0      0      0      0      0      0       0        0        1   
133     0      0      0      0      0      0       0        0        0   
141     0      0      0      0      0      0       0        0        0   

     130-143  
10         0  
26         0  
36         0  
48         0  
64         0  
71         0  
91         0  
105        0  
123        0  
133        1  
141        1  

If you want the last label to end at 144, here is a solution:

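a1 = a[:-1].astype(int)
a2 = a[1:].astype(int)
# Bump the last upper edge so that j - 1 yields 144 instead of 143
a2[-1] += 1
labels = ['{}-{}'.format(i, j - 1) for i, j in zip(a1, a2)]
# dtype=int keeps 0/1 output on pandas 2.0+, where the default is True/False
d = pd.get_dummies(pd.cut(b, a, labels=labels), dtype=int).set_index(b)
print(d)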
     0-13  14-28  29-42  43-57  58-71  72-85  86-100  101-114  115-129  \
10      1      0      0      0      0      0       0        0        0   
26      0      1      0      0      0      0       0        0        0   
36      0      0      1      0      0      0       0        0        0   
48      0      0      0      1      0      0       0        0        0   
64      0      0      0      0      1      0       0        0        0   
71      0      0      0      0      1      0       0        0        0   
91      0      0      0      0      0      0       1        0        0   
105     0      0      0      0      0      0       0        1        0   
123     0      0      0      0      0      0       0        0        1   
133     0      0      0      0      0      0       0        0        0   
141     0      0      0      0      0      0       0        0        0   

     130-144  
10         0  
26         0  
36         0  
48         0  
64         0  
71         0  
91         0  
105        0  
123        0  
133        1  
141        1  
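
Since the question is also tagged numpy, here is a hedged sketch of the same one-hot matrix built without pandas. This is a different technique from the answers above: it uses np.searchsorted to find each value's bin and then fills a zero matrix. It assumes left-closed bins [a[i], a[i+1]), with values at or beyond the outer edges clipped into the first or last bin. The names idx and onehot are illustrative only.

```python
import numpy as np

a = np.array([0., 14., 29., 43., 58., 72., 86., 101., 115., 130., 144.])
b = np.array([10, 26, 36, 48, 64, 71, 91, 105, 123, 133, 141])

# Number of edges <= value, minus one, gives the index of the left-closed bin
idx = np.searchsorted(a, b, side='right') - 1

# Assumption: values on or past the outer edges are clipped into the end bins
idx = np.clip(idx, 0, len(a) - 2)

# One row per value in b, one column per bin, with a single 1 in each row
onehot = np.zeros((len(b), len(a) - 1), dtype=int)
onehot[np.arange(len(b)), idx] = 1

print(onehot)
```

The resulting array can be wrapped in pd.DataFrame(onehot, index=b, columns=labels) if labelled rows and columns are needed.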

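I get two errors: ValueError: could not convert string to float: 'Range', twice.
@VasyaPravdin, I suggest you copy and paste the code above into a new Python session without any changes. If it works, then the problem lies in how you adapted the code to your application.
@VasyaPravdin - What about my solution? Same error?
@VasyaPravdin, please see the update: pd.concat should work for you. I think it was a version issue.
Checked. @jpp, the updated version works! Thanks for the explanation.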