Python: reshaping a DataFrame with a new row every 76 entries


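I am new to Python and Pandas and am working with the heart disease dataset from UCI.

There are 76 attributes per person and 303 people, so I want to end up with one row per person and 76 columns. I am having trouble arranging this into a DataFrame, because the data appears to come in groups of 9 lines.

I have tried importing the dataset into a pandas DataFrame using either a space or a newline as the separator, but I still cannot stop the data from being split after every 8 values: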

df = pd.read_table('https://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/hungarian.data', sep=' ')
df then looks like the following table:

    1254    0   40  1   1.1 0.1 0.2
-9.0    2   140.0   0.0 289 -9.0    -9.0    -9.0
0.0 -9  -9.0    0.0 12  16.0    84.0    0.0
0.0 0   0.0 0.0 150 18.0    -9.0    7.0
172.0   86  200.0   110.0   140 86.0    0.0 0.0
0.0 -9  26.0    20.0    -9  -9.0    -9.0    -9.0
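If you have any suggestions on how to split these values and start a new row after the 76th value, I would be grateful. Every 76th value is the string "name", which marks the end of one person's data. Thanks, everyone!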

Since it is easier to preprocess the data than to repair a "wrongly built" DF:
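A minimal sketch of that kind of preprocessing, assuming the file is whitespace-separated and every record ends with the token "name" as the question states. The helper names align_records and records_to_frame are made up for this illustration, and a tiny inline sample stands in for the downloaded file:

```python
import io
import re

import pandas as pd


def align_records(raw_text: str, terminator: str = "name") -> str:
    """Return the text with one record per line (hypothetical helper).

    A record is assumed to end at the `terminator` token.
    """
    # Collapse every run of whitespace, including the original line
    # breaks, into a single space.
    flat = " ".join(raw_text.split())
    # Start a new line after each terminator token.
    pattern = rf"\b{re.escape(terminator)}\b ?"
    return re.sub(pattern, lambda m: terminator + "\n", flat)


def records_to_frame(raw_text: str, terminator: str = "name") -> pd.DataFrame:
    """Parse the aligned text; pandas infers the column types."""
    aligned = align_records(raw_text, terminator)
    return pd.read_csv(io.StringIO(aligned), sep=r"\s+", header=None)


# Small stand-in for the real file: two records of 6 values each,
# broken across lines the way the dataset breaks its records.
sample = "1 0 40\n-9 2 name\n2 1 55\n-9 3 name\n"
df = records_to_frame(sample)
print(df)
```

For the real dataset, the text downloaded from the URL in the question would be passed in place of sample. If each record really holds 76 values, the resulting frame should have 76 columns with "name" in the last one.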

Output:

In [20]: df
Out[20]:
       0   1   2   3   4   5   6   7   8    9   ...   66  67  68  69  70  71  72    73   74    75
0    1254   0  40   1   1   0   0  -9   2  140  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
1    1255   0  49   0   1   0   0  -9   3  160  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
2    1256   0  37   1   1   0   0  -9   2  130  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
3    1257   0  48   0   1   1   1  -9   4  138  ...    2  -9   1   1   1   1   1  -9.0 -9.0  name
4    1258   0  54   1   1   0   1  -9   3  150  ...    1  -9   1   1   1   1   1  -9.0 -9.0  name
5    1259   0  39   1   1   0   1  -9   3  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
6    1260   0  45   0   0   1   0  -9   2  130  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
7    1261   0  54   1   1   0   0  -9   2  110  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
8    1262   0  37   1   1   1   1  -9   4  140  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
9    1263   0  48   0   1   0   0  -9   2  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
10   1264   0  37   0   1   0   1  -9   3  130  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
11   1265   0  58   1   1   0   0  -9   2  136  ...   -9   2   1   1   1   7   1  -9.0 -9.0  name
12   1266   0  39   1   1   0   0  -9   2  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
13   1267   0  49   1   1   1   1  -9   4  140  ...    2  -9   1   1   1   1   1  -9.0 -9.0  name
14   1268   0  42   0   1   0   1  -9   3  115  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
15   1269   0  54   0   1   1   0  -9   2  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
16   1270   0  38   1   1   1   1  -9   4  110  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
17   1271   0  43   0   1   0   0  -9   2  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
18   1272   0  60   1   1   1   1  -9   4  100  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
19   1273   0  36   1   1   0   0  -9   2  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
20   1274   0  43   0   0   0   0  -9   1  100  ...   -9  -9   1   1   1   1   2  -9.0 -9.0  name
21   1275   0  44   1   1   0   0  -9   2  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
22   1276   0  49   0   1   0   0  -9   2  124  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
23   1277   0  44   1   1   0   0  -9   2  150  ...    2  -9   1   1   1   1   1  67.0 -9.0  name
24   1278   0  40   1   1   0   1  -9   3  130  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
..    ...  ..  ..  ..  ..  ..  ..  ..  ..  ...  ...   ..  ..  ..  ..  ..  ..  ..   ...  ...   ...
269  1032   0  54   1   1   1   0  -9   4  130  ...   -9   2   1   1   1   7   1  66.0 -9.0  name
270  1033   0  47   0   1   0   0  -9   3  130  ...   -9  -9   1   1   1   1   1  68.0 -9.0  name
271  1034   0  45   1   1   1   1  -9   4  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
272  1035   0  32   0   1   0   0  -9   2  105  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
273  1036   0  55   1   1   1   1  -9   4  140  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
274  1037   0  55   1   1   0   0  -9   3  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
275  1038   0  45   0   0   0   0  -9   2  180  ...   -9  -9   1   1   1   1   1  70.0 -9.0  name
276  1039   0  59   1   1   0   1  -9   3  180  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
277  1041   0  51   1   1   0   0  -9   3  135  ...    2  -9   1   1   3   8   2  -9.0 -9.0  name
278  1042   0  52   1   1   1   1  -9   4  170  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
279  1043   0  57   0   1   1   1  -9   4  180  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
280  1044   0  54   0   1   0   0  -9   2  130  ...   -9  -9   1   1   1   1   3  -9.0 -9.0  name
281  1045   0  60   1   1   0   0  -9   3  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
282  1046   0  49   1   1   1   1  -9   4  150  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
283  1047   0  51   0   1   0   1  -9   3  130  ...   -9  -9   1   1   1   1   1  61.0 -9.0  name
284  1048   0  55   0   0   0   0  -9   2  110  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
285  1049   0  42   1   1   1   1  -9   4  140  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
286  1050   0  51   0   1   0   1  -9   3  110  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
287  1051   0  59   1   1   1   1  -9   4  140  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
288  1052   0  53   1   1   0   0  -9   2  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
289  1053   0  48   0   0   0   0  -9   2   -9  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
290  1054   0  36   1   1   0   0  -9   2  120  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
291  5001   0  48   1   0   0   0  -9   3  110  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
292  5000   0  47   0   0   0   0  -9   2  140  ...   -9  -9   1   1   1   1   1  -9.0 -9.0  name
293  5002   0  53   1   1   1   1  -9   4  130  ...    1   1   1   1   1   1   1  -9.0 -9.0  name

[294 rows x 76 columns]

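This is doable but painful DataFrame wrangling. Since the input file is not that large, I would process the input string instead, replacing \n and name so that the rows line up and can be fed to read_table.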
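A related sketch that swaps the string replacement for token slicing, assuming whitespace-separated tokens and a fixed number of values per record. It splits the whole input into tokens, cuts the list into rows of equal width, and checks that each row ends with "name" so that a misaligned file is caught early. The function name chunk_tokens and its parameters are hypothetical, not part of pandas:

```python
import pandas as pd


def chunk_tokens(raw_text: str, width: int = 76,
                 terminator: str = "name") -> pd.DataFrame:
    """Cut whitespace-separated tokens into rows of `width` values."""
    tokens = raw_text.split()
    if len(tokens) % width != 0:
        raise ValueError(
            f"{len(tokens)} tokens is not a multiple of {width}")

    rows = [tokens[i:i + width] for i in range(0, len(tokens), width)]

    # Every record is expected to end with the terminator token.
    bad = [i for i, row in enumerate(rows) if row[-1] != terminator]
    if bad:
        raise ValueError(f"records not ending in {terminator!r}: {bad}")

    df = pd.DataFrame(rows)
    # All tokens are still strings; convert everything except the
    # terminator column to numbers.
    numeric = df.iloc[:, :-1].apply(pd.to_numeric)
    return pd.concat([numeric, df.iloc[:, -1]], axis=1)


# Tiny stand-in input with 3 values per record instead of 76.
df = chunk_tokens("1 2 name\n3\n4 name", width=3)
print(df)
```

The explicit checks are the main reason to prefer this variant: if a record is short or long, the function fails loudly rather than producing shifted columns.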