Python: How do I convert the string representation of a 2D array in a pandas Series into a NumPy array?

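I have a DataFrame named train with two columns, feature and class_label. Each value in the train['feature'] Series is the string representation of a 2D array. For example,
print(train['feature'][2])
gives the following: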

[-2.1579301e+02  7.1666122e+01 -1.3181377e+02 -5.2091331e+01
 -2.2115967e+01 -2.1764179e+01 -1.1183747e+01  1.8912683e+01
  6.7266378e+00  1.4556893e+01 -1.1782046e+01  2.3010368e+00
 -1.7251303e+01  1.0052422e+01 -6.0094991e+00 -1.3153193e+00
 -1.7693510e+01  1.1171223e+00 -4.3699460e+00  7.2629538e+00
 -1.1815971e+01 -7.4952617e+00  5.4577127e+00 -2.9442446e+00
 -5.8693886e+00 -9.8653756e-02 -3.2121708e+00  4.6092505e+00
 -5.8293266e+00 -5.3475084e+00  1.3341197e+00  7.1307821e+00
 -7.9449967e-02  1.7109249e+00 -5.6942000e+00 -2.9041717e+00
  3.0366950e+00 -1.6827592e+00 -8.8585818e-01  3.5438862e-01]
and print(type(train['feature'][2])) reports that the value is a string.

How can I convert it into a NumPy array with the corresponding dimensions, i.e. 2D?

You can use np.fromstring.

  • Examples

    >>> np.fromstring('1 2', dtype=int, sep=' ')
    array([1, 2])

    >>> np.fromstring('1, 2', dtype=int, sep=',')
    array([1, 2])
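
To connect these examples to the question's data, here is a minimal sketch of np.fromstring on a bracketed, multi-line string. The variable names text and values are made up for illustration, and the string is a shortened copy of the first two rows printed in the question. It relies on a whitespace separator also matching runs of spaces and newlines.

```python
import numpy as np

# Shortened sample in the same layout as the question's string (assumption:
# the real values look like this, with brackets only at the two ends).
text = (
    "[-2.1579301e+02  7.1666122e+01 -1.3181377e+02 -5.2091331e+01\n"
    " -2.2115967e+01 -2.1764179e+01 -1.1183747e+01  1.8912683e+01]"
)

# Drop surrounding whitespace and the outer brackets, then parse as text.
# sep=' ' selects text mode; repeated spaces and newlines count as separators.
values = np.fromstring(text.strip().strip("[]"), dtype=float, sep=" ")

print(values.shape)
print(values)
```

The result is one-dimensional; any 2D shape has to be imposed afterwards with reshape.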


The string representation has square brackets at the first and last positions. Strip them and you get a string that np.fromstring can parse. Pass sep=' ' so that the text is parsed as whitespace-separated numbers; without sep, np.fromstring uses its deprecated binary mode, which reinterprets the raw bytes instead of parsing the text and so does not recover the numbers.

arr = np.fromstring(train['feature'][2].strip().strip('[]'), dtype=float, sep=' ')

For the string shown in the question, this gives a 1-D array holding the 40 values, with shape (40,).

You can then reshape it to get a 2D NumPy array, for example with four values per row to match the printed layout:

arr.reshape(10, 4)
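
The same idea can be applied to the whole column rather than to one row. The following is only a sketch under stated assumptions: the tiny train DataFrame is a made-up stand-in for the real one, parse_feature is a hypothetical helper name, and every string is assumed to contain the same number of values. It swaps np.fromstring for str.split plus np.array, which also tolerates nested brackets such as "[[1 2] [3 4]]".

```python
import numpy as np
import pandas as pd


def parse_feature(text: str) -> np.ndarray:
    """Parse a bracketed, whitespace-separated string into a 1-D float array."""
    # Replace every bracket with a space, split on any whitespace, convert to float.
    tokens = text.replace("[", " ").replace("]", " ").split()
    return np.array(tokens, dtype=float)


# Hypothetical stand-in for the "train" DataFrame described in the question.
train = pd.DataFrame(
    {
        "feature": ["[1.0 2.0\n 3.0 4.0]", "[5.0 6.0\n 7.0 8.0]"],
        "class_label": ["a", "b"],
    }
)

# Series of 1-D arrays, one per row of the DataFrame.
parsed = train["feature"].apply(parse_feature)

# Stack into a single 2D array with one row per sample
# (only valid if all rows have equal length).
features = np.stack(parsed.tolist())
print(features.shape)
```

If each row should itself be 2D, call reshape on the individual arrays inside parse_feature before stacking.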
