Python: trimming part of each value in a numpy array

python python-3.x numpy

I only need the first 10 characters of each value in an array.

Here is the array:

array(['2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
   '2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
   '2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
   '2018-06-30T00:00:00.000000000', '2018-09-30T00:00:00.000000000']

I want to write code that gives me:

array(['2018-06-30','2018-06-30'   .... etc

Update: my code is:

x = np.array(df4['per_end_date'])
x

The output is:

array(['2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
   '2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
   '2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
   '2018-06-30T00:00:00.000000000', '2018-09-30T00:00:00.000000000',
   '2018-09-30T00:00:00.000000000', '2018-09-30T00:00:00.000000000', etc

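I only want the first 10 characters of each value in the array. The following code gives me the error "IndexError: invalid index to scalar variable":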

x = np.array([y[:9] for y in x])

This is a very basic task in Python using a list comprehension:

import numpy
x = numpy.array(['2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
           '2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
           '2018-06-30T00:00:00.000000000', '2018-06-30T00:00:00.000000000',
           '2018-06-30T00:00:00.000000000', '2018-09-30T00:00:00.000000000'])
numpy.array([y[:10] for y in x])
# array(['2018-06-30', '2018-06-30', '2018-06-30', '2018-06-30',
#        '2018-06-30', '2018-06-30', '2018-06-30', '2018-09-30'],
#       dtype='<U10')

For more information, you should read the documentation on list comprehensions and slicing.

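Although numpy is not always the best tool for manipulating strings, you can vectorize this operation, and as always, vectorized functions should be preferred over iteration.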
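As a minimal sketch of one vectorized option, assuming the array really does hold strings: casting to a fixed-width unicode dtype of width 10 truncates every element to its first 10 characters. The sample values are copied from the question, and the names `arr` and `trimmed` are illustrative only.

```python
import numpy as np

# Illustrative string array; values copied from the question.
arr = np.array(['2018-06-30T00:00:00.000000000',
                '2018-09-30T00:00:00.000000000'])

# Casting to a 10-character unicode dtype keeps only the first 10 characters.
trimmed = arr.astype('U10')
print(trimmed)
```

This only applies to string arrays. It does not help when the elements are `datetime64` values, which turns out to be the case for the asker's data.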

Setup

OK, I figured it out.

df4['per_end_date'].dtype

Output:

dtype('<M8[ns]')

Output of np.array(df4['per_end_date'], dtype='datetime64[D]'):

array(['2018-06-30', '2018-06-30', '2018-06-30', '2018-06-30',
   '2018-06-30', '2018-06-30', '2018-06-30', '2018-09-30',
   '2018-09-30', '2018-09-30', '2018-09-30', '2018-09-30',
   '2018-09-30', '2018-09-30', '2018-09-30', '2018-09-30', etc
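The dtype above also explains the original error: the elements are `numpy.datetime64` scalars, not strings, so slicing one of them fails. The sketch below reproduces that on a small stand-in array (the name `x` and the two sample values are taken from the question; the other names are made up for illustration) and shows two ways to keep only the date part.

```python
import numpy as np

# Illustrative stand-in for np.array(df4['per_end_date']).
x = np.array(['2018-06-30T00:00:00', '2018-09-30T00:00:00'],
             dtype='datetime64[ns]')

# Each element is a numpy.datetime64 scalar, so slicing it raises IndexError.
try:
    x[0][:10]
    slicing_failed = False
except IndexError:
    slicing_failed = True

# Option 1: stay in datetime land and reduce the precision to whole days.
days = x.astype('datetime64[D]')

# Option 2: convert to plain strings at day precision.
as_text = np.datetime_as_string(x, unit='D')

print(slicing_failed, days, as_text)
```

Which option fits depends on whether the dates are needed as datetimes for further arithmetic or as text for display.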

It's great when you can figure it out yourself. :)

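You don't have to write the 0 in the slice.

I removed the "0" from "0:9" but got the same error.

Can you show the data that produces the error? It works fine for the example you provided.

@user3672037 Please don't edit my answer to reply; comment or edit your question instead…

There must be other elements in this array, for example empty strings. Please check.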

arr = np.repeat(arr, 10000)

%timeit np.array([y[:10] for y in arr])
48.6 ms ± 961 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
np.frombuffer(
    # tobytes() replaces the deprecated tostring()
    arr.view((str, 1)).reshape(arr.shape[0], -1)[:, :10].tobytes(),
    dtype=(str, 10)
)

6.87 ms ± 311 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit np.array(arr,dtype= 'datetime64[D]')
44.9 ms ± 2.93 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
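The character-view approach timed above can be unpacked step by step. This is only a sketch under the assumption of a contiguous array of equal-length unicode strings; the intermediate names are made up here, and it rebuilds the result with a second `view` instead of `np.frombuffer`.

```python
import numpy as np

# Illustrative array of 29-character strings, as in the question.
arr = np.array(['2018-06-30T00:00:00.000000000',
                '2018-09-30T00:00:00.000000000'])

# View every string as single characters: one row per string.
chars = arr.view('U1').reshape(arr.shape[0], -1)

# Keep the first 10 character columns; copy so the memory is contiguous.
first10 = np.ascontiguousarray(chars[:, :10])

# Reinterpret each row of 10 characters as one 10-character string.
trimmed = first10.view('U10').ravel()
print(chars.shape, trimmed)
```

No Python-level loop touches the individual strings, which is where the speedup over the list comprehension comes from.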

x = np.array(df4['per_end_date'], dtype='datetime64[D]')
x