Python pandas.date_range: exact frequency parameter


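I am trying to generate a pandas.DatetimeIndex with a sampling frequency of 5120 Hz. This gives a period of increment = 0.0001953125 seconds.

If you try to use pandas.date_range(), you need to specify the frequency (parameter freq) as a str or as a pandas.DateOffset. The first can only handle a precision of 1 ns; the latter performs worse than str and gives an even larger error.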

When using a string, I construct it like this:

freq=str(int(increment*1e9))+'N'

It processes a 270 MB file in under 2 seconds, but after about 3 million records the resulting DatetimeIndex is off by about 1500 µs.
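The size of that drift can be checked with a little arithmetic. The sketch below assumes the error comes purely from int() truncating the 195312.5 ns period to a whole number of nanoseconds; the variable names are illustrative and not part of the original code.

```python
# Period of a 5120 Hz signal in seconds (value taken from the question).
increment = 0.0001953125

# The string frequency must be a whole number of nanoseconds,
# so int() drops the fractional half nanosecond.
step_ns = int(increment * 1e9)

# Time lost per sample, in nanoseconds (about 0.5).
lost_per_sample_ns = increment * 1e9 - step_ns

# Accumulated drift after 3 million samples, in microseconds.
n_samples = 3_000_000
drift_us = lost_per_sample_ns * n_samples / 1e3

print(step_ns, lost_per_sample_ns, drift_us)
```

Half a nanosecond lost on each of 3 million samples adds up to roughly 1500 µs, which matches the error reported above.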

When using pandas.DateOffset, like this:

freq=pd.DateOffset(seconds=increment)

it parses the file in 1 minute 14 seconds, but with an error of about 1 second.

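I also tried building the index with

starttime + pd.to_timedelta(cumulativeTimes, unit='s')

This sum also takes a very long time to complete, but it is the only approach that leaves no error in the resulting DatetimeIndex.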

How can I generate the DatetimeIndex efficiently while keeping it accurate?

I solved this with a pure numpy implementation:

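import numpy as np

def unit_correction(u):
    # Number of units of `u` in one second.
    # Compare strings with ==, not `is` (identity is not guaranteed).
    if u == 's':
        return 1e0
    elif u == 'ms':
        return 1e3
    elif u == 'us':
        return 1e6
    elif u == 'ns':
        return 1e9
    raise ValueError("unsupported unit: " + u)

# The function name and signature are assumed; the original showed only the body.
def make_datetimes(starttime, offset, increment, periods, accuracy='ns'):
    # Time of each sample relative to starttime, in seconds (floats).
    relativeTime = np.linspace(
            offset,
            offset + (periods - 1) * increment,
            periods)

    # Because numpy only knows ints as its date datatype,
    # convert to accuracy.
    return (np.datetime64(starttime)
            + (relativeTime*unit_correction(accuracy)).astype(
                "timedelta64["+accuracy+"]"
                )
            )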

(For those interested, this is the GitHub pull request:)
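A variation on the same idea is to compute the nanosecond offsets in integer arithmetic, so that no floating-point error enters at all. This is only a sketch under assumptions: the function name exact_index and the parameter sample_rate_hz are hypothetical, the sample rate is assumed to be a whole number of hertz, and each timestamp is rounded to the nearest nanosecond.

```python
import numpy as np
import pandas as pd

def exact_index(start, sample_rate_hz, periods):
    """Build a DatetimeIndex whose n-th entry is start + n / sample_rate_hz,
    rounded to the nearest nanosecond (hypothetical helper)."""
    # Sample numbers as int64, so no rounding error accumulates.
    n = np.arange(periods, dtype=np.int64)
    # Offset of sample n in ns, computed as round(n * 1e9 / rate) with
    # integer floor division. n * 1e9 stays inside int64 for up to
    # roughly 9.2e9 samples.
    offsets_ns = (n * 1_000_000_000 + sample_rate_hz // 2) // sample_rate_hz
    # Add the offsets to the start time at nanosecond resolution.
    stamps = np.datetime64(start, "ns") + offsets_ns.astype("timedelta64[ns]")
    return pd.DatetimeIndex(stamps)

idx = exact_index("2020-01-01", 5120, 5121)
print(idx[0], idx[-1])
```

Because every offset is derived from the sample number rather than from a running sum, the rounding error of each timestamp stays below 1 ns however long the index is.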


I think I get a similar result with the following function (although it only uses nanosecond precision):

If there seems to be a bug in pandas, could you open an issue about it?

@joris I thought there was a bug in pandas, but I no longer think so. It is just that pandas has a precision of 1 ns, and everything else is due to rounding error.