Python 3.x: openpyxl error after converting a script that uses pandas and numpy to Python 3.x

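About a year ago I wrote a script that takes a single column of datetime values and runs a sliding window over the series to find the largest "lump" of values for an adjustable time dimension. For example, given a million datetime values, what is the largest number of entries that fall within 1 second, 1 minute, or 1 hour of each other?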

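The problem is that one of my machines died and I lost some documentation, in particular the versions of the packages I was using. I think I have updated the code to run under 3.x, but the error I get now seems to indicate that pandas no longer supports the package I am trying to use. I have tried installing a few random versions, updating pip, and so on, without much luck.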


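The exact error is "UserWarning: Installed openpyxl is not supported at this time. Use >=1.61 and ..."

I am not sure why you are seeing openpyxl-related errors, but if you are, you should update your pandas version. openpyxl went through some breaking changes that affected exporting from pandas to Excel, but those have since been resolved.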
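Since the package versions were lost, it may help to confirm what is actually installed before and after upgrading. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration and is not part of pandas or openpyxl.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the packages the script depends on.
for name in ("pandas", "numpy", "openpyxl"):
    print(name, installed_version(name))
```

If pandas turns out to be old, upgrading it (for example with `pip install --upgrade pandas openpyxl`) is the first thing to try.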

import numpy as np
import pandas as pd

# Your original code was correct here. I assumed there will be a data column along with the timestamps.
df = pd.read_csv("ET.txt", parse_dates=["dt"])

# Construct a univariate `timeseries` instead of a single column dataframe as output by `read_csv`.
# You can think of a dataframe as a matrix with labelled columns and rows. A timeseries is more like
# an associative array, or labelled vector. Since we don't need a labelled column, we can use a simpler
# representation.
data = pd.Series(0, df.dt)  
print(data)
window_size = 1
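# Count the events in each one-second bucket. The `how=` argument was removed from `resample`;
# current pandas uses method chaining instead. Empty buckets already count as 0.
buckets_sec = data.resample("1s").count()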

# A rolling window is labelled with the timestamp of the end of the period, not the beginning. A window
# of `window_size` samples ending at position i starts at position i - (window_size - 1), so we shift back
# by window_size - 1 samples. I assume you want to know when the most active period started, not when it
# ended. Finally, `dropna` removes the NaN entries produced during the warmup period of the sliding window
# (the first window_size - 1 observations). `pd.rolling_apply` was removed from pandas; `.rolling()` replaces it.
rolling_count = buckets_sec.rolling(window=window_size).sum().shift(-(window_size - 1)).dropna()
print(rolling_count.describe())

# Some interesting data massaging
# E.g. see how the maximum hit count over the specified sliding window evolves on an hourly
# basis:
hourly_max_hits = rolling_count.resample("h").max().dropna()

# Plot the frequency of various hit counts (requires matplotlib). This gives you an idea how
# frequently various hit counts occur.
hourly_max_hits.hist()

# Same on a daily basis
daily_max_hits = rolling_count.resample("D").max().dropna()
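To try the approach without `ET.txt`, here is a small self-contained sketch of the same bucket-then-roll idea on made-up timestamps. The function name `busiest_window` and the sample data are assumptions for illustration only; the pandas calls (`resample`, `rolling`, `shift`, `idxmax`) are the ones available in current pandas releases.

```python
import pandas as pd


def busiest_window(timestamps, window_size=1, freq="s"):
    """Return (start_time, count) for the busiest run of `window_size` buckets of width `freq`."""
    # One event per timestamp; duplicate timestamps are allowed.
    events = pd.Series(1, index=pd.DatetimeIndex(timestamps)).sort_index()
    # Count events per bucket; empty buckets become 0.
    buckets = events.resample(freq).sum()
    # Sum over the sliding window, then relabel each window by its start time.
    rolling = buckets.rolling(window=window_size).sum().shift(-(window_size - 1)).dropna()
    start = rolling.idxmax()
    return start, int(rolling.loc[start])


# Hypothetical sample data: five events, three of them within two consecutive seconds.
stamps = pd.to_datetime([
    "2020-01-01 00:00:00", "2020-01-01 00:00:01", "2020-01-01 00:00:01",
    "2020-01-01 00:00:02", "2020-01-01 00:00:10",
])
print(busiest_window(stamps, window_size=2))
```

When several windows tie for the maximum, `idxmax` reports the earliest one.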