Python: I have monthly data, but it is split by month and then by all the years for that month. How do I get back a chronologically ordered dataset?


python arrays numpy


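I have a 4D dataset in which the first axis is technically the month, the second axis holds all the years for that month, and the remaining axes are spatial coordinates. So January (index 0) has 27 values, one for each of the 27 Januaries.

My goal is to rebuild the time series so that it runs January of year 1, February of year 1, ..., January of year 2, February of year 2, and so on: just a normal monthly time series. However, my function for reordering the dataset is wrong and does not order by both month and year correctly.

Can anyone see what is wrong with my function? My new merged_months array has the right shape, but it does not rearrange the array data correctly: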

import numpy as np

# 4D data set with 12 months and 27 years (so each month [Jan - Dec] has 27 points):
data = np.random.rand(12,27,281,375)


#function to try and get data back into chronological order (1st month - year 1, 2nd month - year 1...)
def merge_months(split_data):
    merged_months = []
    for month in range(split_data.shape[0]):
        for year in range(split_data.shape[1]): 
            print(month, year)
            merged_months.append(split_data[month][year])
    return merged_months

merged_months = np.array(merge_months(data))

print(merged_months.shape)
# (324, 281, 375)

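I expect the reordered output to be an array laid out like this:

(the 1st value is January of year 1, the 2nd value is February of year 1, ..., the 13th value is January of year 2, the 14th value is February of year 2, and so on)

Someone kindly suggested: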

num_years = 27
num_months = 12
height = 281
width = 375

data = np.random.rand(num_months, num_years, height, width)

# The reshape is equivalent to .reshape(num_years * num_months, height, width)
data_ord = np.swapaxes(data, 0, 1).reshape(-1, *data.shape[-2:])

# Check for february, year 5, pixel (34, 34)
month_idx = 1  # february
year_idx = 4  # year 5
height_idx = 34
width_idx = 34

feb5px1 = data[month_idx, year_idx, height_idx, width_idx]
feb5px2 = data_ord[year_idx * num_months + month_idx, height_idx, width_idx]

But I want to apply this to all grid points, not just one.

I tried:

test = []
num_months = 12

for yr_idx in range(27):
    for month_idx in range(12):
        test.apppend(d[year_idx*num_months+month_idx, :,:])

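But I get an error that says: 'list' object has no attribute 'apppend'

I think you just need to swap the first two axes and then flatten them with a reshape; see the code below: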

import numpy as np

num_years = 27
num_months = 12
height = 281
width = 375

data = np.random.rand(num_months, num_years, height, width)

# The reshape is equivalent to .reshape(num_years * num_months, height, width)
data_ord = np.swapaxes(data, 0, 1).reshape(-1, *data.shape[-2:])

# Check for february, year 5, pixel (34, 34)
month_idx = 1  # february
year_idx = 4  # year 5
height_idx = 34
width_idx = 34

feb5px1 = data[month_idx, year_idx, height_idx, width_idx]
feb5px2 = data_ord[year_idx * num_months + month_idx, height_idx, width_idx]

assert feb5px1 == feb5px2

# And for all pixels this month
feb5_1 = data[month_idx, year_idx]
feb5_2 = data_ord[year_idx * num_months + month_idx]

assert np.allclose(feb5_1, feb5_2)
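
As a small sanity check of the approach above, the sketch below labels every (month, year) slab with its chronological index and confirms that the reordered array counts 0, 1, 2, ... along its first axis. It also suggests that the loop from the question gives the same result once the two for loops are nested the other way round (years outside, months inside). The small grid size, the labelling scheme, and the name merge_months_chronological are assumptions made for illustration; they are not part of the original data or code.

```python
import numpy as np

# Small grid so the demo stays light (the real data is 281 x 375)
num_months, num_years, height, width = 12, 27, 4, 5

# Label each (month, year) slab with its chronological index: year * 12 + month
months = np.arange(num_months).reshape(num_months, 1, 1, 1)
years = np.arange(num_years).reshape(1, num_years, 1, 1)
data = np.broadcast_to(years * num_months + months,
                       (num_months, num_years, height, width))

# Vectorised reordering: swap month/year axes, then flatten them together
data_ord = np.swapaxes(data, 0, 1).reshape(-1, height, width)


def merge_months_chronological(split_data):
    """Hypothetical loop version: years in the outer loop, months inside."""
    merged = []
    for year in range(split_data.shape[1]):
        for month in range(split_data.shape[0]):
            merged.append(split_data[month, year])
    return np.array(merged)


data_loop = merge_months_chronological(data)

# The first axis now runs in chronological order: 0, 1, 2, ..., 323
assert np.array_equal(data_ord[:, 0, 0], np.arange(num_months * num_years))
# The loop and the vectorised version agree
assert np.array_equal(data_ord, data_loop)
```

With months in the outer loop, as in the question, the list is filled with all Januaries first, then all Februaries, which has the right shape but the wrong order.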

It would be best to show the expected dataframe you want, since a table is easier for us to read.

Does this help clarify?

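np.swapaxes does indeed swap two axes: here axes 0 and 1 are swapped, that is, the month and year axes, so month becomes axis 1 and year becomes axis 0. data.shape returns a tuple containing the length of each axis of data. data.shape[-2:] returns only the last two elements of this tuple, i.e. (height, width), and the star unpacking operator expands the tuple (.reshape(-1, *(34, 34)) implicitly becomes .reshape(-1, 34, 34)). The -1 in .reshape tells numpy to infer the size of that dimension automatically. I added a comment in the code to make the .reshape equivalence more explicit.

I don't understand, it already does this job for all grid points. With numpy, there is almost always a better (more efficient) way than a for loop.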
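
To make the points about -1, star unpacking, and "all grid points" concrete, here is a minimal sketch. It assumes a small random grid rather than the real 281 x 375 one, and the variable names a, b, c, and grid are only illustrative.

```python
import numpy as np

num_months, num_years, height, width = 12, 27, 6, 7  # small grid for illustration
data = np.random.rand(num_months, num_years, height, width)

swapped = np.swapaxes(data, 0, 1)        # shape: (years, months, height, width)
spatial_shape = data.shape[-2:]          # (height, width)

# Three ways of writing the same target shape
a = swapped.reshape(-1, *spatial_shape)  # -1 is inferred as 27 * 12 = 324
b = swapped.reshape(-1, height, width)
c = swapped.reshape(num_years * num_months, height, width)
assert np.array_equal(a, b) and np.array_equal(a, c)

# Each entry along axis 0 is already a complete (height, width) grid
t = 4 * num_months + 1                   # February of year 5
grid = a[t]
assert np.array_equal(grid, data[1, 4])

# The loop from the question, with append spelled correctly and one index
# name used throughout, only copies slices that a already holds
test = []
for year_idx in range(num_years):
    for month_idx in range(num_months):
        test.append(a[year_idx * num_months + month_idx, :, :])
assert np.array_equal(np.array(test), a)
```

The loop is therefore redundant: indexing the reordered array with a single time index already returns every grid point for that month.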