Python: How do I select the columns that contain a specific month of a year?

Tags: python, pandas, dataframe, timestamp

I have a dataframe that looks like this:

+------------+-----------+-----+-----+----------+----------+-----+------------+----------+----------+-----+------------+
| State_Name | City_Name | ID1 | ID2 | 1/1/2020 | 1/2/2020 | ... | 12/31/2020 | 1/1/2021 | 1/2/2021 | ... | 12/31/2021 |
+------------+-----------+-----+-----+----------+----------+-----+------------+----------+----------+-----+------------+
| ...        | ...       | ... | ... | ...      | ...      | ... | ...        | ...      | ...      | ... | ...        |
+------------+-----------+-----+-----+----------+----------+-----+------------+----------+----------+-----+------------+
| ...        | ...       | ... | ... | ...      | ...      | ... | ...        | ...      | ...      | ... | ...        |
+------------+-----------+-----+-----+----------+----------+-----+------------+----------+----------+-----+------------+
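I have many columns, one per day from 1/1/2020 to 12/31/2021. How can I select the columns that contain a specific month of a year? For example, if I want the columns holding the data for July 2021, how can I subset the columns named "7/1/2021", "7/2/2021", ..., "7/31/2021"?

Thank you very much for your help.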

I suggest moving all non-datetime columns into a MultiIndex and converting all remaining column labels to datetimes:

print (df)
  State_Name City_Name ID1 ID2  7/1/2021  1/7/2020  7/05/2021  1/1/2021  \
0          a         b   s   d         7         8          5         6   

   1/2/2021  12/31/2021  
0         3           8  

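# move the identifier columns into the index so only date columns remain
df = df.set_index(['State_Name','City_Name','ID1','ID2'])
# parse the remaining column labels (month/day/year strings) as datetimes
df.columns = pd.to_datetime(df.columns)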
To select July 2021, use:
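A minimal sketch of one way to do this, assuming the column labels were converted with `pd.to_datetime` as above: sort the columns and take a label slice that covers the month. The frame rebuilds the sample data so the snippet runs on its own, and `july_2021` is only an illustrative name.

```python
import pandas as pd

# sample data from the printout above, rebuilt so the snippet is self-contained
df = pd.DataFrame({
    "State_Name": ["a"], "City_Name": ["b"], "ID1": ["s"], "ID2": ["d"],
    "7/1/2021": [7], "1/7/2020": [8], "7/05/2021": [5],
    "1/1/2021": [6], "1/2/2021": [3], "12/31/2021": [8],
})
df = df.set_index(["State_Name", "City_Name", "ID1", "ID2"])
df.columns = pd.to_datetime(df.columns, format="%m/%d/%Y")

# label slicing needs the columns in sorted order; both end points are included
start, end = pd.Timestamp("2021-07-01"), pd.Timestamp("2021-07-31")
july_2021 = df.sort_index(axis=1).loc[:, start:end]
print(july_2021)
```

Only the two July 2021 columns remain; the identifier columns stay available in the index.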

Or compare by month period:
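As a sketch of the period comparison, assuming datetime column labels as above: `DatetimeIndex.to_period("M")` maps every column to its month, and comparing against a single `pd.Period` gives a boolean mask over the columns. The sample frame and the names `mask` and `july_2021` are illustrative.

```python
import pandas as pd

# sample data from the printout above, rebuilt so the snippet is self-contained
df = pd.DataFrame({
    "State_Name": ["a"], "City_Name": ["b"], "ID1": ["s"], "ID2": ["d"],
    "7/1/2021": [7], "1/7/2020": [8], "7/05/2021": [5],
    "1/1/2021": [6], "1/2/2021": [3], "12/31/2021": [8],
})
df = df.set_index(["State_Name", "City_Name", "ID1", "ID2"])
df.columns = pd.to_datetime(df.columns, format="%m/%d/%Y")

# True for every column whose date falls in July 2021
mask = df.columns.to_period("M") == pd.Period("2021-07", freq="M")
july_2021 = df.loc[:, mask]
print(july_2021)
```

Unlike a label slice, the mask does not need the columns to be sorted, so the original column order is kept.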

Or:
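Another possible variant, sketched under the same assumption of datetime column labels, compares the `year` and `month` attributes of the column index directly. The sample frame and variable names are illustrative.

```python
import pandas as pd

# sample data from the printout above, rebuilt so the snippet is self-contained
df = pd.DataFrame({
    "State_Name": ["a"], "City_Name": ["b"], "ID1": ["s"], "ID2": ["d"],
    "7/1/2021": [7], "1/7/2020": [8], "7/05/2021": [5],
    "1/1/2021": [6], "1/2/2021": [3], "12/31/2021": [8],
})
df = df.set_index(["State_Name", "City_Name", "ID1", "ID2"])
df.columns = pd.to_datetime(df.columns, format="%m/%d/%Y")

# combine a year test and a month test into one boolean mask over the columns
mask = (df.columns.year == 2021) & (df.columns.month == 7)
july_2021 = df.loc[:, mask]
print(july_2021)
```

Dropping the year test would select July of every year, which can be handy for seasonal comparisons.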


You can use the filter method:

df.filter(regex=r'^7/\d{1,2}/2021$', axis=1)
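A runnable sketch of the `filter` approach, assuming the column labels are still plain strings in month/day/year form. Because `filter` keeps only the matching columns, the identifier columns are put back with `pd.concat`; `id_cols` and `july_2021` are illustrative names.

```python
import pandas as pd

# sample data with string column labels (not converted to datetimes)
df = pd.DataFrame({
    "State_Name": ["a"], "City_Name": ["b"], "ID1": ["s"], "ID2": ["d"],
    "7/1/2021": [7], "1/7/2020": [8], "7/05/2021": [5],
    "1/1/2021": [6], "1/2/2021": [3], "12/31/2021": [8],
})
id_cols = ["State_Name", "City_Name", "ID1", "ID2"]

# anchored pattern: month 7, any one- or two-digit day, year 2021
july_only = df.filter(regex=r"^7/\d{1,2}/2021$", axis=1)

# filter() drops the identifier columns, so add them back in front
july_2021 = pd.concat([df[id_cols], july_only], axis=1)
print(july_2021)
```

The `^` and `$` anchors are a defensive choice so the pattern can only match a whole label that starts with the month.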

A nice solution is to use melt to turn the various date columns into values in a single date column.

For example:

import pandas as pd

# make dataframe from sample data
data = {
    "State_Name": ['state1', 'state2'],
    "City_Name": ['city1', 'city2'],
    "ID1": ['ID1_A', 'ID1_B'],
    "ID2": ['ID2_A', 'ID2_B'],
    "1/1/2020": ['dog', 'cat'],
    "1/2/2020": ['house', 'mouse']
}
df = pd.DataFrame(data)

# melt the date columns into rows: one row per original row and date
melted_df = df.melt(
    id_vars=["State_Name", "City_Name", "ID1", "ID2"],
    var_name="date")
df looks like this:

  State_Name City_Name    ID1    ID2 1/1/2020 1/2/2020
0     state1     city1  ID1_A  ID2_A      dog    house
1     state2     city2  ID1_B  ID2_B      cat    mouse
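Once the data is in long form, picking a month becomes an ordinary row filter. The sketch below assumes the labels parse as month/day/year and adds a hypothetical "7/1/2021" column (with made-up values) that is not in the sample data, so that there is something to select.

```python
import pandas as pd

data = {
    "State_Name": ["state1", "state2"],
    "City_Name": ["city1", "city2"],
    "ID1": ["ID1_A", "ID1_B"],
    "ID2": ["ID2_A", "ID2_B"],
    "1/1/2020": ["dog", "cat"],
    "1/2/2020": ["house", "mouse"],
    "7/1/2021": ["bird", "fish"],  # hypothetical column, added for illustration
}
df = pd.DataFrame(data)

# one row per original row and date; the cell contents land in "value"
melted_df = df.melt(
    id_vars=["State_Name", "City_Name", "ID1", "ID2"],
    var_name="date")

# turn the former column labels into real datetimes
melted_df["date"] = pd.to_datetime(melted_df["date"], format="%m/%d/%Y")

# keep only the rows dated July 2021
is_july_2021 = (melted_df["date"].dt.year == 2021) & (melted_df["date"].dt.month == 7)
july_2021 = melted_df[is_july_2021]
print(july_2021)
```

With this layout the identifier columns are never lost, and the same mask pattern works for any month or date range.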
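Is 1/1/2020 a string? Then df.filter(regex='^7') works, but only for strings. By the way, the downvote is not mine.

How about df.columns = df.columns.astype(str), and then using filter?

Yes, but you would drop the name and ID columns...

I am only changing them to the str type, am I not?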