Python CSV from JIRA with duplicate Sprint headers

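I am using Python 3.7.5.

I have a CSV file that I exported from a Jira instance so that I can see which issue was completed in which sprint. Jira tracks every sprint an issue has been in, so when you export a CSV you get multiple Sprint headers, with data like this: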

Issue key,Issue Type,Status,Sprint,Sprint,Sprint,Sprint
OLS-526,Story,Done,Sprint #16,Sprint #17,Sprint #18,Sprint #19
OLS-871,Story,Done,Sprint #18,Sprint #28,,
OLS-165,Story,Done,Sprint 1,Sprint 3,Sprint #18,Sprint #19
OLS-868,Story,Done,Sprint #28,,,

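What I need is to determine the sprint each issue was delivered in, that is, the right-most non-empty value in the Sprint columns, so that I can count how many issues were actually completed in each sprint.

I tried the standard Python csv module with DictReader, like this: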

import csv
with open('../OLS-tix2.csv', newline='') as csvfile:
  reader = csv.DictReader(csvfile)
  for row in reader:
    print(row['Sprint'])

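However, that only gives me the last Sprint column, which comes back blank whenever that column is empty. The output from the code above looks like this: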

Sprint #19

Sprint #19



I could just use the plain csv reader and roll my own, but I figured there must be a better way to do this in Python.
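
For reference, the "roll my own" route with the plain csv reader could look roughly like the sketch below. It reads the header row itself, remembers the position of every column named Sprint, and keeps the right-most non-empty cell per issue. The helper name `last_sprint_per_issue` and the in-memory `SAMPLE` string (standing in for the real export file) are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Sample rows copied from the question; io.StringIO stands in for the real file.
SAMPLE = """\
Issue key,Issue Type,Status,Sprint,Sprint,Sprint,Sprint
OLS-526,Story,Done,Sprint #16,Sprint #17,Sprint #18,Sprint #19
OLS-871,Story,Done,Sprint #18,Sprint #28,,
OLS-165,Story,Done,Sprint 1,Sprint 3,Sprint #18,Sprint #19
OLS-868,Story,Done,Sprint #28,,,
"""


def last_sprint_per_issue(csvfile):
    """Map each issue key to its right-most non-empty Sprint cell (hypothetical helper)."""
    reader = csv.reader(csvfile)
    header = next(reader)
    key_idx = header.index("Issue key")
    # Positions of every column whose header is exactly "Sprint".
    sprint_idxs = [i for i, name in enumerate(header) if name == "Sprint"]
    result = {}
    for row in reader:
        if not row:  # skip blank lines
            continue
        sprints = [row[i] for i in sprint_idxs if i < len(row) and row[i]]
        result[row[key_idx]] = sprints[-1] if sprints else None
    return result


finished = last_sprint_per_issue(io.StringIO(SAMPLE))
# Count how many issues ended in each sprint, ignoring issues with no sprint.
counts = Counter(sprint for sprint in finished.values() if sprint)
print(counts)
```

This works because csv.reader returns each row as a list, so repeated header names do not overwrite one another the way they do as dictionary keys in DictReader.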

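OK, so I did some more research and found that pandas might be a good tool for this job. There are plenty of examples, and as a bonus I can use it to do the pivoting.

This is what I ended up with, and it works for me: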

import pandas as pd

csv_file = "../OLS-tix.csv"  # path to the Jira export
ols_df = pd.read_csv(csv_file)
finish_sprint_col = 'finish_sprint'  # column for the sprint the issue was actually finished in
ols_df[finish_sprint_col] = ""  # add the new blank column
# get all the headers that contain the word Sprint; pandas names them Sprint, Sprint.1 ... Sprint.N
sprints = ols_df.columns[ols_df.columns.str.contains('Sprint')]
for i in ols_df.index:
    if ols_df.at[i, "Status"] != "Done":  # we only want to do this for "Done" issues
        continue
    finish_sprint = None
    for header in sprints:  # go through all the sprint cells for this row and keep the last non-empty one
        if not pd.isnull(ols_df.at[i, header]):
            finish_sprint = ols_df.at[i, header]
    if finish_sprint:
        ols_df.at[i, finish_sprint_col] = finish_sprint

# get the number of issues finished per sprint
dones = ols_df[(ols_df.Status == "Done") & (ols_df['Issue Type'] == "Story")].pivot_table(
    index=[finish_sprint_col], values=["Issue key"], aggfunc=[pd.Series.nunique])

There may be a simpler way to do this, but it seems to work for now.
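
One possibly simpler variant, offered only as a sketch and not as the approach used above: forward-fill across the Sprint columns so that the right-most column always holds the last non-empty value, then group and count. It assumes pandas' default renaming of duplicate headers to Sprint, Sprint.1, and so on, and it uses an in-memory `SAMPLE` string in place of the real file; `sprint_cols`, `done` and `per_sprint` are illustrative names.

```python
import io

import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for the real file.
SAMPLE = """\
Issue key,Issue Type,Status,Sprint,Sprint,Sprint,Sprint
OLS-526,Story,Done,Sprint #16,Sprint #17,Sprint #18,Sprint #19
OLS-871,Story,Done,Sprint #18,Sprint #28,,
OLS-165,Story,Done,Sprint 1,Sprint 3,Sprint #18,Sprint #19
OLS-868,Story,Done,Sprint #28,,,
"""

ols_df = pd.read_csv(io.StringIO(SAMPLE))

# read_csv renames the duplicate headers to Sprint, Sprint.1, Sprint.2, ...
sprint_cols = [c for c in ols_df.columns if c == "Sprint" or c.startswith("Sprint.")]

# Forward-fill along each row, then take the last column:
# that is the right-most non-empty Sprint value for the issue.
ols_df["finish_sprint"] = ols_df[sprint_cols].ffill(axis=1).iloc[:, -1]

# Keep only finished stories and count distinct issue keys per finishing sprint.
done = ols_df[(ols_df["Status"] == "Done") & (ols_df["Issue Type"] == "Story")]
per_sprint = done.groupby("finish_sprint")["Issue key"].nunique()
print(per_sprint)
```

The row-wise forward fill replaces the explicit loop over rows, and groupby leaves out issues that have no sprint at all because missing group keys are dropped by default.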
