Python 3.x: Loop over a dictionary of lists and update the corresponding columns

python-3.x pandas dataframe for-loop if-statement

I have a DataFrame df and a dictionary of lists, as shown below:

 Date                  Tea_Good       Tea_bad    coffee_good      coffee_bad
2020-02-01             3              1           10                7
2020-02-02             3              1           10                7
2020-02-03             3              1           10                7
2020-02-04             3              1           10                7
2020-02-05             6              1           10                7
2020-02-06             6              2           10                11
2020-02-07             6              2           5                 11
2020-02-08             6              2           5                 11
2020-02-09             9              2           5                 11
2020-02-10             9              2           4                 11
2020-02-11             9              2           4                 11   
2020-02-12             9              2           4                 11         
2020-02-13             9              2           4                 11 
2020-02-14             9              2           4                 11
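To reproduce the example, the table above can be built roughly as follows. This is only a sketch: the values are copied from the table, and the Date column is assumed to hold one row per calendar day.

```python
import pandas as pd

# Column values copied from the table above, one entry per day.
df = pd.DataFrame({
    "Date": pd.date_range("2020-02-01", periods=14, freq="D"),
    "Tea_Good": [3, 3, 3, 3, 6, 6, 6, 6, 9, 9, 9, 9, 9, 9],
    "Tea_bad": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "coffee_good": [10, 10, 10, 10, 10, 10, 5, 5, 5, 4, 4, 4, 4, 4],
    "coffee_bad": [7, 7, 7, 7, 7, 11, 11, 11, 11, 11, 11, 11, 11, 11],
})
```

Building Date with pd.date_range gives a datetime64 column directly, so later date comparisons do not need a separate pd.to_datetime call.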

The dict is:

rf = {
"tea": 
    [
      {
          "type": "linear",
          "from": "2020-02-01T20:00:00.000Z",
          "to": "2020-02-03T20:00:00.000Z",
          "days":3,
          "coef":[0.1,0.1,0.1,0.1,0.1,0.1],
          "case":"bad"
      },
      {
          "type": "polynomial",
          "from": "2020-02-08T20:00:00.000Z",
          "to": "2020-02-10T20:00:00.000Z",
          "days":3,
          "coef":[0.1,0.1,0.1,0.1,0.1,0.1],
          "case":"good"
      }],
"coffee": [
          {
              "type": "quadratic",
              "from": "2020-02-01T20:00:00.000Z",
              "to": "2020-02-10T20:00:00.000Z",
              "days": 10,
              "coef": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
              "case":"good"
          },
          {
              "type": "constant",
              "from": "2020-02-11T20:00:00.000Z",
              "to": "2020-02-13T20:00:00.000Z",
              "days": 5,
              "coef": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
              "case":"bad"
          }]}

Explanation:

The dictionary contains two keys

1. "tea"
2. "coffee"

Based on each key and its values, I want to update the columns of df:

1. Which column?
If key == "tea" and "case" == "bad" update the Tea_bad column

2. When? 
"from": "2020-02-01T20:00:00.000Z",
"to": "2020-02-03T20:00:00.000Z"

3. How?
if "type": "linear",
when  "from": "2020-02-01T20:00:00.000Z"
t = 0,
a0 = coef[0]
a1 = coef[1]
a2 = coef[2]
a3 = coef[3]
a4 = coef[4]
a5 = coef[5]

df.loc[(df['Date'] >= start_date) & (df['Date'] <= end_date), 'Tea_bad'] = a0 + a1 * t.
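The "How?" step amounts to evaluating a polynomial in t whose degree depends on "type". A minimal sketch of that evaluation is below. The helper name evaluate_model and the DEGREE mapping are hypothetical, and only the "linear" case is spelled out in the description above; treating "constant" as a0 alone is an assumption.

```python
import numpy as np

# Assumed mapping from "type" to polynomial degree; only "linear" is
# spelled out in the description above.
DEGREE = {"constant": 0, "linear": 1, "quadratic": 2, "polynomial": 5}


def evaluate_model(kind, coef, t):
    """Return a0 + a1*t + ... + ad*t**d for the degree d implied by kind."""
    degree = DEGREE[kind]
    used = coef[: degree + 1]
    # np.polyval expects the highest-order coefficient first.
    return np.polyval(used[::-1], np.asarray(t, dtype=float))


coef = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
t = np.arange(3)  # t = 0 on the first day, then 1, 2, ...
linear_values = evaluate_model("linear", coef, t)  # 0.1, 0.2, 0.3
```

Slicing the coefficient list by degree keeps one code path for every type instead of one branch per type.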


One solution is to loop over the dictionary and use apply:

df.Date = pd.to_datetime(df.Date)
df = df.set_index('Date', drop=True)
df['Period'] = [(date - df.index[0]).days for date in df.index]
for key, val in rf.items():
    for elem in val:
        type_method = elem.get('type')
        col_name = f'{key.capitalize()}_{elem.get("case")}'
        date_from = pd.to_datetime(elem.get('from'))
        date_to = pd.to_datetime(elem.get('to'))
        a0, a1, a2, a3, a4, a5 = elem.get('coef')
        mask_dates = (df.index >= date_from) & (df.index ...
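A complete version of such a loop might look like the sketch below. It is not the original code: it uses np.polyval instead of apply, works on a shortened five-day frame, and assumes t is the number of days since the first row (the Period column). The DEGREE mapping and the case-insensitive column lookup are hypothetical additions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": pd.date_range("2020-02-01", periods=5, freq="D"),
    "Tea_bad": [1.0, 1.0, 1.0, 1.0, 1.0],
    "coffee_good": [10.0, 10.0, 10.0, 10.0, 10.0],
}).set_index("Date")
df["Period"] = (df.index - df.index[0]).days  # days since the first row

rf = {
    "tea": [{"type": "linear", "from": "2020-02-01T20:00:00.000Z",
             "to": "2020-02-03T20:00:00.000Z",
             "coef": [0.1] * 6, "case": "bad"}],
    "coffee": [{"type": "quadratic", "from": "2020-02-01T20:00:00.000Z",
                "to": "2020-02-04T20:00:00.000Z",
                "coef": [0.1] * 6, "case": "good"}],
}

# Assumed mapping from "type" to polynomial degree.
DEGREE = {"constant": 0, "linear": 1, "quadratic": 2, "polynomial": 5}
# Match "tea_bad" to "Tea_bad" regardless of capitalisation.
columns = {name.lower(): name for name in df.columns}

for key, entries in rf.items():
    for elem in entries:
        col = columns[f"{key}_{elem['case']}".lower()]
        # Drop the UTC marker so the bounds compare with the naive index.
        date_from = pd.to_datetime(elem["from"]).tz_localize(None)
        date_to = pd.to_datetime(elem["to"]).tz_localize(None)
        mask = (df.index >= date_from) & (df.index <= date_to)
        coef = elem["coef"][: DEGREE[elem["type"]] + 1]
        t = df.loc[mask, "Period"].to_numpy(dtype=float)
        # np.polyval expects the highest-order coefficient first.
        df.loc[mask, col] = np.polyval(coef[::-1], t)
```

Because the "from" timestamps are at 20:00, the row for 2020-02-01 (midnight) falls outside the mask and keeps its original value.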
import numpy as np
import pandas as pd


def rf_user_input(df, req_obj):
    df = df.sort_values('Date')
    df['days'] = (df['Date'] - df.at[0, 'Date']).dt.days + 1

    cols, df.columns = df.columns, df.columns.str.lower()

    for category in ("tea", "coffee"):
        if category not in req_obj.keys():
            continue

        for params_obj in req_obj[category]:
            case = params_obj['case']
            kind = '{}_{}'.format(category, case)

            start_date = pd.to_datetime(params_obj['from'], format='%Y-%m-%dT%H:%M:%S.%fZ')
            end_date = pd.to_datetime(params_obj['to'], format='%Y-%m-%dT%H:%M:%S.%fZ')
            label, coef, n_days = params_obj['type'], params_obj['coef'], params_obj['days']

            # Additional n_days code - Start
            first_date = df['date'].min()
            period_days = (start_date - first_date).days
            # Additional n_days code - End

            # Checking 'start_date' , 'end_date' and 'n_days' conditions

            # If the start_date and end_date is null return the calibration df as it is
            if (start_date == 0) and (end_date == 0):
                return df.set_axis(cols, axis=1)

            if (start_date == 0) and (end_date != 0) and (n_days == 0):
                return df.set_axis(cols, axis=1)

            if (start_date != 0) and (end_date == 0) and (n_days == 0):
                return df.set_axis(cols, axis=1)

            # if start date, end date and n_days are non zero then consider start date and n_days
            if (start_date != 0) and (end_date != 0) and (n_days != 0):
                end_date = start_date + pd.Timedelta(days=n_days)

            if (start_date != 0) and (end_date != 0) and (n_days == 0):
                n_days = (end_date - start_date).days

            if (start_date != 0) and (end_date == 0) and (n_days != 0):
                end_date = start_date + pd.Timedelta(days=n_days)

            if (start_date == 0) and (end_date != 0) and (n_days != 0):
                start_date = end_date - pd.Timedelta(days=n_days)

            if (n_days != 0) and (start_date != 0):
                end_date = start_date + pd.Timedelta(days=n_days)

            # Apply the requested model only when all six coefficients are supplied

            if len(coef) == 6:
                a0, a1, a2, a3, a4, a5 = coef
                mask = df['date'].between(start_date, end_date)

                if label == 'constant':
                    if kind in ('tea_good', 'tea_bad', 'coffee_good', 'coffee_bad'):
                        df.loc[mask, kind] = a0 + df['days'] - period_days

                elif label == 'linear':
                    if kind in ('tea_good', 'tea_bad', 'coffee_good', 'coffee_bad'):
                        df.loc[mask, kind] = a0 + \
                            (a1 * ((df['days']) - period_days))

                # Quadratic
                elif label == 'quadratic':
                    if kind in ('tea_good', 'tea_bad', 'coffee_good', 'coffee_bad'):
                        df.loc[mask, kind] = a0 + (a1 * ((df['days']) - period_days)) + (
                            a2 * ((df['days']) - period_days) ** 2)

                # Polynomial
                elif label == 'polynomial':
                    if kind in ('tea_good', 'tea_bad', 'coffee_good', 'coffee_bad'):
                        df.loc[mask, kind] = a0 + (
                            a1 * ((df['days']) - period_days)) + (a2 * (
                                (df['days']) - period_days) ** 2) + (a3 * (
                                    (df['days']) - period_days) ** 3) + (a4 * (
                                        (df['days']) - period_days) ** 4) + (a5 * ((df['days']) - period_days) ** 5)

                # Exponential
                elif label == 'exponential':
                    if kind in ('tea_good', 'tea_bad', 'coffee_good', 'coffee_bad'):
                        df.loc[mask, kind] = np.exp(a0)

                # Calibration File
                elif label == 'calibration_file':
                    pass
            else:
                raise ValueError(
                    'Coefficient count does not match: all six coefficient values must be passed')

    return df.set_axis(cols, axis=1)

# rf_user_input(df, rf)

         Date  Tea_Good  Tea_bad  coffee_good  coffee_bad  days
0  2020-02-01       3.0      1.0         10.0         7.0     1
1  2020-02-02       3.0      0.3          0.3         7.0     2
2  2020-02-03       3.0      0.4          0.4         7.0     3
3  2020-02-04       3.0      0.5          0.5         0.3     4
4  2020-02-05      12.0      1.0          3.1         0.4     5
5  2020-02-06      13.0      2.0          4.3         0.5     6
6  2020-02-07       6.0      2.0          5.7         0.6     7
7  2020-02-08       6.0      2.0          7.3        11.0     8
8  2020-02-09       6.3      2.0          9.1        11.0     9
9  2020-02-10      36.4      2.0         11.1        11.0    10
10 2020-02-11     136.5      2.0         13.3        11.0    11
11 2020-02-12       9.0      2.0          4.0        11.0    12
12 2020-02-13       9.0      2.0          4.0        11.0    13
13 2020-02-14       9.0      2.0          4.0        11.0    14
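The chain of start_date / end_date / n_days checks in rf_user_input can be expressed more compactly. The sketch below is one possible reading of those rules, assuming missing dates are passed as None rather than 0; the helper name resolve_window is hypothetical.

```python
import pandas as pd


def resolve_window(start=None, end=None, n_days=0):
    """Return (start, end), or None when the window cannot be determined."""
    start = pd.Timestamp(start) if start is not None else None
    end = pd.Timestamp(end) if end is not None else None
    span = pd.Timedelta(days=n_days)
    if start is not None and n_days:
        # start + n_days takes precedence over an explicit end date.
        return start, start + span
    if end is not None and n_days:
        return end - span, end
    if start is not None and end is not None:
        return start, end
    return None


window = resolve_window("2020-02-01 20:00", "2020-02-03 20:00", n_days=3)
```

Returning None for an undetermined window lets the caller skip that entry, which corresponds to the branches in the function that return the frame unchanged.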

Output of the loop-based solution:

            Tea_good  Tea_bad  Coffee_good  Coffee_bad  Period
Date                                                          
2020-02-01       3.0      1.0         10.0         7.0       0
2020-02-02       3.0      0.2          0.3         7.0       1
2020-02-03       3.0      0.3          0.7         7.0       2
2020-02-04       3.0      1.0          1.3         7.0       3
2020-02-05       6.0      1.0          2.1         7.0       4
2020-02-06       6.0      2.0          3.1        11.0       5
2020-02-07       6.0      2.0          4.3        11.0       6
2020-02-08       6.0      2.0          5.7        11.0       7
2020-02-09    3744.9      2.0          7.3        11.0       8
2020-02-10    6643.0      2.0          9.1        11.0       9
2020-02-11       9.0      2.0          4.0        11.0      10
2020-02-12       9.0      2.0          4.0        11.1      11
2020-02-13       9.0      2.0          4.0        12.1      12
2020-02-14       9.0      2.0          4.0        11.0      13