Python: adding values in a column


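I have a DataFrame df, and I want to append '/' to the values in the cast and genres columns so that each cell contains exactly three '/' characters.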

id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b        23
2   Minions    a/b/c     a/b/c      55
3   Mission    a/b       a          67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a         a/b/c      45

So the output should look like this:

id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b//      23
2   Minions    a/b/c/    a/b/c/     55
3   Mission    a/b//     a///       67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a///      a/b/c/     45

Use this function on each element of each column to update them:

def update_string(string):
    total_occ = 3 #total no. of occurrences of character '/' 
    for element in string: # for each element,
        if element == "/": # if there is '/', decrease 'total_occ'
            total_occ -= 1
    for i in range(total_occ): # add remaining no. of '/' at the end
        string+="/"
    return string

x = "a/b"    
print(update_string(x))

Output:

a/b//

You can split on '/', pad the resulting list with empty strings until its length is 4, and then join with '/' again.

To change the values in an entire column:

Try this:
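
A per-cell function can be applied to whole columns with Series.apply. The following is a minimal sketch, assuming the sample data from the question; the compact update_string here is a rewrite based on str.count and its total and char parameters are illustrative, not part of the original answer.

```python
import pandas as pd

def update_string(string, total=3, char="/"):
    # Append as many `char` as are missing to reach `total` occurrences.
    # max(..., 0) leaves strings that already have enough untouched.
    missing = max(total - string.count(char), 0)
    return string + char * missing

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "movie": ["Furious", "Minions", "Mission", "Kingsman", "Star Wars"],
    "cast": ["a/b/c/d", "a/b/c", "a/b", "a/b/c/d", "a"],
    "genres": ["a/b", "a/b/c", "a", "a/b/c/d", "a/b/c"],
    "runtime": [23, 55, 67, 23, 45],
})

# Apply the function cell by cell to each target column.
for col in ["cast", "genres"]:
    df[col] = df[col].apply(update_string)

print(df)
```

Only cast and genres are modified; the other columns are left as they were.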

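import pandas as pd
from io import StringIO

# Columns are separated by two or more spaces, so "Star Wars" stays one value.
# A regex separator requires the Python parsing engine.
df = pd.read_csv(StringIO("""id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b        23
2   Minions    a/b/c     a/b/c      55
3   Mission    a/b       a          67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a         a/b/c      45"""), sep=r"\s\s+", engine="python")

def pad_cell(value):
    # Split on '/', pad with empty strings up to 4 parts, then rejoin.
    parts = value.split("/")
    parts += [""] * (4 - len(parts))
    return "/".join(parts)

df["cast"] = df["cast"].apply(pad_cell)
df["genres"] = df["genres"].apply(pad_cell)
print(df)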

Here is one way to do it by defining a custom function:

def add_values(df, *cols):
    for col in cols:
        # amount of "/" to add at each row
        c = df[col].str.count('/').rsub(3)
        # translate the above to as many "/" as required
        ap = [i * '/' for i in c.tolist()]
        # Add the above to the corresponding column
        df[col] = [i + j for i,j in zip(df[col], ap)]
    return df

add_values(df, 'cast', 'genres')

   id     movie     cast   genres  runtime
0   1   Furious  a/b/c/d    a/b//       23
1   2   Minions   a/b/c/   a/b/c/       55
2   3   Mission    a/b//     a///       67
3   4  Kingsman  a/b/c/d  a/b/c/d       23
4   5  StarWars     a///   a/b/c/       45

Here you go:

=^^=

import pandas as pd
from io import StringIO

# create raw data
raw_data = StringIO("""
id movie cast genres runtime
1 Furious a/b/c/d a/b 23
2 Minions a/b/c a/b/c 55
3 Mission a/b a 67
4 Kingsman a/b/c/d a/b/c/d 23
5 Star_Wars a a/b/c 45
""")

# load data into data frame
df = pd.read_csv(raw_data, sep=' ')

# iterate over rows and pad each cell with '/' up to three occurrences
for index, row in df.iterrows():
    count_character_cast = row['cast'].count('/')
    if count_character_cast < 3:
        df.at[index, 'cast'] = row['cast'] + '/' * (3 - count_character_cast)

    count_character_genres = row['genres'].count('/')
    if count_character_genres < 3:
        df.at[index, 'genres'] = row['genres'] + '/' * (3 - count_character_genres)

A short solution using the following functions:

In [217]: df
Out[217]: 
   id      movie     cast   genres  runtime
0   1    Furious  a/b/c/d      a/b       23
1   2    Minions    a/b/c    a/b/c       55
2   3    Mission      a/b        a       67
3   4   Kingsman  a/b/c/d  a/b/c/d       23
4   5  Star Wars        a    a/b/c       45

In [218]: from itertools import chain, zip_longest

In [219]: def ensure_slashes(x):
     ...:     return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))
     ...: 
     ...: 

In [220]: df[['cast','genres']] = df[['cast','genres']].applymap(ensure_slashes)

In [221]: df
Out[221]: 
   id      movie     cast   genres  runtime
0   1    Furious  a/b/c/d    a/b//       23
1   2    Minions   a/b/c/   a/b/c/       55
2   3    Mission    a/b//     a///       67
3   4   Kingsman  a/b/c/d  a/b/c/d       23
4   5  Star Wars     a///   a/b/c/       45

The key function applied is:

def ensure_slashes(x):
    return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))
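
To make the one-liner easier to follow, here is a step-by-step sketch of what it does for a single value. The intermediate names (parts, pairs, flat) are illustrative only and do not appear in the original answer.

```python
from itertools import chain, zip_longest

def ensure_slashes(x):
    return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))

# The same steps, unrolled for the value "a/b".
parts = "a/b".split("/")
# parts is ['a', 'b']

# Pair each part with one of the three slashes; whichever side runs out
# first is filled with the empty string.
pairs = list(zip_longest(parts, "///", fillvalue=""))
# pairs is [('a', '/'), ('b', '/'), ('', '/')]

# Flatten the pairs into a single sequence and join it.
flat = list(chain.from_iterable(pairs))
# flat is ['a', '/', 'b', '/', '', '/']

result = "".join(flat)
print(result)  # a/b//
```

Note that this approach assumes a cell never contains more than three slashes. With four or more, the extra parts are paired with the empty fill value, so "a/b/c/d/e" would come out as "a/b/c/de".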

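OK, the idea is to create a function that does the necessary work and apply it to the desired columns:

The function replaces the existing slashes with empty strings, then zips the string in the cell with a constant list containing exactly 3 slashes.

The result is the concatenation of the elements of that zip, and that is how it works :)

Output: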

id  movie       cast    genres  runtime
1   furious     a/b/c/  a/b//   23
2   Mininons    a/b/c/  a/b/c/  55
3   mission     a/b//   a///    67
4   Kingsman    a/b/c/  a/b/c/  23
5   star Wars   a///    a/b/c/  45


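Share the code you have written and explain what is wrong with it. That shows your effort.
This looks like an assignment/homework question. You should try it yourself first, then ask when you get stuck.
Thanks, this is the perfect solution I wanted.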