Python: adding values in a column


I have a DataFrame df, and I want to append "/" to the values in the cast and genres columns so that every cell contains exactly three "/".

id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b        23
2   Minions    a/b/c     a/b/c      55
3   Mission    a/b       a          67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a         a/b/c      45

So the output should look like this:

id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b//      23
2   Minions    a/b/c/    a/b/c/     55
3   Mission    a/b//     a///       67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a///      a/b/c/     45

Apply this function to every element of each column to update it:

def update_string(string):
    total_occ = 3  # total number of '/' each string should contain
    for element in string:  # for each character,
        if element == "/":  # if it is a '/', decrease 'total_occ'
            total_occ = total_occ - 1
    for i in range(total_occ):  # append the remaining '/' at the end
        string += "/"
    return string

x = "a/b"    
print(update_string(x))

The output is:

a/b//
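
To update whole columns rather than a single string, one option is Series.map. The following is a minimal sketch, assuming the sample data from the question is rebuilt by hand and that both columns hold plain strings. The update_string here is a compact equivalent of the function above (it counts with str.count instead of a loop), and the column loop is an illustration, not part of the original answer.

```python
import pandas as pd

def update_string(string):
    # append as many "/" as are missing to reach three in total
    missing = 3 - string.count("/")
    return string + "/" * max(missing, 0)

# sample data rebuilt from the question (assumption: string columns, no NaN)
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "movie": ["Furious", "Minions", "Mission", "Kingsman", "Star Wars"],
    "cast": ["a/b/c/d", "a/b/c", "a/b", "a/b/c/d", "a"],
    "genres": ["a/b", "a/b/c", "a", "a/b/c/d", "a/b/c"],
    "runtime": [23, 55, 67, 23, 45],
})

# apply the function element by element to each target column
for col in ["cast", "genres"]:
    df[col] = df[col].map(update_string)

print(df)
```

If a column can contain missing values, the function would need a guard for non-string input before calling str.count.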

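You can split on "/", pad the resulting list with empty strings until its size is 4, and then join with "/" again.

Use apply to change the values in the whole column.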

Try this:

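import pandas as pd
from io import StringIO

# columns are separated by two or more spaces, so "Star Wars" stays one value
df = pd.read_csv(StringIO("""id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b        23
2   Minions    a/b/c     a/b/c      55
3   Mission    a/b       a          67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a         a/b/c      45"""), sep=r"\s\s+", engine="python")

def pad_cell(value):
    parts = value.split("/")
    # pad with empty strings up to four parts, i.e. three "/" after joining
    parts += [""] * (4 - len(parts))
    return "/".join(parts)

df["cast"] = df["cast"].apply(pad_cell)
df["genres"] = df["genres"].apply(pad_cell)
print(df)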

Here is one way, defining a custom function:

def add_values(df, *cols):
    for col in cols:
        # amount of "/" to add at each row
        c = df[col].str.count('/').rsub(3)
        # translate the above to as many "/" as required
        ap = [i * '/' for i in c.tolist()]
        # Add the above to the corresponding column
        df[col] = [i + j for i,j in zip(df[col], ap)]
    return df

add_values(df, 'cast', 'genres')

   id     movie     cast   genres  runtime
0   1   Furious  a/b/c/d    a/b//       23
1   2   Minions   a/b/c/   a/b/c/       55
2   3   Mission    a/b//     a///       67
3   4  Kingsman  a/b/c/d  a/b/c/d       23
4   5  StarWars     a///   a/b/c/       45

In [217]: df
Out[217]: 
   id      movie     cast   genres  runtime
0   1    Furious  a/b/c/d      a/b       23
1   2    Minions    a/b/c    a/b/c       55
2   3    Mission      a/b        a       67
3   4   Kingsman  a/b/c/d  a/b/c/d       23
4   5  Star Wars        a    a/b/c       45

In [218]: from itertools import chain, zip_longest

In [219]: def ensure_slashes(x):
     ...:     return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))
     ...: 
     ...: 

In [220]: df[['cast','genres']] = df[['cast','genres']].applymap(ensure_slashes)

In [221]: df
Out[221]: 
   id      movie     cast   genres  runtime
0   1    Furious  a/b/c/d    a/b//       23
1   2    Minions   a/b/c/   a/b/c/       55
2   3    Mission    a/b//     a///       67
3   4   Kingsman  a/b/c/d  a/b/c/d       23
4   5  Star Wars     a///   a/b/c/       45
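
A note on versions: in pandas 2.1 and later, DataFrame.applymap is deprecated in favour of DataFrame.map, so the call above may emit a FutureWarning. The following is a minimal sketch that picks whichever method is available, assuming the same ensure_slashes function and a hand-built frame with only the two relevant columns; the name elementwise is introduced here for illustration.

```python
from itertools import chain, zip_longest

import pandas as pd

def ensure_slashes(x):
    # interleave the split parts with three "/" and glue them back together
    return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))

# assumption: a reduced copy of the question's data
df = pd.DataFrame({
    'cast': ['a/b/c/d', 'a/b/c', 'a/b', 'a/b/c/d', 'a'],
    'genres': ['a/b', 'a/b/c', 'a', 'a/b/c/d', 'a/b/c'],
})

cols = ['cast', 'genres']
sub = df[cols]
# DataFrame.map exists from pandas 2.1 on; older releases only have applymap
elementwise = sub.map if hasattr(sub, 'map') else sub.applymap
df[cols] = elementwise(ensure_slashes)

print(df)
```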

Here you go:

=^^=

import pandas as pd
from io import StringIO

# create raw data
raw_data = StringIO("""
id movie cast genres runtime
1 Furious a/b/c/d a/b 23
2 Minions a/b/c a/b/c 55
3 Mission a/b a 67
4 Kingsman a/b/c/d a/b/c/d 23
5 Star_Wars a a/b/c 45
""")

# load data into data frame
df = pd.read_csv(raw_data, sep=' ')

# iterate over rows and pad each cell up to three '/'
# (DataFrame.set_value was removed in pandas 1.0; df.at replaces it)
for index, row in df.iterrows():
    count_character_cast = row['cast'].count('/')
    if count_character_cast < 3:
        df.at[index, 'cast'] = row['cast'] + '/' * (3 - count_character_cast)

    count_character_genres = row['genres'].count('/')
    if count_character_genres < 3:
        df.at[index, 'genres'] = row['genres'] + '/' * (3 - count_character_genres)

A short solution using the following functions:

The key function being applied is:

def ensure_slashes(x):
    return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))
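
To see what the one-liner does, it can be unrolled into its intermediate steps. This is only an illustrative sketch on a single hand-picked value; the variable names parts, pairs and flat are introduced here and do not appear in the answer.

```python
from itertools import chain, zip_longest

def ensure_slashes(x):
    return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))

x = "a/b"

# 1. split on "/" to get the individual parts
parts = x.split("/")
# 2. pair each part with one of exactly three "/", padding the shorter side with ""
pairs = list(zip_longest(parts, "///", fillvalue=""))
# 3. flatten the pairs into one sequence of strings
flat = list(chain.from_iterable(pairs))
# 4. join everything back into a single string
result = "".join(flat)

print(parts)
print(pairs)
print(flat)
print(result)
```

One caveat: the function assumes a cell has at most four parts. With five or more, the extra parts are paired with the fill value instead of a "/", so ensure_slashes("a/b/c/d/e") returns "a/b/c/de".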

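OK, the idea is to create a function that does the necessary work and apply it to the desired columns:

The function replaces the existing slashes with empty strings and creates a zip of the string in the cell with a constant list of slashes that contains exactly 3 elements.

The result is the concatenation of the elements of that zip, and it works :)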

Output:

id  movie       cast    genres  runtime
1   furious     a/b/c/  a/b//   23
2   Mininons    a/b/c/  a/b/c/  55
3   mission     a/b//   a///    67
4   Kingsman    a/b/c/  a/b/c/  23
5   star Wars   a///    a/b/c/  45

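Share the code you have written and explain what goes wrong with it. That shows your effort. This looks like an assignment/homework problem. You should try it yourself first and ask when you get stuck.

Thanks, this is exactly the solution I was looking for.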