The simplest feature mapper in Python

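I'm trying to build the simplest possible feature mapper in Python 3. I have two goals: get the best performance and learn how to program in Python.

Here is my code, which does not work: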

import pandas as pd
source = pd.DataFrame({'Country' : ['USA', 'USA', 'Russia','USA'], 
                  'City' : ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

#trim column value selecting first two symbols
def s_trim(x):
    return x[:2]

#make new column from two selecting first two symbols from each
def s_trim_concat(x,y):
    return '%s-%s' % (x[:2],y[:2])

features = [
    ('trim',['Country'],s_trim),
    ('trim1',['Country','City'],s_trim_concat),
    ('trim2',['City','Country'],s_trim_concat)
    ]

for feature_name, columns, func in features:
    source[feature_name] = source[columns].apply(func, axis=1)

print(source)
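
Update: the code works now, but I had to complicate the functions, so I'm still looking for a good solution that allows simple functions with no type handling inside them: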

import pandas as pd
source = pd.DataFrame({'Country' : ['USA', 'USA', 'Russia','USA'], 
                  'City' : ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

#trim column value selecting first two symbols
def s_trim(x):
    return x.str[:2]

#make new column from two selecting first two symbols from each
def s_trim_concat(row):
    # use .iloc for positional access; row[0] on a label-indexed Series is deprecated
    x = row.iloc[0]
    y = row.iloc[1]
    return '%s-%s' % (x[:2],y[:2])

features = [
    ('trim',['Country'],s_trim),
    ('trim1',['Country','City'],s_trim_concat),
    ('trim2',['City','Country'],s_trim_concat)
    ]

for feature_name, columns, func in features:
    if len(columns) == 1:
        source[feature_name] = source[columns].apply(func)
    else:
        source[feature_name] = source[columns].apply(func, axis=1)
print(source)

I think the problem is that you are passing a single list-like object into s_trim_concat rather than two separate arguments.
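
As a small sketch of what that means in practice: with `axis=1`, `DataFrame.apply` calls the function once per row and hands it the whole row as one `Series`, so a function with two required parameters fails with a `TypeError`. The helper `record` and the list `seen` are illustrative names that are not part of the original post, and the frame is shortened to two rows.

```python
import pandas as pd

source = pd.DataFrame({'Country': ['USA', 'Russia'],
                       'City': ['New-York', 'Sankt-Petersburg']})

def s_trim_concat(x, y):
    return '%s-%s' % (x[:2], y[:2])

# Hypothetical helper: record the type of whatever apply() passes in.
seen = []
def record(row):
    seen.append(type(row).__name__)
    return 0

source[['Country', 'City']].apply(record, axis=1)
print(seen)  # every call received a single Series holding the whole row

try:
    source[['Country', 'City']].apply(s_trim_concat, axis=1)
except TypeError as exc:
    # The row arrives as one positional argument, so 'y' is never supplied.
    error_message = str(exc)
    print(error_message)
```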

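Could you show what the final output for this example should look like? First of all, I need to know which key the value returned from s_trim_concat should be associated with.

Update

Try this: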

import pandas as pd
source = pd.DataFrame({'Country' : ['USA', 'USA', 'Russia','USA'], 
                  'City' : ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

#trim column value selecting first two symbols
def s_trim(x):
    return x[:2]

#make new column from two selecting first two symbols from each
def s_trim_concat(x,y):
    return '%s-%s' % (x[:2],y[:2])

features = [
    ('trim',['Country'],s_trim),
    ('trim1',['Country','City'],s_trim_concat),
    ('trim2',['City','Country'],s_trim_concat)
    ]

for feature_name, columns, func in features:
    source[feature_name] = apply(func, columns)

print(source)

Maybe I have found a solution:

import pandas as pd
source = pd.DataFrame({'Country' : ['USA', 'USA', 'Russia','USA'], 
                  'City' : ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

#trim column value selecting first two symbols
def s_trim(x):
    return x.str[:2]

#make new column from two selecting first two symbols from each
def s_trim_concat(x,y):
    return '%s-%s' % (x[:2],y[:2])

features = [
    ('trim',['Country'],s_trim),
    ('trim1',['Country','City'],s_trim_concat),
    ('trim2',['City','Country'],s_trim_concat)
    ]

for feature_name, columns, func in features:
    source[feature_name] = source[columns].apply(
        func if len(columns) == 1 
        else lambda x: func(x.iloc[0], x.iloc[1]), axis=1)
print(source)

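What is this supposed to do?

Before solving a classification or regression task, I want to add new transformed columns, that is, clean the source data or normalize it. In my example code, I want s_trim to cut a column's values down to their first two characters, and s_trim_concat to build one column from the two-character prefixes of two columns. For example, 'USA' and 'New-York' should give 'US-Ne'. So 'trim' should be 'US', 'US', 'Ru', 'US', and 'trim1' should be 'US-Ne', 'US-Ne', 'Ru-Sa', 'US-Ne'. I had given the wrong column names in my post; this is now corrected.

The apply function in your solution does not work: Python cannot find it.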
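
On that last point, the built-in `apply(func, args)` existed in Python 2 and was removed in Python 3, where `func(*args)` does the same job. The sketch below builds on that idea and is only one possible approach, not the thread's accepted answer. It assumes the functions work on plain Python strings. It walks the selected columns in parallel with `zip` and unpacks each row's values into the function, so `s_trim` and `s_trim_concat` can stay in their original simple form and no `Series` is built per row. Performance was not measured here.

```python
import pandas as pd

source = pd.DataFrame({'Country': ['USA', 'USA', 'Russia', 'USA'],
                       'City': ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

# trim a value to its first two characters
def s_trim(x):
    return x[:2]

# join the two-character prefixes of two values
def s_trim_concat(x, y):
    return '%s-%s' % (x[:2], y[:2])

features = [
    ('trim', ['Country'], s_trim),
    ('trim1', ['Country', 'City'], s_trim_concat),
    ('trim2', ['City', 'Country'], s_trim_concat),
]

for feature_name, columns, func in features:
    # zip iterates the chosen columns side by side; *values unpacks one
    # row's values into the function's positional parameters.
    source[feature_name] = [func(*values)
                            for values in zip(*(source[c] for c in columns))]

print(source)
```

With the sample data, the 'trim' column comes out as 'US', 'US', 'Ru', 'US' and 'trim1' as 'US-Ne', 'US-Ne', 'Ru-Sa', 'US-Ne', which matches the output described above.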