Python: replace strings in a DataFrame column when the row contains only one word

Tags: python, python-3.x, pandas

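I have fairly messy data, and I am trying to replace rows that contain only a single word or token with an empty string.

Here is the original data: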

import pandas as pd

df = pd.DataFrame({'some_text': [
        'I enjoy read Mark Twain\'s Books',
        'Library is very useful',
        '/',
        '\\',
        '/ /',
        '',
        'I enjoy read Mark Twain\'s Books',
        'an',
        'the',
        'Books are interesting'
]})
I tried the following, but it deletes the rows. I do not want to delete the rows, only replace their contents.

count = df['some_text'].str.split().str.len()
df[~(count==1)]
Desired final output:

I enjoy read Mark Twain's Books
Library is very useful


/ /

I enjoy read Mark Twain's Books


Books are interesting

You can apply a transformation to the column without using a mask:

df['replaced_text'] = df['some_text'].apply(lambda x: '' if len(x.strip().split()) == 1 else x)
print(df.to_string())

                         some_text                    replaced_text
0  I enjoy read Mark Twain's Books  I enjoy read Mark Twain's Books
1           Library is very useful           Library is very useful
2                                /                                 
3                                \                                 
4                              / /                              / /
5                                                                  
6  I enjoy read Mark Twain's Books  I enjoy read Mark Twain's Books
7                               an                                 
8                              the                                 
9            Books are interesting            Books are interesting

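Much like what you tried, the lambda function splits each string on whitespace and, when the result contains exactly one token, replaces the string with an empty string ('').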
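The lambda above calls string methods on every value, so it would raise an error on missing values such as NaN or None. A minimal sketch of a variant that skips non-string values follows; the helper name `blank_single_token` and the decision to leave missing values untouched are assumptions for illustration, not part of the original answer.

```python
import pandas as pd


def blank_single_token(value):
    """Return '' for strings made of exactly one whitespace-separated token."""
    # Assumption: non-string values (NaN, None) are passed through unchanged.
    if not isinstance(value, str):
        return value
    # str.split() with no arguments ignores leading and trailing whitespace.
    return '' if len(value.split()) == 1 else value


df = pd.DataFrame({'some_text': [
    'Library is very useful', '/', '/ /', '', ' an ', None
]})
df['replaced_text'] = df['some_text'].apply(blank_single_token)
print(df.to_string())
```

Only the rows '/' and ' an ' are blanked here; the multi-token rows, the already empty string, and the missing value are left as they were.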

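You can use a simple regular expression here:

# regex=True is required on recent pandas versions, where str.replace treats the pattern literally by default
df['new_text'] = df['some_text'].str.replace(r'^\S+$', '', regex=True)
print(df.to_string())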
                         some_text                         new_text
0  I enjoy read Mark Twain's Books  I enjoy read Mark Twain's Books
1           Library is very useful           Library is very useful
2                                /                                 
3                                \                                 
4                              / /                              / /
5                                                                  
6  I enjoy read Mark Twain's Books  I enjoy read Mark Twain's Books
7                               an                                 
8                              the                                 
9            Books are interesting            Books are interesting

Building on your own approach, assign a new value to the matching rows instead of dropping them:

count = df['some_text'].str.split().str.len()
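# Restrict the assignment to the text column so any other columns are left untouched
df.loc[count == 1, 'some_text'] = ''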
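If the original column should be preserved, the same token count can drive `Series.mask`, which writes the result to a new column instead of overwriting in place. This is a sketch under that assumption; the column name `replaced_text` and the shortened sample data are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'some_text': [
    'Books are interesting', 'the', '/ /', '', '\\'
]})

# Number of whitespace-separated tokens per row (an empty string gives 0).
count = df['some_text'].str.split().str.len()

# Series.mask replaces values where the condition is True and keeps the rest.
df['replaced_text'] = df['some_text'].mask(count == 1, '')
print(df.to_string())
```

Because the condition is `count == 1`, rows with zero tokens or with several tokens keep their original value.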

Note that the regular expression `^\S+$` does not replace strings that contain a single word together with leading or trailing whitespace, but it can be adapted as needed.
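One possible adaptation, offered as a sketch rather than the answerer's own code, is to allow optional whitespace on either side of the single token. The pattern `^\s*\S+\s*$` and the sample values below are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({'some_text': [
    'an', '  the ', '/ /', 'Library is very useful', ''
]})

# \s* on both sides permits optional whitespace around exactly one token.
pattern = r'^\s*\S+\s*$'
df['new_text'] = df['some_text'].str.replace(pattern, '', regex=True)
print(df.to_string())
```

With this pattern, '  the ' is blanked along with 'an', while '/ /' and the full sentence are kept because they contain whitespace between tokens.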