Python: How do I split strings in a numpy array?

python string numpy

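I have the following table.

Because the "location" column repeats the state, I am trying to remove the state from location so that it contains only the city name: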

year    location          state   success
2009    New York, NY      NY      1
2009    New York, NY      NY      1
2009    Chicago, IL       IL      1
2009    New York, NY      NY      1
2009    Boston, MA        MA      1
2009    Long Beach, CA    CA      1
2009    Atlanta, GA       GA      1

I tried the following code:

x = KS_clean.column(1)
np.chararray.split(x, ',')

How can I split the strings so that the result contains only the city names, like this:

array('New York', 'New York', 'Chicago', ...,)

That way I can put it back into the table.

Sorry if this is a basic question, but I am new to Python and still learning. Thanks.

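I think you first need a DataFrame, and then str.split, selecting the first item of each resulting list with str[0] (e.g. df['location'].str.split(', ').str[0]).

Finally, if necessary, convert to a numpy array: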

arr = df.values
print (arr)
[[2009 'New York' 'NY' 1]
 [2009 'New York' 'NY' 1]
 [2009 'Chicago' 'IL' 1]
 [2009 'New York' 'NY' 1]
 [2009 'Boston' 'MA' 1]
 [2009 'Long Beach' 'CA' 1]
 [2009 'Atlanta' 'GA' 1]]

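Your data looks like a pandas DataFrame, not a numpy array. Please check.

It is a pandas DataFrame, but when I extract the column (var x) and check its type, it shows numpy.ndarray.

How did you get the DataFrame? That looks strange. When you select a column, you should get a Series, not a numpy array.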
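If the extracted column really is a numpy.ndarray of strings, as reported in the comment above, the state can also be stripped without pandas. The following is only a minimal sketch: the array x is a hand-built stand-in for the question's column, and it assumes every entry has the form "City, ST". It uses np.char.partition, which splits each element at the first occurrence of the separator.

```python
import numpy as np

# Hypothetical stand-in for the column extracted in the question (x).
x = np.array([
    "New York, NY", "New York, NY", "Chicago, IL", "New York, NY",
    "Boston, MA", "Long Beach, CA", "Atlanta, GA",
])

# np.char.partition returns an array of shape (n, 3) per input of shape (n,):
# text before the separator, the separator itself, and the text after it.
cities = np.char.partition(x, ", ")[:, 0]

# Equivalent plain-Python version, which also works for dtype=object arrays.
cities_alt = np.array([s.split(",")[0].strip() for s in x])

print(cities)
```

Either result is a one-dimensional array with one city name per row, so it can be written back in place of the original column.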

df['location'] = df['location'].str.split(', ').str[0]
print (df)
   year    location state  success
0  2009    New York    NY        1
1  2009    New York    NY        1
2  2009     Chicago    IL        1
3  2009    New York    NY        1
4  2009      Boston    MA        1
5  2009  Long Beach    CA        1
6  2009     Atlanta    GA        1
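The snippet above assumes that df already exists. A self-contained sketch follows, with the table from the question rebuilt by hand (the construction of df is an assumption, since the original post does not show how the data was loaded):

```python
import pandas as pd

# Rebuild the question's table by hand; in practice it would be loaded
# from a file or another source.
df = pd.DataFrame({
    "year": [2009] * 7,
    "location": [
        "New York, NY", "New York, NY", "Chicago, IL", "New York, NY",
        "Boston, MA", "Long Beach, CA", "Atlanta, GA",
    ],
    "state": ["NY", "NY", "IL", "NY", "MA", "CA", "GA"],
    "success": [1] * 7,
})

# Split each location on ", " and keep only the first piece (the city).
df["location"] = df["location"].str.split(", ").str[0]

# Convert to a numpy array only if one is actually needed.
arr = df.values

print(df)
```

Because the result is assigned back to df["location"], the column is replaced in place and the rest of the table is left unchanged.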
