Python: How to count the frequency of numbers in a CSV column


python pandas csv


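I want to count the numbers that appear in a column of a CSV file. This is my code so far. I can extract the digits from a row, but what I actually want is to know whether each row contains a number (return 1 if it does, 0 otherwise) and how many numbers each row contains.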

import pandas as pd

news = pd.read_csv("news.csv")

news['numbers'] = news['STORY'].str.extract(r'([\d:]+)')  # this gives the digits themselves (first match only)

Sample of my news.csv

ID      STORY
 1       The theme underlined 3 key messages. 1 of it is..
 2       14th February is a Valentines Day
 3       Today is Monday

Desired output

ID      STORY                                               existnumbers     howmanynumbers
 1       The theme underlined 3 key messages. 1 of it is..     1                 2
 2       14th February is a Valentines Day                     1                 1
 3       Today is Monday                                       0                 0

This can be done with str.count, as follows:

news["howmany"] = news["STORY"].str.count(r"\d+")
news["existnumbers"] = news["howmany"] != 0

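Note that existnumbers is a boolean field here, where True means that at least one number was found in the STORY string. If you need an integer field, you can convert it as follows: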

news["existnumbers"] = news["existnumbers"].astype(int)
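As a rough end-to-end sketch of the steps above, the snippet below runs the same two operations on the sample rows from the question. The inline comma-separated `csv_text` is an assumption standing in for the real news.csv, whose exact delimiter is not shown; the output column names follow the question's desired output.

```python
import io

import pandas as pd

# Hypothetical inline stand-in for news.csv, built from the question's sample rows.
csv_text = """ID,STORY
1,The theme underlined 3 key messages. 1 of it is..
2,14th February is a Valentines Day
3,Today is Monday
"""

news = pd.read_csv(io.StringIO(csv_text))

# Count runs of consecutive digits in each story.
news["howmanynumbers"] = news["STORY"].str.count(r"\d+")

# 1 if the story contains at least one number, otherwise 0.
news["existnumbers"] = (news["howmanynumbers"] > 0).astype(int)

print(news[["ID", "existnumbers", "howmanynumbers"]])
```

Comparing first and converting with astype(int) afterwards keeps the whole computation vectorized.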

To match a non-integer (such as 10.1) as a single number, you can use:

news["howmany"] = news["STORY"].str.count(r"\d+(\.\d+)?")
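To illustrate the difference between the two patterns, here is a small sketch. The sample strings are made up for illustration and do not come from the question's data.

```python
import pandas as pd

# Hypothetical sample text, chosen to include a decimal value.
stories = pd.Series([
    "Growth was 10.1% in 2012",
    "Today is Monday",
])

# \d+ treats "10" and "1" as separate numbers because the dot splits them.
digits_only = stories.str.count(r"\d+")

# The optional group lets "10.1" match as one number.
with_decimals = stories.str.count(r"\d+(\.\d+)?")

print(digits_only.tolist())    # [3, 0]
print(with_decimals.tolist())  # [2, 0]
```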

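existnumbers would be clearer as a boolean column.

.str.count counts every digit; for example, "in 2012, there ...." contains only 1 number.

I have updated the regex. It now counts 2012 as a single number in the string.

What about something like 10.1%? Should that also count as just 1 number?

Yes, you can change the regex to use \d+(\.\d+)?, which optionally matches a decimal point and further digits.

I would avoid using apply unnecessarily; it is better to use more efficient vectorized operations.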

import pandas as pd
from io import StringIO


data = StringIO("""
id  STORY
1  The theme underlined 2013 key messages. 1 of it is
2  14th February is a Valentines Day
3  Today is Monday
""")


df = pd.read_csv(data, sep='  ', engine='python')

# Raw string so that \d is not treated as an invalid string escape
df['howmanynumbers'] = df['STORY'].str.count(r'(\d+)')
df['existnumbers'] = df['howmanynumbers'].apply(lambda x: 1 if x > 0 else 0)

Output:

   id                                              STORY  howmanynumbers  existnumbers
0   1  The theme underlined 2013 key messages. 1 of i...               2             1
1   2                  14th February is a Valentines Day               1             1
2   3                                    Today is Monday               0             0
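The lambda passed to apply in the code above can be replaced by a vectorized comparison, which is what the comment about avoiding apply suggests. The following is one possible sketch, not the answerer's own code; the DataFrame is rebuilt inline here only so that the snippet runs on its own.

```python
import pandas as pd

# Same three stories as in the example above, built inline for self-containment.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "STORY": [
        "The theme underlined 2013 key messages. 1 of it is",
        "14th February is a Valentines Day",
        "Today is Monday",
    ],
})

df["howmanynumbers"] = df["STORY"].str.count(r"\d+")

# Vectorized replacement for .apply(lambda x: 1 if x > 0 else 0)
df["existnumbers"] = df["howmanynumbers"].gt(0).astype(int)

print(df[["id", "howmanynumbers", "existnumbers"]])
```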