TypeError: 'in <string>' requires string as left operand, not generator in Python


I am trying to parse Twitter data.

My data looks like this:

59593936 3061025991 null null <d>2009-08-01 00:00:37</d> <s>&lt;a href="http://help.twitter.com/index.php?pg=kb.page&amp;id=75" rel="nofollow"&gt;txt&lt;/a&gt;</s> <t>honda just recalled 440k accords...traffic around here is gonna be light...win!!</t> ajc8587 15 24 158 -18000 0 0 <n>adrienne conner</n> <ud>2009-07-23 21:27:10</ud> <t>eastern time (us &amp; canada)</t> <l>ga</l>
22020233 3061032620 null null <d>2009-08-01 00:01:03</d> <s>&lt;a href="http://alexking.org/projects/wordpress" rel="nofollow"&gt;twitter tools&lt;/a&gt;</s> <t>new blog post: honda recalls 440k cars over airbag risk http://bit.ly/2wsma</t> madcitywi 294 290 9098 -21600 0 0 <n>madcity</n> <ud>2009-02-26 15:25:04</ud> <t>central time (us &amp; canada)</t> <l>madison, wi</l>

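I think my code has many problems, but the first error is this one:

Traceback (most recent call last):
  File "health_related_tweets.py", line 23, in <module>
    if keywords in line:
TypeError: 'in <string>' requires string as left operand, not generator

Please help me.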

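The reason is that

keywords = get_keywords(...)

returns a generator, and the in operator on a string only accepts another string on its left-hand side. Logically, keywords should be a list of all the keywords. For each keyword in that list, you need to check whether it occurs in the tweet/line.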
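To make the cause concrete, here is a minimal reproduction. It is only a sketch: the get_keywords below is a hypothetical stand-in that takes a list of lines instead of a file path, since the question's real function is not shown.

```python
def get_keywords(lines):
    """Hypothetical stand-in for the question's get_keywords: yields one keyword per line."""
    for line in lines:
        yield line.strip()


line = "honda just recalled 440k accords"

keywords = get_keywords(["flu", "honda"])  # a generator object, not a string
try:
    keywords in line  # a str only accepts another str on the left of "in"
    error_message = None
except TypeError as exc:
    error_message = str(exc)  # the "requires string as left operand" message

# Testing each keyword on its own works, because each one is a str.
found = [kw for kw in get_keywords(["flu", "honda"]) if kw in line]
```

After running this, error_message holds the TypeError text and found contains only the keywords that actually occur in the line.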

Sample code:

keywords = get_keywords('./related_keywords.txt', 'r')
has_keyword = False
for keyword in keywords:
    # Each keyword is a single string, so the substring test is valid
    if keyword in line:
        has_keyword = True
        break
if has_keyword:
    # Your code here (for the case when the line has at least one keyword)
    pass
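The same check can be written more compactly with the built-in any(). The sketch below assumes the keywords have already been collected into a list. That matters when many tweets are scanned, because a generator can be iterated only once and would be empty by the second tweet. The helper name line_has_keyword and the third sample tweet are made up for illustration.

```python
def line_has_keyword(line, keywords):
    """Return True if at least one keyword occurs as a substring of line."""
    return any(keyword in line for keyword in keywords)


# A list (not a generator), so it can be reused for every tweet.
keywords = ["flu", "recall", "airbag"]

tweets = [
    "honda just recalled 440k accords...traffic around here is gonna be light...win!!",
    "new blog post: honda recalls 440k cars over airbag risk http://bit.ly/2wsma",
    "good morning everyone",  # hypothetical tweet with no keyword
]

matching = [tweet for tweet in tweets if line_has_keyword(tweet, keywords)]
```

If get_keywords is a generator function, wrapping the call as list(get_keywords(...)) once, before the loop over tweets, gives a reusable list.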

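(The code above replaces the "if keywords in line:" check.)

I'm getting another error:

Traceback (most recent call last):
  File "health_related_tweets.py", line 25, in <module>
    for keyword in keywords:
  File "health_related_tweets.py", line 13, in get_keywords
    yield line.split().lower()
AttributeError: 'list' object has no attribute 'lower'

I think I need to convert both the keywords and the tweets being parsed to lowercase, so I added ".lower" to the code, but it raises an error. How can I fix it?

That makes sense too. line.split() returns a list (of strings), while lower() is a string method. Can you give me an example of related_keywords.txt?

related_keywords.txt contains words like this: Dentist, Depression, Placebo, X-rays, X-ray, HIV, Blood preasure, Flu. (The entries are separated by newlines: each word, such as HIV or X-ray, and each phrase, such as Blood preasure, is written on its own line. That is why I split it with ".split()".)

I put the example keywords in the main text! Thank you for your help!

Great. Ideally you don't need split at all, because you don't want to break a phrase like "blood pressure" into ["blood", "pressure"]. You are looking for whole words in the text, so I think you need regular expressions, the standard tool for extracting data from text. See the re module.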
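For the AttributeError, one possible fix is to lowercase the stripped line instead of the result of split(). The function below is an assumption about what get_keywords might look like, since the question's actual function is not shown. Only the (path, mode) call signature is taken from the sample code, and the temporary file stands in for related_keywords.txt.

```python
import os
import tempfile


def get_keywords(path, mode="r"):
    """Yield one lowercased keyword or phrase per non-empty line of the file."""
    with open(path, mode) as handle:
        for line in handle:
            # strip() returns a str (unlike split(), which returns a list),
            # so lower() can be called on it directly.
            keyword = line.strip().lower()
            if keyword:
                yield keyword


# Demo file standing in for related_keywords.txt
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("HIV\nBlood preasure\n\nX-rays\n")
    demo_path = tmp.name

keywords = list(get_keywords(demo_path))  # materialize so the list can be reused
os.remove(demo_path)
```

Multi-word phrases stay intact because each line is kept whole. The tweet text would need the same lower() call before comparison.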
Example contents of related_keywords.txt:

Depression
Placebo
X-rays
X-ray
HIV
Blood preasure
Flu
Fever
Oral Health
Antibiotics
Diabetes
Mellitus
Genetic disorders
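The regular-expression suggestion could be sketched as follows, using a few of the keywords listed above. The helper names build_keyword_pattern and find_keywords are hypothetical. The lookarounds are one way, among others, to require that a keyword is not embedded in a longer word.

```python
import re


def build_keyword_pattern(keywords):
    """Compile one case-insensitive pattern matching any keyword as a whole word or phrase."""
    # Longest first, so "X-rays" is tried before "X-ray".
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(keyword) for keyword in ordered)
    # (?<!\w) and (?!\w) reject matches that sit inside a longer word.
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


keywords = ["Depression", "X-rays", "X-ray", "HIV", "Flu", "Oral Health"]
pattern = build_keyword_pattern(keywords)


def find_keywords(text):
    """Return every keyword occurrence found in text, lowercased."""
    return [match.group(0).lower() for match in pattern.finditer(text)]
```

With this pattern, "flu" is found in "no flu" but not inside "influential", which a plain substring test would report as a match.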
