How do I filter out escape sequences when scraping a table with CSS selectors? (html-table, scrapy, css-selectors)

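I am trying to scrape a table using CSS selectors in Scrapy. My approach is to scrape the table row by row into a single scrapy.Field() in the item object.

However, the scraped data contains a "\n\t\t" element between every other element of the table. How can I remove these during scraping? I could post-process the data, but I would like to understand what is going on.

My parse method: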

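from scrapy.loader import ItemLoader

def parse_product(self, response):
    # KdramaItem is the item class defined elsewhere in the project
    l = ItemLoader(item=KdramaItem(), response=response)
    l.add_value('url', response.meta['source_url'])
    # Restrict the selectors that follow to the <table> element
    table_loader = l.nested_css('table')
    # 'tr ::text' selects every descendant text node of each row,
    # including the whitespace-only nodes between the cells
    table_loader.add_css('table', 'tr ::text')

    yield l.load_item()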

Partial output:

"url": ["http://www.koreandrama.org/angels-last-mission-love/"], "table": ["\n\t\t", "Date", "\n\t\t", "Ep", "\n\t\t", "TNmS", "\n\t\t", "TNmS", "\n\t\t", "AGB", "\n\t\t", "AGB", "\n\t", "\n\t\t", "\u00a0", "\n\t\t", "\u00a0", "\n\t\t", "Nationwide", "\n\t\t", "Seoul", "\n\t\t", "Nationwide", "\n\t\t", "Seoul", "\n\t", "\n\t\t",

Well, \n\t\t is not an element; it is just harmless whitespace.

@MrLister I mean that it is an element in the table list.
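To illustrate the point made in the comments: the newlines and tabs used to indent the HTML source between the cell tags are text nodes in their own right, so a selector that collects every text node returns them alongside the cell contents. Below is a minimal sketch using only the standard library, so it does not depend on Scrapy. The `TextCollector` class, the helpers `strip_or_none` and `clean_cells`, and the HTML snippet are all hypothetical stand-ins, not taken from the page being scraped.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects every text node, roughly what a '::text' selector would return."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        # Called for each run of text between tags, whitespace included
        self.texts.append(data)


def strip_or_none(value):
    """Return the stripped text, or None if the value is whitespace only."""
    stripped = value.strip()
    return stripped or None


def clean_cells(values):
    """Strip each value and drop the ones that were whitespace only."""
    cleaned = (strip_or_none(v) for v in values)
    return [v for v in cleaned if v is not None]


# Hypothetical snippet imitating the indentation of the scraped table
html = "<table><tr>\n\t\t<td>Date</td>\n\t\t<td>Ep</td>\n\t</tr></table>"

collector = TextCollector()
collector.feed(html)
raw = collector.texts      # includes the whitespace between the <td> tags
cells = clean_cells(raw)   # only the cell contents remain

print(raw)
print(cells)
```

In a Scrapy project, a function like `strip_or_none` could probably be attached to the field as an input processor through `MapCompose`, which, as far as I know, discards values for which the function returns None. Note that `str.strip()` also treats the non-breaking space ("\u00a0") as whitespace, so the empty header cells in the output above would be dropped as well; if column alignment matters, it may be safer to select the `td` elements and read one value per cell.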