How to get data from dataLayer.push using Python web scraping

Tags: python, web-scraping, scrapy

My code is:

        # init scrapy selector
        response = Selector(text=content)
        
        json_data = json.loads(script.get() for script in (re.findall(r'dataLayer\.push\(([^)]+)'),response.css('script::text'))).group(1)
        print(json_data)
    
    # debug data extraction logic
    HummartScraper.parse_product(HummartScraper, '')
The error output is:

Traceback (most recent call last):
  File "hummart2.py", line 86, in parse_product
    json_data = json.loads(script.get() for script in (re.findall(r'dataLayer\.push\(([^)]+)'),response.css('script::text'))).group(1)
TypeError: findall() missing 1 required positional argument: 'string'

Why does this error occur?
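The traceback points at the call `re.findall(r'dataLayer\.push\(([^)]+)')`: the closing parenthesis comes right after the pattern, so `findall` receives the pattern but no string to search. The following is a minimal reproduction using only the standard library; `script_text` is a made-up stand-in for the real page content, which is not shown in the question.

```python
import re

PATTERN = r'dataLayer\.push\(([^)]+)'

# Hypothetical script content, used only for illustration.
script_text = 'dataLayer.push({"event": "productView", "id": 42});'

# Reproduces the error: the pattern is passed, the string is missing.
try:
    re.findall(PATTERN)
except TypeError as exc:
    error_message = str(exc)

# Correct call: pattern first, then the text to search.
matches = re.findall(PATTERN, script_text)
```

Note also that `re.findall` returns a list of strings, so the trailing `.group(1)` in the original line would fail even after the argument is supplied.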

For a single dataLayer:

    data_layer = response.css('script::text').re_first(r'dataLayer\.push\(([^)]+)')
    data = json.loads(data_layer)

You can use response.css(…).re() to get a list of matches.
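When a page contains several `dataLayer.push(...)` calls, every match can be collected and decoded in turn. The sketch below uses the standard library's `re.findall` as a stand-in for the selector's `.re()` method so that it runs without Scrapy; the `html` string is invented for illustration and assumes each pushed argument is strict JSON.

```python
import json
import re

PATTERN = re.compile(r'dataLayer\.push\(([^)]+)')

# Hypothetical page source with two pushes, used only for illustration.
html = '''
<script>dataLayer.push({"event": "pageView"});</script>
<script>dataLayer.push({"event": "productView", "price": 120});</script>
'''

# One captured string per dataLayer.push(...) call.
raw_payloads = PATTERN.findall(html)

# Decode each payload; this only works when the argument is strict JSON.
payloads = [json.loads(raw) for raw in raw_payloads]
```

With these inputs, `payloads` holds one dictionary per push, in page order.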

But this gives me the following error:

 File "hummart2.py", line 88, in parse_product
    data = json.loads(data_layer_raw)[1]
  File "/home/danish-khan/miniconda3/lib/python3.7/json/__init__.py", line 341, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not NoneType
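This `TypeError` means the value handed to `json.loads` was `None`, which is what `re_first` returns when the pattern matches nothing. One possible way to guard against that is sketched below with the standard library only. The helper name `extract_data_layer` is hypothetical, and the approach (locating the start of the call with a regex, then letting `json.JSONDecoder.raw_decode` find the end of the object) is a substitute for the `[^)]+` pattern, chosen because that pattern stops at the first `)` even when it occurs inside a value. It still assumes the pushed argument is strict JSON.

```python
import json
import re

# Matches the call up to the opening brace of its argument.
PUSH_START = re.compile(r'dataLayer\.push\(\s*(?=\{)')


def extract_data_layer(text):
    """Return the first dataLayer.push payload as a dict, or None."""
    match = PUSH_START.search(text or '')
    if match is None:
        # Nothing matched: report it instead of passing None to json.loads.
        return None
    try:
        # raw_decode parses one JSON value starting at the given index and
        # ignores whatever follows it, such as the closing ");".
        payload, _end = json.JSONDecoder().raw_decode(text, match.end())
    except json.JSONDecodeError:
        # The argument is a JavaScript literal that is not strict JSON.
        return None
    return payload
```

In a Scrapy callback, the helper could be applied to each string from `response.css('script::text').getall()` until one returns a non-`None` result.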

But it gives me this type of error.

What error do you get?

File "hummart2.py", line 88, in parse_product: data = json.loads(data_layer_raw)[1]. File "/home/danish-khan/miniconda3/lib/python3.7/json/__init__.py", line 341, in loads: raise TypeError(f'the JSON object must be str, bytes or bytearray, '. TypeError: the JSON object must be str, bytes or bytearray, not NoneType

It looks like your regular expression is wrong. Can you show me the source page?

This is the source page: 'view source:'