Python Beautiful Soup | Extracting a variable from JavaScript

python regex web-scraping

I am using BeautifulSoup to scrape data from an HTML page that has several columns under the table body.

Please find the mock code below:

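from bs4 import BeautifulSoup
import requests
import urllib.request as urllib2
import re
import json

# myUrl is the address of the page being scraped (not shown in the question)
app_page = urllib2.urlopen(myUrl)
soup = BeautifulSoup(app_page, "html.parser")  # name the parser explicitly
print(soup.prettify())

# the eighth <script> tag on the page holds the table data
data = soup.find_all("script")[7]
data = re.sub("\n", "", str(data))  # str(data) keeps the <script> wrapper
print(data)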

Output:

var appsTableData=[[

Use .string to get the text of the tag, then use str.replace.

Example:
data = soup.find_all("script")[7].string
print(data.replace("var appsTableData=", ""))

Output:
[[<"<a href='Something'/>"]]
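For anyone wondering why .string makes a difference, here is a minimal, self-contained sketch. The HTML snippet and the array contents are made up for illustration; only the variable name appsTableData comes from the question. It shows that str(tag) keeps the <script> wrapper, while tag.string returns only the JavaScript source inside it.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, standing in for the real page in the question.
html = """<html><body>
<script type="text/javascript">var appsTableData=[["a", "b"]]</script>
</body></html>"""

soup = BeautifulSoup(html, "html.parser")
tag = soup.find("script")

as_markup = str(tag)   # whole element, including the <script ...> wrapper
as_text = tag.string   # only the JavaScript source inside the tag

# Drop the assignment prefix so only the array literal is left.
value = as_text.replace("var appsTableData=", "").strip()
print(value)
```

Note that str.replace only matches the exact prefix, so a page that writes "var appsTableData = [[" with spaces around the equals sign would need a different prefix or a regex.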

Using BeautifulSoup and re.compile:

import re
from bs4 import BeautifulSoup

data = '''<script type="text/javascript">              var appsTableData=[[<"<a href='Something'/>"]]</script>'''
soup = BeautifulSoup(data, "html.parser")

# find the <script> tag whose text contains the variable
withbs = soup.find('script', string=re.compile('var appsTableData'))
withbs = withbs.text.replace('var appsTableData=', '').strip()
print(withbs)

Result:
[[<"<a href='Something'/>"]]
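If the array on the real page happens to be valid JSON, the extracted string can be turned into Python lists with json.loads. This is only a sketch under that assumption: the placeholder value shown in the question is not valid JSON, and the sample rows and the trailing semicolon below are invented for illustration.

```python
import json
import re
from bs4 import BeautifulSoup

# Hypothetical script content; assumes the array literal is valid JSON.
html = '''<script type="text/javascript">
    var appsTableData=[["App One", 3], ["App Two", 5]];
</script>'''

soup = BeautifulSoup(html, "html.parser")
script = soup.find("script", string=re.compile("var appsTableData"))

# Remove the assignment prefix, surrounding whitespace and the final semicolon.
literal = script.string.replace("var appsTableData=", "").strip().rstrip(";")

rows = json.loads(literal)  # nested Python lists
print(rows[0][0])
```

If the literal uses single quotes, unquoted keys or embedded HTML, json.loads will raise an error and the string has to be cleaned up first.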

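Your question is hard to follow. The attempt fails because *? is lazy, so it matches as little as possible. Switching to a greedy quantifier does not work either: the pattern needs something that tells it where to stop matching.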
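The regex from the question is not preserved in this thread, so the patterns below are assumptions that only illustrate the point about lazy and greedy matching. The JavaScript string is invented, and the explicit stop marker chosen here is the closing "]];" of the array.

```python
import re

# Hypothetical script text with a second statement after the one we want.
js = 'var appsTableData=[["a", 1], ["b", 2]]; var other=[[9]];'

# Lazy with nothing after the group: matches as little as possible.
lazy = re.search(r"var appsTableData=(.*?)", js)

# Greedy: runs to the end of the line and swallows the next statement too.
greedy = re.search(r"var appsTableData=(.*)", js)

# Lazy plus an explicit terminator: stops at the first "]];".
anchored = re.search(r"var appsTableData=(\[\[.*?\]\]);", js)

print(repr(lazy.group(1)))
print(greedy.group(1))
print(anchored.group(1))
```

The anchored pattern assumes "]];" does not occur inside the data itself. If the script spans several lines, re.DOTALL is also needed so that "." can match newlines.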



