How can I modify BeautifulSoup's get_text function to produce the format I need?

I want to scrape a web page. I'm using BeautifulSoup:

import requests
from bs4 import BeautifulSoup

url="https://www.blockchain.com/btc/block/00000000000000000011898368c395f1c35d56ea9109d439256d935a4fe7d656"
page=requests.get(url)
soup=BeautifulSoup(page.text,'html.parser')
block_details=soup.find(class_="hnfgic-0 jlMXIC")
print(block_details.get_text())
The output is:

Hash00000000000000000011898368c395f1c35d56ea9109d439256d935a4fe7d656Confirmations8Timestamp2019-11-21 17:52Height604806MinerSlushPoolNumber of Transactions2,003Difficulty12,973,235,968,799.78Merkle root49ee8cb431ef3e613fdc9ac3146335d1a608a0e6afb5cf9ab44c9ddc51acfbe9Version0x20000000Bits387,297,854Weight3,993,364 WUSize1,355,728 bytesNonce849,455,972Transaction Volume4560.73542334 BTCBlock Reward12.50000000 BTCFee Reward0.19346486 BTC
But I want the output to be:

Hash
00000000000000000011898368c395f1c35d56ea9109d439256d935a4fe7d656
Confirmations
8
Timestamp
2019-11-21 17:52
Height
604806
.
.
.
I plan to use the strsplit function on this string, so a newline separator between the pieces of text would let me break it apart with strsplit("\n"). Please help.


Edit: Selenium's .text attribute produces the output I want, but I need to solve this with BeautifulSoup.

You can pass the separator='\n' argument to the get_text() method:

import requests
from bs4 import BeautifulSoup

url="https://www.blockchain.com/btc/block/00000000000000000011898368c395f1c35d56ea9109d439256d935a4fe7d656"
page=requests.get(url)
soup=BeautifulSoup(page.text,'html.parser')
block_details=soup.find(class_="hnfgic-0 jlMXIC")
print(block_details.get_text(separator='\n'))  # <-- note the separator parameter
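If it helps to try this without hitting the network, here is a minimal sketch of the same idea on inline markup. The SAMPLE_HTML structure, the "block-details" class name, and the helpers text_lines and pair_up are all hypothetical stand-ins for illustration; the real page's markup may differ. It also passes strip=True, which trims each text fragment and drops whitespace-only ones, and then splits on the newline, which is what the question wanted strsplit for.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the block-details element.
# The real page's tags and class names are assumptions here.
SAMPLE_HTML = """
<div class="block-details">
  <div><span>Hash</span><span>00000000000000000011898368c395f1c35d56ea9109d439256d935a4fe7d656</span></div>
  <div><span>Confirmations</span><span>8</span></div>
  <div><span>Timestamp</span><span>2019-11-21 17:52</span></div>
  <div><span>Height</span><span>604806</span></div>
</div>
"""


def text_lines(element):
    """Return the element's text fragments as a list, one fragment per entry."""
    # separator='\n' inserts a newline between text nodes; strip=True trims
    # each fragment and skips the ones that are only whitespace.
    return element.get_text(separator='\n', strip=True).split('\n')


def pair_up(lines):
    """Map alternating label/value lines to a dict.

    Assumes the lines strictly alternate label, value, label, value, ...
    """
    return dict(zip(lines[0::2], lines[1::2]))


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
block_details = soup.find(class_="block-details")

lines = text_lines(block_details)
print('\n'.join(lines))   # one label or value per line
print(pair_up(lines))     # labels mapped to their values
```

The pairing step only works if every label is followed by exactly one text fragment. If a value on the real page is split across several tags, it would arrive as more than one line and the pairs would shift, so check the list of lines before relying on the dict.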