Python: changing scraped strings in a list (convert to float and back)
I'm practicing web scraping and I've ended up with a list of prices. I'm not very familiar with lists and how they work, so I'm unsure how to proceed, but I want to convert USD to AUD, which is roughly a ratio of 1 USD to 1.32 AUD. I assume the strings first need to be eval()'d into a list of floats and then just multiplied by 1.32, but I'm not sure how to actually do the conversion:
from tkinter import *
from re import findall, MULTILINE
rss = open('rss.xhtml', encoding="utf8").read()
# prints 10 price values
regex_test = findall(r'([0-9]+[.]*[0-9]*) USD', rss)
price = ["$" + value for value in regex_test]
for cost in range(10):
    print(price[cost])
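This prints 10 prices. What I want is output where => denotes the conversion to the new price, i.e. $20 becomes $26.40 AUD (for example, $20.00 => $26.40).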
A range of 10 is used because I don't want to scrape hundreds of entries, only a few from the top.

To make the for loop more Pythonic:
from tkinter import *
from re import findall, MULTILINE
rss = open('rss.xhtml', encoding="utf8").read()
# collect every number that is followed by " USD"
regex_test = findall(r'([0-9]+[.]*[0-9]*) USD', rss)
price = ["$" + value for value in regex_test]
# iterate over the list directly; use price[:10] to keep only the first ten
for individual_price in price:
    print(individual_price)
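To convert the list to AUD, assuming you just want to multiply by a fixed rate, it is best in your code to go back to the list as it was before the dollar sign was added: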
aud_usd_ratio = 1.32 # 1.32 AUD to 1 USD
aud_price_list = ["$" + str(float(x)*aud_usd_ratio) for x in regex_test]
print(aud_price_list)
If you need two decimal places, you can also use string formatting:
aud_price_list = ["${:.2f}".format(float(x)*aud_usd_ratio ) for x in regex_test]
print(aud_price_list)
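The question mentions eval(), but float() is enough to turn a string such as '20.00' into a number, and it does not execute the string as code. As a rough sketch of the same conversion wrapped in a reusable function (the name convert_prices and the sample list are made up for illustration; the sample list stands in for regex_test):

```python
def convert_prices(price_strings, rate=1.32):
    """Return AUD prices formatted as '$x.xx' for a list of USD price strings."""
    converted = []
    for text in price_strings:
        usd = float(text)  # parse the scraped string into a number
        converted.append("${:.2f}".format(usd * rate))  # back to a string
    return converted


# Hypothetical stand-in for the regex_test list returned by findall()
sample_matches = ["20.00", "16.00", "189.00"]
aud_prices = convert_prices(sample_matches)
print(aud_prices)
```

Keeping the rate as a parameter means the 1.32 figure lives in one place and can be swapped for a current rate.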
Assuming regex_test is the same as my prices_list_usd:
prices_list_usd = [11.11,12.22,21.324,3.11]
usd_aud_ratio = 1.32
prices_list_aud = [price*usd_aud_ratio for price in prices_list_usd]
combined_list = zip(prices_list_usd,prices_list_aud)
for usd, aud in combined_list:
    print("$USD {0} => $AUD {1}".format(usd, aud))
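One thing to watch: findall() returns strings, whereas the list above holds floats, so the matches need converting before they can be multiplied. A minimal sketch of the same zip approach starting from strings, with both sides rounded to two decimals (usd_strings is a hypothetical stand-in for regex_test):

```python
# Hypothetical stand-in for the strings returned by findall()
usd_strings = ["20.00", "16.00", "189.00"]
usd_aud_ratio = 1.32

# Convert the strings to floats first, then derive the AUD values
usd_values = [float(text) for text in usd_strings]
aud_values = [usd * usd_aud_ratio for usd in usd_values]

# Pair each USD value with its AUD value and format both to two decimals
lines = ["${:.2f} => ${:.2f}".format(usd, aud)
         for usd, aud in zip(usd_values, aud_values)]
for line in lines:
    print(line)
```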
I think you need to extract all the values, convert them to float, and then format them accordingly:
import re
# I don't know the rss file, so this is a dummy variable
rss = "$20.00 => $26.40 $20.00 => $26.40 $16.00 => $21.12 $189.00 => $249.48"
# match every number that directly follows a dollar sign
costs = re.findall(r'(?<=\$)\d+\.\d+', rss)
# cast to float and multiply by 1.32
costs = [float(cost) * 1.32 for cost in costs]
# now format them in pairs
for i in range(0, len(costs), 2):
    print("{:.2f} => {:.2f}".format(costs[i], costs[i + 1]))
# output
# 26.40 => 34.85
# 26.40 => 34.85
# 21.12 => 27.88
# 249.48 => 329.31
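If float rounding is a concern for money values, the standard library's decimal module is one alternative to float. This is a different technique from the answers here, shown only as a sketch; feed_text, RATE and the item names in it are invented for the example:

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical feed text; the real rss file may be laid out differently
feed_text = "Ring 20.00 USD, Pendant 16.00 USD, Crown 189.00 USD"
RATE = Decimal("1.32")  # assumed USD to AUD rate, as in the question

# Capture each number that is followed by " USD"
matches = re.findall(r'(\d+(?:\.\d+)?) USD', feed_text)

aud_amounts = []
for text in matches:
    usd = Decimal(text)  # exact decimal value of the scraped string
    aud = (usd * RATE).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    aud_amounts.append(aud)

for aud in aud_amounts:
    print("${}".format(aud))
```

Decimal keeps the trailing zero after quantize, so no extra format specifier is needed to get two decimal places.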
With a slight change to glycoaddict's solution, a list of updated prices (or a similar "variable") can be created, and each value can then be called from that list individually:
# imports the necessary modules
from tkinter import *
from re import findall, MULTILINE
import urllib.request
# downloads an rss feed to use; the feed is downloaded,
# then saved under the given name and format (xhtml, html, etc.)
urllib.request.urlretrieve("https://www.etsy.com/au/shop/ElvenTechnology/rss", "rss.xhtml")
# opens the downloaded file to read from; 'U' can be used instead
# of 'encoding="utf8"', however this causes issues on some feeds, for
# example this particular feed needs to be decoded as utf8, otherwise
# a decoding error occurs as shown below:
# return codecs.charmap_decode(input,self.errors,decoding_table)[0] UnicodeDecodeError:
# 'charmap' codec can't decode byte 0x9d in position 12605: character maps to <undefined>
rss = open('rss.xhtml', encoding="utf8").read()
# regex is used to find all matches within the document that was opened
# and named rss
regex_test = findall(r'([0-9]+[.]*[0-9]*) USD', rss)
# converts each returned string to the desired value and formats it (glycoaddict);
# a variable such as aud_usd_ratio = 1.32 would behave the same as writing
# 1.32 directly, it just gives the multiplier a name
AUD_price = ["${:.2f}".format(float(USD)*1.32) for USD in regex_test]
# loops 10 times; this stops rss feeds with thousands of results from
# listing endlessly. Only the first 10 values are taken out of the
# formatted/modified string list, and each value is printed
# individually, which is useful for, say, a list of labels
# in tkinter to be looped over and placed
for individual_item_price in range(10):
    print(AUD_price[individual_item_price])
Note that the rss file is downloaded and updated each time this program runs, which means the output can be treated as live prices: run it now and again an hour or a few hours later, and it may return different results.
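Indexing with range(10) raises an IndexError if the feed happens to contain fewer than ten prices. One possible way around that, sketched here with a hypothetical helper first_prices and generated sample text, is to take at most the first ten matches lazily with re.finditer and itertools.islice:

```python
import re
from itertools import islice


def first_prices(text, limit=10):
    """Return at most `limit` price strings that precede ' USD' in text."""
    matches = re.finditer(r'(\d+(?:\.\d+)?) USD', text)
    # islice stops after `limit` matches and copes with shorter inputs
    return [match.group(1) for match in islice(matches, limit)]


# Hypothetical feed text containing 25 prices: "1.50 USD 2.50 USD ..."
sample = " ".join("{}.50 USD".format(n) for n in range(1, 26))
top = first_prices(sample, limit=10)
aud_top = ["${:.2f}".format(float(usd) * 1.32) for usd in top]
print(aud_top)
```

Because finditer is lazy, the scan stops once the limit is reached instead of collecting every match in a large feed.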