Python: changing scraped strings in a list (converting to float and back)


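I'm practising scraping websites and I've ended up with a list of prices. I'm not very familiar with lists and how they work, so I'm not sure how to do this, but I want to convert the prices from USD to AUD, which is roughly a rate of 1 USD to 1.32 AUD. I assume the strings first have to be turned into a list of floats (with eval()?) and then simply multiplied by 1.32, but I'm not sure how to actually do the conversion: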

from tkinter import *
from re import findall, MULTILINE

rss = open('rss.xhtml', encoding="utf8").read()

# prints 10 price values
regex_test = findall(r'([0-9]+[.]*[0-9]*) USD', rss)
price = ["$" + regex_test for regex_test in regex_test] 
for cost in range(10):
    print(price[cost])

This prints 10 prices. Below, => shows the conversion I want for each price, e.g. $20.00 USD becomes $26.40 AUD:

$20.00 => $26.40
$16.00 => $21.12
$23.50 => $31.02
$20.00 => $26.40
$16.00 => $21.12
$189.00 => $249.48
$16.00 => $21.12

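So that the same regex can be used to get the prices, here is a similar RSS feed.

A range of 10 is used because I don't want to scrape hundreds of entries, only a few from the top.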

To make the for loop more Pythonic:

from tkinter import *
from re import findall, MULTILINE

rss = open('rss.xhtml', encoding="utf8").read()

# prints every scraped price value
regex_test = findall(r'([0-9]+[.]*[0-9]*) USD', rss)
price = ["$" + usd for usd in regex_test]
for individual_price in price:
    print(individual_price)

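To convert the list to AUD, assuming you only want to multiply by a single value, it is best in your code to go back to the list as it was before the dollar sign was added: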

aud_usd_ratio = 1.32 # 1.32 AUD to 1 USD
aud_price_list = ["$" + str(float(x)*aud_usd_ratio) for x in regex_test]
print(aud_price_list)

If you need exactly two decimal places, you can also use string formatting:

aud_price_list = ["${:.2f}".format(float(x)*aud_usd_ratio ) for x in regex_test]
print(aud_price_list)
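Multiplying floats can leave tiny binary rounding artefacts in the unformatted result, which is why the formatted version above is usually preferable. If exact decimal arithmetic matters, Python's standard `decimal` module is one alternative. The following is only a sketch: the helper name `to_aud`, the constant `AUD_PER_USD`, and the sample list (standing in for `regex_test`) are assumptions for illustration, and 1.32 is the approximate rate from the question.

```python
from decimal import Decimal, ROUND_HALF_UP

AUD_PER_USD = Decimal("1.32")  # approximate rate taken from the question


def to_aud(usd_string):
    """Convert a price string such as '20.00' to AUD, rounded to cents."""
    amount = Decimal(usd_string) * AUD_PER_USD
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical sample standing in for the scraped regex_test list.
sample = ["20.00", "16.00", "23.50", "189.00"]
aud_price_list = ["${}".format(to_aud(x)) for x in sample]
print(aud_price_list)
```

Because `Decimal` is built from the scraped string directly, no float conversion is involved at any point.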


Assuming regex_test is the same as my prices_list_usd (once its strings have been converted to floats):

prices_list_usd = [11.11,12.22,21.324,3.11]
usd_aud_ratio = 1.32
prices_list_aud = [price*usd_aud_ratio for price in prices_list_usd]
combined_list = zip(prices_list_usd,prices_list_aud)
for pair in combined_list:
    print("$USD {0} => $AUD {1}".format(pair[0],pair[1]))
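Since `findall` returns strings rather than floats, the scraped list needs one conversion step before the approach above applies. Here is a minimal sketch of that step combined with `zip`, using tuple unpacking and two-decimal formatting. The sample values in `regex_test` are assumptions standing in for the scraped data.

```python
# Hypothetical sample of scraped strings (what findall would return).
regex_test = ["20.00", "16.00", "23.50"]
usd_aud_ratio = 1.32

# Strings -> floats, then floats -> converted floats.
prices_list_usd = [float(p) for p in regex_test]
prices_list_aud = [p * usd_aud_ratio for p in prices_list_usd]

# Pair each USD price with its AUD equivalent and format to two decimals.
lines = [
    "${:.2f} => ${:.2f}".format(usd, aud)
    for usd, aud in zip(prices_list_usd, prices_list_aud)
]
for line in lines:
    print(line)
```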


I think you need to extract all the values, convert them to float, and then format them accordingly:

import re

# I don't know the rss file, so this is a dummy variable
rss = "$20.00 => $26.40  $20.00 => $26.40  $16.00 => $21.12  $189.00 => $249.48"

costs = re.findall(r'(?<=\$)\d+\.\d+', rss)

# cast to float and multiply with 1.32
costs = [float(cost) * 1.32 for cost in costs]

# now format them
for i in range(0, len(costs), 2):
    print("{:.2f} => {:.2f}".format(costs[i], costs[i + 1]))

# output

# 26.40 => 34.85
# 26.40 => 34.85
# 21.12 => 27.88
# 249.48 => 329.31
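The snippet above runs on a dummy string that already contains converted values. As a sketch of the same extract, convert, format idea on text shaped like the question's feed (an amount followed by ` USD`), something along these lines may be closer to what is needed. The sample string and the slightly stricter pattern are assumptions, since the real feed's markup is not shown.

```python
import re

# Hypothetical sample text; the real feed's markup is not shown in the question.
sample = "<p>20.00 USD</p><p>16 USD</p><p>189.00 USD</p>"

# Digits, optionally followed by a single dot and more digits, then " USD".
# Unlike [0-9]+[.]*[0-9]*, this cannot capture strings such as "20..5".
usd_strings = re.findall(r'(\d+(?:\.\d+)?) USD', sample)
usd_values = [float(s) for s in usd_strings]

for usd in usd_values:
    print("{:.2f} USD => {:.2f} AUD".format(usd, usd * 1.32))
```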

With a slight change to glycoaddict's solution, a list of updated prices (or a similar "variable") can be created, and each value can then be pulled out of that list individually:
# imports the necessary modules
from tkinter import *
from re import findall, MULTILINE
import urllib.request

# downloads an rss feed to use; the feed is downloaded,
# then saved under the given name and format (xhtml, html, etc.)
urllib.request.urlretrieve("https://www.etsy.com/au/shop/ElvenTechnology/rss", "rss.xhtml")
# opens the downloaded file to read from. 'U' mode has been used instead
# of 'encoding="utf8"', however this causes issues on some feeds (and 'U'
# mode was removed in Python 3.11); for example, this particular feed needs
# to be decoded as utf8, otherwise a decoding error occurs as shown below:

# return codecs.charmap_decode(input,self.errors,decoding_table)[0] UnicodeDecodeError:
# 'charmap' codec can't decode byte 0x9d in position 12605: character maps to <undefined>

rss = open('rss.xhtml', encoding="utf8").read()
# regex is used to find all price instances within the document that was
# opened and named rss
regex_test = findall(r'([0-9]+[.]*[0-9]*) USD', rss)
# formats each returned string into the desired value (glycoaddict);
# aud_usd_ratio = 1.32 is the same as simply using 1.32, it just creates
# a named variable to multiply by rather than the bare number
AUD_price = ["${:.2f}".format(float(USD)*1.32) for USD in regex_test]
# only the first 10 prices are printed, so feeds with thousands of entries
# do not list endlessly; slicing also avoids an IndexError when the feed
# has fewer than 10 prices. Each value is printed individually, which is
# useful for, say, a list of labels in tkinter to be looped over and placed
for individual_item_price in AUD_price[:10]:
    print(individual_item_price)


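Note that the RSS file is downloaded again and overwritten every time this program runs, so the prices can be treated as live: running it now and again an hour or a few hours later may return different results.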
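Because the feed changes between runs, it can help to keep the parsing and conversion in a function that takes the feed text as an argument, so it can be tried on a fixed string without downloading anything. This is only a sketch: the function name `first_aud_prices`, its parameters, and the sample text are assumptions for illustration. The regex is the one used above.

```python
from re import findall


def first_aud_prices(feed_text, limit=10, rate=1.32):
    """Return up to `limit` prices from feed_text, converted and formatted."""
    usd_strings = findall(r'([0-9]+[.]*[0-9]*) USD', feed_text)
    # Slicing never raises, even if fewer than `limit` prices were found.
    return ["${:.2f}".format(float(p) * rate) for p in usd_strings[:limit]]


# Hypothetical stand-in for the contents of rss.xhtml.
sample_feed = "item A 20.00 USD item B 16.00 USD item C 23.50 USD"
print(first_aud_prices(sample_feed, limit=2))
```

With the real feed, the text read from `rss.xhtml` would be passed in place of `sample_feed`.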