How do I merge these two DataFrames in Python?
I'm stuck here, having already built a list for each of the two DataFrames. I have two tables with two columns each. The first table has product_name and brand columns; the second has product_name and shipping columns. I'm trying to do a one-to-one join so that I end up with three columns in a single table. It gives me an error: KeyError: 'shipping'
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import pandas as pd
from collections import defaultdict
import re
url='https://www.newegg.com/PS4-Video-Games/SubCategory/ID-3141'
with uReq(url) as uClient:
    page = uClient.read()
# parsing
page_soup = soup(page, "html.parser")
# grabs products
containers= page_soup.findAll("div",{"class":"item-container"})
# file
filename = "products.csv"
d = defaultdict(list)
d1 = defaultdict(list)
# fill dict
for container in containers:
    brand = container.div.div.a.img["title"]
    title = container.findAll("a", {"class":"item-title"})
    product_name = title[0].text
    shipping_container = container.findAll("li", {"class":"price-ship"})
    shipping = shipping_container[0].text.strip()
    d['brand'].append(brand)
    d['product'].append(product_name)
    d1['product'].append(product_name)
    d1['shipping'].append(shipping)
# create dataframe
df = pd.DataFrame(d)
df1 =pd.DataFrame(d1)
# clean shipping column
df['shipping'] = df['shipping'].apply(lambda x: 0 if x == 'Free Shipping' else x)
df['shipping'] = df['shipping'].apply(lambda x: 0 if x == 'Special Shipping' else x) # probably should be handled in a special way
df['shipping'] = df['shipping'].apply(lambda x: x if x == 0 else re.sub("[^0-9]", "", x))
df['shipping'] = df['shipping'].astype(float)
# save dataframe to csv file
df.to_csv('dataframe.csv', index=False)
df1.to_csv('dataframe1.csv', index=False)
# choose rows where shipping is less than 5.99
#print(df[df['shipping'] > 200])
#merge two data sets
df3 = pd.merge(df,df1)
print(df3)
Use the following:
df3 = df.merge(df1, on="product", how="left")
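Depending on your preference, you can use how='inner' or how='outer' instead.
Not quite. When I run the code, three columns are displayed, but only product and brand, not shipping. Instead of the brand data, every row just shows dots: ... Brand_y 0 The Last of Us Part II - PlayStation 4 ... PlayStation 1 The Last of Us Part II - PlayStation 4 ... PlayStation. I get these, but not the shipping column.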
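For what it's worth, the KeyError in the question seems to come from the cleaning step rather than from the merge: it runs on df, which is built from d and only has brand and product columns, while shipping lives in df1. The dots in the printed result are most likely just pandas shortening a wide frame for display. Below is a minimal sketch under those assumptions. The sample rows and the helper name shipping_to_float are made up for illustration (they are not from the scraped page), and the regex keeps the decimal point, which the original pattern would strip.

```python
import re

import pandas as pd

# Hypothetical stand-ins for the scraped data (values invented for illustration).
df = pd.DataFrame({
    "product": ["Game A", "Game B", "Game C"],
    "brand": ["Sony", "EA", "Ubisoft"],
})
df1 = pd.DataFrame({
    "product": ["Game A", "Game B", "Game C"],
    "shipping": ["Free Shipping", "$4.99 Shipping", "Special Shipping"],
})


def shipping_to_float(text):
    """Hypothetical helper: turn a shipping label into a number."""
    # Free and special shipping are treated as 0, as in the question's code.
    if text in ("Free Shipping", "Special Shipping"):
        return 0.0
    # Keep digits and the decimal point so "$4.99 Shipping" becomes 4.99.
    return float(re.sub(r"[^0-9.]", "", text))


# The shipping column exists only in df1, so clean it there, not on df.
df1["shipping"] = df1["shipping"].apply(shipping_to_float)

# Join on the shared key column; how="left" keeps every row of df.
df3 = df.merge(df1, on="product", how="left")

# to_string() prints every column instead of eliding the middle ones with "...".
print(df3.to_string())
```

If the same product name can appear more than once in either frame, the merge will produce extra rows; passing validate="one_to_one" to merge makes pandas raise an error in that case instead of silently duplicating.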