Python - Convert bytes/unicode tab-delimited data to a csv file
I am pulling the following rows of data from an API. The data starts with a b prefix, which would indicate that we are dealing with a "bytes literal", and the escape sequences \t and \n represent the ASCII horizontal tab (TAB) and the ASCII linefeed (LF) respectively.
b'settlement-id\tsettlement-start-date\tsettlement-end-date\tdeposit-date\ttotal-amount\tcurrency\ttransaction-type\torder-id\tmerchant-order-id\tadjustment-id\tshipment-id\tmarketplace-name\tamount-type\tamount-description\tamount\tfulfillment-id\tposted-date\tposted-date-time\torder-item-code\tmerchant-order-item-id\tmerchant-adjustment-item-id\tsku\tquantity-purchased\n7293436482\t03.05.2018 09:10:07 UTC\t04.05.2018 20:30:23 UTC\t06.05.2018 20:30:23 UTC\t53,44\tEUR\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\n7293436482\t\t\t\t\t\tOrder\t303-3746292-6119509\t\t\tDRGC8lFbB\tAmazon.de\tItemPrice\tPrincipal\t179,99\tMFN\t03.05.2018\t03.05.2018 17:12:22 UTC\t30407746733299\t\t\t3700546702556-180412-chp-18c10347-1\t1\n7293436482\t\t\t\t\t\tOrder\t303-3746292-6119509\t\t\tDRGC8lFbB\tAmazon.de\tItemFees\tCommission\t-32,40\tMFN\t03.05.2018\t03.05.2018 17:12:22 UTC\t30407746733299\t\t\t3700546702556-180412-chp-18c10347-1\t1\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemPrice\tPrincipal\t-109,99\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemFees\tCommission\t19,80\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemFees\tRefundCommission\t-3,96\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n'
When I convert this data to a string using .decode("utf-8"), I get the corresponding tab-delimited data:
settlement-id settlement-start-date settlement-end-date deposit-date total-amount currency transaction-type order-id merchant-order-id adjustment-id shipment-id marketplace-name amount-type amount-description amount fulfillment-id posted-date posted-date-time order-item-code merchant-order-item-id merchant-adjustment-item-id sku quantity-purchased
7293436482 03.05.2018 09:10:07 UTC 04.05.2018 20:30:23 UTC 06.05.2018 20:30:23 UTC 53,44 EUR
7293436482 Order 303-3746292-6119509 DRGC8lFbB Amazon.de ItemPrice Principal 179,99 MFN 03.05.2018 03.05.2018 17:12:22 UTC 30407746733299 3700546702556-180412-chp-18c10347-1 1
7293436482 Order 303-3746292-6119509 DRGC8lFbB Amazon.de ItemFees Commission -32,40 MFN 03.05.2018 03.05.2018 17:12:22 UTC 30407746733299 3700546702556-180412-chp-18c10347-1 1
7293436482 Refund 305-1251749-5602732 305-1251749-5602732 amzn1:crow:YZkTuxs4RhO8FpZez3cGCg Amazon.de ItemPrice Principal -109,99 AFN 04.05.2018 04.05.2018 18:24:39 UTC 38048998219979 142721169810 3700546702082-180124-jpn-131N28-6
7293436482 Refund 305-1251749-5602732 305-1251749-5602732 amzn1:crow:YZkTuxs4RhO8FpZez3cGCg Amazon.de ItemFees Commission 19,80 AFN 04.05.2018 04.05.2018 18:24:39 UTC 38048998219979 142721169810 3700546702082-180124-jpn-131N28-6
7293436482 Refund 305-1251749-5602732 305-1251749-5602732 amzn1:crow:YZkTuxs4RhO8FpZez3cGCg Amazon.de ItemFees RefundCommission -3,96 AFN 04.05.2018 04.05.2018 18:24:39 UTC 38048998219979 142721169810 3700546702082-180124-jpn-131N28-6
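In the printed output above the tabs are rendered as plain whitespace, so empty fields are hard to see. As a rough sketch (using a shortened, made-up sample in place of the real API response), the decoded text could be parsed with csv.reader to confirm that every row still has the same number of fields:

```python
import csv
import io

# Shortened, hypothetical sample in the same shape as the API response.
raw = b"settlement-id\ttotal-amount\tcurrency\n7293436482\t53,44\tEUR\n7293436482\t\t\n"

# bytes -> str; the \t and \n escapes become real tab and newline characters
text = raw.decode("utf-8")

# Parse the tab-delimited text; empty fields are kept as empty strings
rows = list(csv.reader(io.StringIO(text), delimiter="\t"))

for row in rows:
    print(len(row), row)
```

Each row should report the same field count, including the last one, whose trailing fields are empty.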
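However, I cannot seem to save this data to a tab-delimited csv file. I have tried several ways of saving the data to a csv file, and all of them failed, including the following: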
with open("folder_GET_V2_SETTLEMENT_REPORT_DATA_FLAT_FILE_V2_/" + grl_id + ".csv", "w") as csv_file:
    writer = csv.writer(csv_file)
    for row in csv_file:
        print(row)
This gives me the following error:
for row in csv_file:
io.UnsupportedOperation: not readable
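Update:
So the problem was elsewhere. In fact, during my various tests I had already managed to generate the same file as yours, but because the output looked wrong I assumed it had not worked. When the file is opened in Excel, the data is split into two columns.
I have now worked out the cause: some of the numbers use the European decimal notation, which is a comma, e.g. 179,99. Excel therefore interprets the comma as a delimiter, whereas the file reads correctly if I open it in Notepad.

You get the error because you want to write data to the csv file, but in the for loop you are trying to read from that file. If I understand correctly, you want to take the bytes object and write it out cleanly to a tab-delimited csv file. The following code will do that: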
import csv, re
orig = b'settlement-id\tsettlement-start-date\tsettlement-end-date\tdeposit-date\ttotal-amount\tcurrency\ttransaction-type\torder-id\tmerchant-order-id\tadjustment-id\tshipment-id\tmarketplace-name\tamount-type\tamount-description\tamount\tfulfillment-id\tposted-date\tposted-date-time\torder-item-code\tmerchant-order-item-id\tmerchant-adjustment-item-id\tsku\tquantity-purchased\n7293436482\t03.05.2018 09:10:07 UTC\t04.05.2018 20:30:23 UTC\t06.05.2018 20:30:23 UTC\t53,44\tEUR\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\n7293436482\t\t\t\t\t\tOrder\t303-3746292-6119509\t\t\tDRGC8lFbB\tAmazon.de\tItemPrice\tPrincipal\t179,99\tMFN\t03.05.2018\t03.05.2018 17:12:22 UTC\t30407746733299\t\t\t3700546702556-180412-chp-18c10347-1\t1\n7293436482\t\t\t\t\t\tOrder\t303-3746292-6119509\t\t\tDRGC8lFbB\tAmazon.de\tItemFees\tCommission\t-32,40\tMFN\t03.05.2018\t03.05.2018 17:12:22 UTC\t30407746733299\t\t\t3700546702556-180412-chp-18c10347-1\t1\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemPrice\tPrincipal\t-109,99\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemFees\tCommission\t19,80\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemFees\tRefundCommission\t-3,96\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n'
# Split the decoded string into a list of lines
data = orig.decode('utf-8').splitlines()
# Open the file for writing; newline='' lets the csv module control line endings
with open("tmp.csv", "w", newline="") as csv_file:
    # Create the writer object with a tab delimiter
    writer = csv.writer(csv_file, delimiter='\t')
    for line in data:
        # writerow() needs a list of fields, so split on tabs only;
        # this keeps empty fields and values that contain spaces (e.g. timestamps)
        writer.writerow(line.split('\t'))
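To illustrate why the loop in the question fails, here is a minimal sketch that reproduces the error in isolation. The scratch file name is hypothetical; the point is only that a handle opened with "w" can be written to but not iterated over:

```python
import io
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

error_message = None
with open(path, "w", newline="") as csv_file:
    try:
        # Iterating means reading, which a write-only handle does not support
        for row in csv_file:
            print(row)
    except io.UnsupportedOperation as exc:
        error_message = str(exc)

print(error_message)
```

The loop therefore has to run over the data you want to write (for example the decoded lines), not over the file object you are writing to.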
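You are opening the file in "w" (write) mode; you should open it in "r" (read) mode, or you have mixed up the csv_file names there...
Thanks for the clean solution, but please check my update about the delimiter problem when opening the file in Excel.
When importing the data into Excel, try specifying the delimiter yourself. Then it works.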
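If specifying the delimiter by hand on every import is inconvenient, one possible workaround is to remove the commas from the numeric fields before writing the file. The sketch below is only an illustration under assumptions: the pattern that decides what counts as a decimal-comma number and the helper name normalise_row are made up here, and the sample data is shortened.

```python
import csv
import io
import re

# Assumption: amounts look like 179,99 or -32,40 (digits, one comma, digits)
DECIMAL_COMMA = re.compile(r"-?\d+,\d+")

def normalise_row(row):
    """Replace the decimal comma with a point in fields that look numeric."""
    return [
        field.replace(",", ".") if DECIMAL_COMMA.fullmatch(field) else field
        for field in row
    ]

# Shortened, hypothetical sample of already-decoded tab-delimited text
sample = "amount\tcurrency\n179,99\tEUR\n-32,40\tEUR\n"

rows = [normalise_row(r) for r in csv.reader(io.StringIO(sample), delimiter="\t")]

# Write to an in-memory buffer here; a real file opened with newline="" works the same way
out = io.StringIO()
csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
print(out.getvalue())
```

Whether this is appropriate depends on how the file is consumed afterwards; with a European Excel locale, keeping the comma and importing with an explicit tab delimiter may be the better choice.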