Separating columns in a Python DataFrame when rows use different numbers of tab separators
Tags: python, python-2.7, python-3.x, pandas
I have a test file like this:
2464 2480 2481
Test results for policy NSS-Tuned Test results for policy NSS-Tuned Test results for policy NSS-Tuned
BPS Profile Throughput BPS Profile Throughput BPS Profile Throughput
SigTestHTTP21kBin 216.966666667 SigTestHTTP21kBin 219.1 BPSHTTP21KBINARY 219.16
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml 355.6 BPS-HTTP21K-HTML 364.0
SigTestHTTP21kText 379.95 SigTestHTTP21kText 377.9 BPS-HTTP21K-TEXT 376.25
NSS-HTTP21Kdelay 378.15 NSS-HTTP21Kdelay 381.15 BPS-HTTP21K-DELAY 380.2
NSS-HTTPCPS 18920 NSS-HTTPCPS 6599 BPS-HTTPCPS 74.6522222222
SIggTestPerimeter 270.233333333 SIggTestPerimeter 243.433333333 BPS-PERIMETER 222.8
SIgTestDatacenter 370.825 SIgTestDatacenter 380.24 BPS-DATACENTER 373.275
NSS-Financial 5 NSS-Financial BPS-FINANCIAL 56.345
NSS-Education 971.125 NSS-Education 950.4 BPS-EDUCATION 1010.2
NSS-EuroMobile 920.68 NSS-EuroMobile 1001.075 BPS-EUROMOBILE 932.525
NSS-USMobile 528.2 NSS-USMobile 570.6 BPS-USMOBILE 541.9
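As you can see, the fields of the first header line are separated by four tabs (\t\t\t\t).
The fields of the second header line are separated by two tabs (\t\t).
The result rows that follow are also separated by two tabs (\t\t).
Now I need to manipulate the Throughput columns and generate new columns so that I can compute percentages and so on.
The code I wrote is as follows: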
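To make the layout described above concrete, here is a minimal sketch of how a regular expression that matches runs of tabs treats lines with different separator widths. The sample strings are assumptions reconstructed from the description (in particular, a single tab between a profile name and its throughput is inferred from values such as `BPS-FINANCIAL\t56.345` in the output), and `split_fields` is a hypothetical helper, not part of pandas.

```python
import re

def split_fields(line):
    """Split a line on runs of one or more tabs (hypothetical helper)."""
    return re.split(r"\t+", line.rstrip("\n"))

# Assumed layout: four tabs between the fields of the first header line.
header = "Test results for policy NSS-Tuned\t\t\t\tTest results for policy NSS-Tuned"
# Assumed layout: one tab inside a (profile, throughput) pair, two tabs between pairs.
row = "SigTestHTTP21kBin\t216.966666667\t\tSigTestHTTP21kBin\t219.1"

print(split_fields(header))
print(split_fields(row))
```

Because `\t+` collapses consecutive tabs, a row with an empty cell (such as the second NSS-Financial entry) would produce fewer fields than the other rows, so such rows may need separate handling.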
#!/usr/bin/python
import time
import os,sys
from os import path
import re
import sys, ast
import subprocess
import numpy as np
#from StringIO import StringIO
import pandas as pd
location = "/root/madhu_test/bpstest/results/finalnss.txt"
#print location
f = pd.read_csv(location,delimiter='\t\t',header=True)
print f
cols = f.columns.tolist()
print cols
f = f.drop('BPS Profile.2', 1)
f = f.drop('BPS Profile.1', 1)
np.radians(f['Throughput'])
np.radians(f['Throughput.1'])
f['percentage'] = ((f['Throughput.1']-f['Throughput'])/f['Throughput.1'])*100.0
f['percentage.1'] = ((f['Throughput.2']-f['Throughput'])/f['Throughput.2'])*100.0
cols = f.columns.tolist()
#print cols
cols = ['BPS Profile', 'Throughput', 'Throughput.1', 'percentage', 'Throughput.2','percentage.1']
f = f[cols]
f.to_html('/root/madhu_test/bpstest/results/outnss.html')
When I run the code, I get the following output:
Test results for policy NSS-Tuned \
BPS Profile Throughput BPS Profile Throughput
SigTestHTTP21kBin 216.966666667 SigTestHTTP21kBin 219.1
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml 355.6
SigTestHTTP21kText 379.95 SigTestHTTP21kText 377.9
NSS-HTTP21Kdelay 378.15 NSS-HTTP21Kdelay 381.15
NSS-HTTPCPS 18920 NSS-HTTPCPS 6599
SIggTestPerimeter 270.233333333 SIggTestPerimeter 243.433333333
SIgTestDatacenter 370.825 SIgTestDatacenter 380.24
NSS-Financial 5 NSS-Financial BPS-FINANCIAL\t56.345
NSS-Education 971.125 NSS-Education 950.4
NSS-EuroMobile 920.68 NSS-EuroMobile 1001.075
NSS-USMobile 528.2 NSS-USMobile 570.6
Test results for policy NSS-Tuned.1 \
BPS Profile Throughput BPS Profile BPS Profile
SigTestHTTP21kBin 216.966666667 SigTestHTTP21kBin BPSHTTP21KBINARY\t219.16
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml BPS-HTTP21K-HTML\t364.0
SigTestHTTP21kText 379.95 SigTestHTTP21kText BPS-HTTP21K-TEXT\t376.25
NSS-HTTP21Kdelay 378.15 NSS-HTTP21Kdelay BPS-HTTP21K-DELAY\t380.2
NSS-HTTPCPS 18920 NSS-HTTPCPS BPS-HTTPCPS\t74.6522222222
SIggTestPerimeter 270.233333333 SIggTestPerimeter BPS-PERIMETER\t222.8
SIgTestDatacenter 370.825 SIgTestDatacenter BPS-DATACENTER\t373.275
NSS-Financial 5 NSS-Financial None
NSS-Education 971.125 NSS-Education BPS-EDUCATION\t1010.2
NSS-EuroMobile 920.68 NSS-EuroMobile BPS-EUROMOBILE\t932.525
NSS-USMobile 528.2 NSS-USMobile BPS-USMOBILE\t541.9
Test results for policy NSS-Tuned.2
BPS Profile Throughput BPS Profile Throughput
SigTestHTTP21kBin 216.966666667 SigTestHTTP21kBin None
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml None
SigTestHTTP21kText 379.95 SigTestHTTP21kText None
NSS-HTTP21Kdelay 378.15 NSS-HTTP21Kdelay None
NSS-HTTPCPS 18920 NSS-HTTPCPS None
SIggTestPerimeter 270.233333333 SIggTestPerimeter None
SIgTestDatacenter 370.825 SIgTestDatacenter None
NSS-Financial 5 NSS-Financial None
NSS-Education 971.125 NSS-Education None
NSS-EuroMobile 920.68 NSS-EuroMobile None
NSS-USMobile 528.2 NSS-USMobile None
['Test results for policy NSS-Tuned', 'Test results for policy NSS-Tuned.1', 'Test results for policy NSS-Tuned.2']
How can I split this into 6 columns, such as ['BPS Profile', 'Throughput', 'Throughput.1', 'percentage', 'Throughput.2', 'percentage.1']?
If I delete the following from the text file:
2464 2480 2481
Test results for policy NSS-Tuned Test results for policy NSS-Tuned Test results for policy NSS-Tuned
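then the DataFrame is split correctly into 6 columns.
I know that skiprows would ignore these lines, but I also need this data in the final generated HTML file: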
2464 2480 2481
Test results for policy NSS-Tuned Test results for policy NSS-Tuned Test results for policy NSS-Tuned
If I understand your question correctly, I would skip the header lines, read in the data, and then set the column headers manually.
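import pandas as pd
# skiprows must match the number of header lines in your file; a regex separator needs engine='python'
df = pd.read_csv('data.csv', skiprows=7, header=None, delimiter=r'\t+', engine='python')
df.columns = ['BPS Profile', 'Throughput', 'BPS Profile.1', 'Throughput.1',
              'BPS Profile.2', 'Throughput.2']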
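With the columns named this way, the percentage columns from the question could be computed roughly as follows. This is only a sketch: the two sample rows are copied from the question's data and built in memory instead of being read from the file, and the `pd.to_numeric` coercion is an added assumption to cope with cells that did not parse as numbers.

```python
import pandas as pd

# Two sample rows copied from the question's data; column names as set in the answer.
df = pd.DataFrame(
    [
        ["SigTestHTTP21kBin", 216.966666667, "SigTestHTTP21kBin", 219.1, "BPSHTTP21KBINARY", 219.16],
        ["NSS-Education", 971.125, "NSS-Education", 950.4, "BPS-EDUCATION", 1010.2],
    ],
    columns=["BPS Profile", "Throughput", "BPS Profile.1", "Throughput.1",
             "BPS Profile.2", "Throughput.2"],
)

# Drop the repeated profile-name columns, as the question's code does.
df = df.drop(columns=["BPS Profile.1", "BPS Profile.2"])

# Coerce throughput values to numbers; anything unparseable becomes NaN.
for col in ["Throughput", "Throughput.1", "Throughput.2"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# Same formulas as in the question.
df["percentage"] = (df["Throughput.1"] - df["Throughput"]) / df["Throughput.1"] * 100.0
df["percentage.1"] = (df["Throughput.2"] - df["Throughput"]) / df["Throughput.2"] * 100.0

# Reorder into the six columns the question asks for.
df = df[["BPS Profile", "Throughput", "Throughput.1", "percentage",
         "Throughput.2", "percentage.1"]]
print(df)
```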
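From here the table is easy to manipulate.
...but the final generated HTML file will not contain the following data: the "2464 2480 2481" line and the "Test results for policy NSS-Tuned" line.
Can you read the file into a string, take those numbers from the first line, and then use skiprows to parse the rest of the data into a DataFrame?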
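The suggestion in the last comment could be sketched like this: keep the leading header lines as plain text, parse only the data rows with `skiprows`, and put the saved text in front of the table when building the HTML. The in-memory `sample`, the `N_HEADER` count of three header lines, and the tab layout are assumptions made for illustration; with a real file you would read its contents instead and adjust the count.

```python
import html
import io
import re

import pandas as pd

# Hypothetical stand-in for the results file (layout assumed from the question).
sample = (
    "2464\t\t\t\t2480\t\t\t\t2481\n"
    "Test results for policy NSS-Tuned\t\t\t\tTest results for policy NSS-Tuned"
    "\t\t\t\tTest results for policy NSS-Tuned\n"
    "BPS Profile\tThroughput\t\tBPS Profile\tThroughput\t\tBPS Profile\tThroughput\n"
    "SigTestHTTP21kBin\t216.966666667\t\tSigTestHTTP21kBin\t219.1\t\tBPSHTTP21KBINARY\t219.16\n"
    "NSS-Education\t971.125\t\tNSS-Education\t950.4\t\tBPS-EDUCATION\t1010.2\n"
)
N_HEADER = 3  # assumed: ids line, policy line, column-name line

# Keep the first two header lines as text before handing the rest to pandas.
lines = sample.splitlines()
run_ids = lines[0].split()
policies = [p for p in re.split(r"\t+", lines[1]) if p]

# Parse only the data rows; a regex separator requires the python engine.
df = pd.read_csv(io.StringIO(sample), sep=r"\t+", engine="python",
                 skiprows=N_HEADER, header=None)
df.columns = ["BPS Profile", "Throughput", "BPS Profile.1", "Throughput.1",
              "BPS Profile.2", "Throughput.2"]

# Prepend the saved header text to the generated table.
header_html = "<p>{}</p>\n<p>{}</p>\n".format(
    html.escape(" ".join(run_ids)), html.escape(" | ".join(policies)))
page = header_html + df.to_html(index=False)
print(page)
```

Writing `page` to the output path with a normal file write would then replace the direct `to_html(path)` call used in the question.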