Python: separating columns with pandas.read_csv

python pandas

I am trying to read one table from a larger .txt file into Python.

An excerpt of the data:

2 Network magnitudes:
    MLv       2.05 +/- 1.34   7            
    M         2.05            7 preferred  

7 Phase arrivals:
    sta  net   dist azi  phase   time         res     wt  sta
    BMOR  EC    0.0 226  P       00:22:31.385  -0.6 M  1.0  BMOR 
    BREF  EC    0.0 347  P       00:22:31.543  -0.5 M  1.0  BREF 
    BTAM  EC    0.0  58  P       00:22:31.796  -0.3 M  1.0  BTAM 
    BVC2  EC    0.0  26  P       00:22:33.061   0.8 M  1.0  BVC2 
    BNAS  EC    0.1 294  P       00:22:32.871  -0.1 M  1.0  BNAS 
    SUCR  EC    0.1 314  P       00:22:34.610   0.6 M  1.0  SUCR 
    BRRN  EC    0.1 207  P       00:22:34.768   0.4 M  1.0  BRRN 

7 Station magnitudes:
    sta  net   dist azi  type   value   res        amp per
    BMOR  EC    0.0 226  MLv     1.48 -0.57    1.20076

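I only need the phase arrivals table, so np.loadtxt and np.genfromtxt do not meet my needs for various reasons: they cannot handle a mix of numbers and strings unless you specify a single-space (" ") delimiter, which I cannot do here.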

I have been trying to use the pandas.read_csv function, but it does not recognize the delimiter:

a = pd.read_csv(datafileloc, sep='\+s', skiprows=5, skipfooter=3)

which produces:

a
Out[90]: 
  sta  net   dist azi  phase   time         res     wt  sta
0  BMOR  EC    0.0 226  P       00:22:31.385  -0....       
1  BREF  EC    0.0 347  P       00:22:31.543  -0....       
2  BTAM  EC    0.0  58  P       00:22:31.796  -0....       
3  BVC2  EC    0.0  26  P       00:22:33.061   0....       
4  BNAS  EC    0.1 294  P       00:22:32.871  -0....       
5  SUCR  EC    0.1 314  P       00:22:34.610   0....       
6  BRRN  EC    0.1 207  P       00:22:34.768   0....

This looks fine, except that each row is a single string and the whitespace delimiter was not picked up:

a.values
Out[89]: 
array([['BMOR  EC    0.0 226  P       00:22:31.385  -0.6 M  1.0  BMOR'],
       ['BREF  EC    0.0 347  P       00:22:31.543  -0.5 M  1.0  BREF'],
       ['BTAM  EC    0.0  58  P       00:22:31.796  -0.3 M  1.0  BTAM'],
       ['BVC2  EC    0.0  26  P       00:22:33.061   0.8 M  1.0  BVC2'],
       ['BNAS  EC    0.1 294  P       00:22:32.871  -0.1 M  1.0  BNAS'],
       ['SUCR  EC    0.1 314  P       00:22:34.610   0.6 M  1.0  SUCR'],
       ['BRRN  EC    0.1 207  P       00:22:34.768   0.4 M  1.0  BRRN']], dtype=object)

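A row can be split with

list(a.values[0])[0].split()

but the result would then need reorganizing to get individual columns. I was hoping pandas.read_csv would simply recognize that the fields are separated, so that I can extract individual columns (reasonable efficiency will matter a lot once I scale up).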

Where am I going wrong?

As pointed out, this is a typo in the separator: it should be \s+, not \+s.

The typo originated in the documentation, under a parameter heading.
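To see why the two patterns behave so differently, here is a small sketch using Python's re module on one row from the question. It assumes the separator is treated as a regular expression, so \+s means a literal "+" followed by "s", while \s+ means one or more whitespace characters.

```python
import re

# One data row from the phase arrivals table in the question.
line = "BMOR  EC    0.0 226  P       00:22:31.385  -0.6 M  1.0  BMOR"

# r"\+s" looks for the literal text "+s", which never occurs in the row,
# so nothing is split and the whole row comes back as a single string.
wrong = re.split(r"\+s", line)

# r"\s+" matches any run of whitespace, so the row splits into its fields.
right = re.split(r"\s+", line)

print(len(wrong), len(right))
```

This matches the symptom in the question: with the mistyped pattern, every row ends up as one string in a single column.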

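I'm not sure whether to close this as a typo (you want \s+, not \+s) or as a duplicate.

Hey @DSM - thanks! I just tested it and you are right, it is a typo. However, I took it straight from the documentation, and the typo is there (under the delim_whitespace parameter).

You're right! I'll make sure it gets fixed. :-)

For the record, this has been fixed in trunk.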
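For illustration, here is a minimal sketch of reading the phase arrivals table with the corrected \s+ separator. It is not the asker's exact call: instead of skiprows and skipfooter it picks out the table lines by hand, and it passes explicit column names, because the data rows in the excerpt have ten whitespace-separated fields (the "M" after the residual) while the header has only nine names. The names "flag" and "sta_repeat" and the variable names are assumptions made up for this example.

```python
import io

import pandas as pd

# A shortened copy of the excerpt from the question.
raw = """2 Network magnitudes:
    MLv       2.05 +/- 1.34   7
    M         2.05            7 preferred

7 Phase arrivals:
    sta  net   dist azi  phase   time         res     wt  sta
    BMOR  EC    0.0 226  P       00:22:31.385  -0.6 M  1.0  BMOR
    BREF  EC    0.0 347  P       00:22:31.543  -0.5 M  1.0  BREF
    BTAM  EC    0.0  58  P       00:22:31.796  -0.3 M  1.0  BTAM

7 Station magnitudes:
    sta  net   dist azi  type   value   res        amp per
    BMOR  EC    0.0 226  MLv     1.48 -0.57    1.20076
"""

lines = raw.splitlines()

# Locate the section title, then skip the title line and the header line.
start = next(i for i, ln in enumerate(lines) if "Phase arrivals" in ln)

rows = []
for ln in lines[start + 2:]:
    if not ln.strip():
        break  # a blank line ends the table
    rows.append(ln.strip())

# Hypothetical column names: "flag" and "sta_repeat" are not in the source.
names = ["sta", "net", "dist", "azi", "phase", "time",
         "res", "flag", "wt", "sta_repeat"]

arrivals = pd.read_csv(
    io.StringIO("\n".join(rows)),
    sep=r"\s+",      # one or more whitespace characters
    header=None,     # the header line was skipped above
    names=names,
)

print(arrivals[["sta", "dist", "azi", "res"]])
```

Individual columns are then available directly, for example arrivals["res"] for the residuals.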