Python: remove rows that match the header

I'm fairly new to pandas and I've run into a problem.

I read a table from an HTML page and set the header based on the table on the site:

 df = pd.read_html('http://www.espn.com/nhl/statistics/player/_/stat/points/sort/points/year/2015/seasontype/2', header = 1)
Now I have my DataFrame with the matching header, but some rows are identical to the header, as in the example below:

RK                  PLAYER  TEAM  GP   G   A  PTS  +/-  PIM  PTS/G  SOG 
1          Jamie Benn, LW   DAL  82  35  52   87    1   64   1.06  253   
2         John Tavares, C   NYI  82  38  48   86    5   46   1.05  278   
...      
10  Vladimir Tarasenko, RW   STL  77  37  36   73   27   31   0.95  264   
RK                  PLAYER  TEAM  GP   G   A  PTS  +/-  PIM  PTS/G  SOG 
14       Steven Stamkos, C    TB  82  43  29   72    2   49   0.88  268   
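I know pandas can drop duplicate rows, but is it possible to drop rows that duplicate the header, or that duplicate one specific row?

I hope you can help.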

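You can filter the rows with a boolean mask on the PLAYER column, keeping only the rows whose value is not the header text.

If you also need to remove the rows that have PP in the PLAYER column, use isin, as in the mask further down.

Note: I added [0] to the end of read_html because it returns a list of DataFrames, and you need to select the first item of that list: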

import pandas as pd

# read_html returns a list of DataFrames; [0] selects the first table on the page
df = pd.read_html('http://www.espn.com/nhl/statistics/player/_/stat/points/sort/points/year/2015/seasontype/2', header=1)[0]
print(df)
     RK                  PLAYER  TEAM   GP    G    A  PTS  +/-  PIM  PTS/G  \
0     1          Jamie Benn, LW   DAL   82   35   52   87    1   64   1.06   
1     2         John Tavares, C   NYI   82   38   48   86    5   46   1.05   
2     3        Sidney Crosby, C   PIT   77   28   56   84    5   47   1.09   
3     4       Alex Ovechkin, LW   WSH   81   53   28   81   10   58   1.00   
4   NaN       Jakub Voracek, RW   PHI   82   22   59   81    1   78   0.99   
5     6    Nicklas Backstrom, C   WSH   82   18   60   78    5   40   0.95   
6     7         Tyler Seguin, C   DAL   71   37   40   77   -1   20   1.08   
7     8         Jiri Hudler, LW   CGY   78   31   45   76   17   14   0.97   
8   NaN        Daniel Sedin, LW   VAN   82   20   56   76    5   18   0.93   
9    10  Vladimir Tarasenko, RW   STL   77   37   36   73   27   31   0.95   
10  NaN                      PP    SH  NaN  NaN  NaN  NaN  NaN  NaN    NaN   
11   RK                  PLAYER  TEAM   GP    G    A  PTS  +/-  PIM  PTS/G   
12  NaN        Nick Foligno, LW   CBJ   79   31   42   73   16   50   0.92   
13  NaN        Claude Giroux, C   PHI   81   25   48   73   -3   36   0.90   
14  NaN         Henrik Sedin, C   VAN   82   18   55   73   11   22   0.89   
15   14       Steven Stamkos, C    TB   82   43   29   72    2   49   0.88   
...
...

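# True for the repeated header rows ('PLAYER') and the 'PP' sub-header rows
mask = df['PLAYER'].isin(['PLAYER', 'PP'])
# ~ inverts the mask, so only the real data rows are kept
print(df[~mask])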
     RK                  PLAYER TEAM  GP   G   A PTS  +/- PIM PTS/G  SOG  \
0     1          Jamie Benn, LW  DAL  82  35  52  87    1  64  1.06  253   
1     2         John Tavares, C  NYI  82  38  48  86    5  46  1.05  278   
2     3        Sidney Crosby, C  PIT  77  28  56  84    5  47  1.09  237   
3     4       Alex Ovechkin, LW  WSH  81  53  28  81   10  58  1.00  395   
4   NaN       Jakub Voracek, RW  PHI  82  22  59  81    1  78  0.99  221   
5     6    Nicklas Backstrom, C  WSH  82  18  60  78    5  40  0.95  153   
6     7         Tyler Seguin, C  DAL  71  37  40  77   -1  20  1.08  280   
7     8         Jiri Hudler, LW  CGY  78  31  45  76   17  14  0.97  158   
8   NaN        Daniel Sedin, LW  VAN  82  20  56  76    5  18  0.93  226   
9    10  Vladimir Tarasenko, RW  STL  77  37  36  73   27  31  0.95  264   
12  NaN        Nick Foligno, LW  CBJ  79  31  42  73   16  50  0.92  182   
13  NaN        Claude Giroux, C  PHI  81  25  48  73   -3  36  0.90  279   
14  NaN         Henrik Sedin, C  VAN  82  18  55  73   11  22  0.89  101   
15   14       Steven Stamkos, C   TB  82  43  29  72    2  49  0.88  268   
16  NaN        Tyler Johnson, C   TB  77  29  43  72   33  24  0.94  203   
17   16        Ryan Johansen, C  CBJ  82  26  45  71   -6  40  0.87  202   
18   17         Joe Pavelski, C   SJ  82  37  33  70   12  29  0.85  261   
19  NaN        Evgeni Malkin, C  PIT  69  28  42  70   -2  60  1.01  212   
20  NaN         Ryan Getzlaf, C  ANA  77  25  45  70   15  62  0.91  191   
21   20           Rick Nash, LW  NYR  79  42  27  69   29  36  0.87  304   
...
...
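
The filtering can also be tried offline. The following is a minimal sketch, assuming a small hand-built frame that stands in for the scraped table. The sample rows are a cut-down, hypothetical miniature of the output above, and the names no_header and clean are illustrative only. It shows both variants: a plain comparison that drops only the repeated header row, and isin, which drops several marker values at once.

```python
import pandas as pd

# Hypothetical miniature of the scraped table: every value is a string,
# and the header row plus a 'PP' sub-header row repeat inside the data.
df = pd.DataFrame(
    [
        ['1', 'Jamie Benn, LW', 'DAL', '87'],
        ['2', 'John Tavares, C', 'NYI', '86'],
        [None, 'PP', 'SH', None],
        ['RK', 'PLAYER', 'TEAM', 'PTS'],
        ['14', 'Steven Stamkos, C', 'TB', '72'],
    ],
    columns=['RK', 'PLAYER', 'TEAM', 'PTS'],
)

# Boolean indexing: keep rows whose PLAYER value is not the header text.
no_header = df[df['PLAYER'] != 'PLAYER']

# isin: drop several marker values at once, then renumber the index.
clean = df[~df['PLAYER'].isin(['PLAYER', 'PP'])].reset_index(drop=True)

print(no_header)
print(clean)
```

Filtering on a single column is enough here because the header text 'PLAYER' never occurs as a real player name. For a table where that is not guaranteed, the comparison would need to cover more columns.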