
Using sed to remove tags from HTML input
Tags: sed, xmlstarlet

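I have an HTML table from which I want to remove the rows that have a specific class. However, when I try sed's /*//g, it does nothing (that is, the tags are not removed). For example, the input could be: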


<tr><td>Some col</td></tr>
<tr class="expandable">
    <td colspan="6">
        <div class="expandable-content">
<p>Holds ACCA Practising Certificate: This indicates a member holding a practising certificate issued by ACCA. This means that the member is authorised to provide a range of general accountancy services to individuals and businesses, including business and tax advice and planning, preparation of personal and business tax returns, set up of book-keeping and business systems, providing book-keeping services, payroll work, assistance with management accounting help with raising finance, budgeting and cash-flow advice, business start-up advice and expert witness.</p>




Assuming your HTML is valid XML, you can use xmlstarlet to delete the matching rows:

xmlstarlet ed -d '//tr[@class="expandable"]'

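Obligatory link.

"Have you tried using an XML parser?" -> Neither xmllint nor xidel can delete rows of a specific "type", at least not in any way I know of.

I think the sample input shown contains a typo in its last line.

perl -0777 -pe 's|.*||gs' file

... this might work: perl -0777 -pe 's|.||gs' file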
xmlstarlet ed -d '//tr[@class="expandable"]' <<ENDHTML
  <tr><td>Some col</td></tr>
  <tr class="expandable">
      <td colspan="6">
          <div class="expandable-content">
  <p>Holds ACCA Practising Certificate: This indicates a member holding a practising certificate issued by ACCA. This means that the member is authorised to provide a range of general accountancy services to individuals and businesses, including business and tax advice and planning, preparation of personal and business tax returns, set up of book-keeping and business systems, providing book-keeping services, payroll work, assistance with management accounting help with raising finance, budgeting and cash-flow advice, business start-up advice and expert witness.</p>

Output:
<?xml version="1.0"?>
        <td>Some col</td>
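
Since the example above is cut off, here is a small self-contained variant of the same xmlstarlet call that may be easier to try. The `<table>` wrapper, the shortened paragraph text and the `html` and `cleaned` variable names are assumptions added so that the input is well-formed XML with a single root element. The guard around the call is only there so the snippet does not fail on machines without xmlstarlet.

```shell
# Assumed input: a <table> root is added so the document is well-formed XML.
html='<table>
<tr><td>Some col</td></tr>
<tr class="expandable"><td colspan="6"><div class="expandable-content"><p>Holds ACCA Practising Certificate</p></div></td></tr>
</table>'

if command -v xmlstarlet >/dev/null 2>&1; then
    # ed = edit mode, -d = delete every node matched by the XPath expression
    cleaned=$(printf '%s\n' "$html" | xmlstarlet ed -d '//tr[@class="expandable"]')
    printf '%s\n' "$cleaned"
else
    echo "xmlstarlet is not installed" >&2
fi
```

Note that `@class="expandable"` only matches rows whose class attribute is exactly `expandable`. For rows carrying several classes, a predicate such as `contains(concat(" ", normalize-space(@class), " "), " expandable ")` is a common XPath 1.0 workaround.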