Regexp for filtering a table (PHP, Ruby, Regex, Ruby on Rails 3, CodeIgniter) - Fatal编程技术网


php ruby regex ruby-on-rails-3 codeigniter


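OK, I have a table that is output by some open-source software, but it is not output in actual table markup, e.g.

<table>
  <thead>
    <tr>
      <td>Heading</td>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Content</td>
    </tr>
  </tbody>
</table>

So I can't build a web scraper to get the data, or at least I'm not sure whether I can build one, because the whole table is wrapped in a <pre> tag.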


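text.each_line do |line|
  # Report what the pattern matches at the start of the line:
  # either a leading "| " or a run of "+" and "-" border characters.
  line.sub(/^\| |^\+*-*\+*\-*/) do |match|
    puts "Regexp Match: #{match}"
  end
  STDIN.getc # pause until a key is pressed
  puts "New Line #{line}"
end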

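For example, the output for the first line would be just +--------------+------------
The result needs to be in CSV format, so I will use gsub to replace the remaining | with ,

I can use either PHP or Ruby, so any answers are very welcome.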
For the main job of getting the fields out of the table, use preg_split() with the pattern /\s*\|\s*/ on each row:
$table = '+------------+-------------+-------+-------------+------------+---------------+----------+
| HEADING 1  | HEADING 2   | ETC   | ANOTHER     | HEADING3   | HEADING4     | SML |
+------------+-------------+-------+-------------+------------+---------------+----------+
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
+------------+-------------+-------+-------------+------------+--------------+----------+
| TOTALS        AGENTS:21  |  total|        total|       total|         total| total|
+------------+-------------+-------+-------------+------------+--------------+----------+';

$lines = preg_split('/\r\n|\r|\n/', $table);
$array = array();
foreach($lines as $line){
  if(!preg_match('/\+-+\+/', $line)){
    $array[] = preg_split('/\s*\|\s*/', trim($line, '| '));
  }
}

print_r($array);

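This splits each row into an array on every | together with any surrounding whitespace. The leading and trailing | are removed with trim() first; otherwise the pattern would match them too and you would have to discard the empty first and last elements of the array.
Output: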
Array
(
    [0] => Array
        (
            [0] => HEADING 1
            [1] => HEADING 2
            [2] => ETC
            [3] => ANOTHER
            [4] => HEADING3
            [5] => HEADING4
            [6] => SML
        )

    [1] => Array
        (
            [0] => content
            [1] => more content
            [2] => cont
            [3] => More more
            [4] => content
            [5] => content 2.0
            [6] => litl
        )

    [2] => Array
        (
            [0] => content
            [1] => more content
            [2] => cont
            [3] => More more
            [4] => content
            [5] => content 2.0
            [6] => litl
        )

    [3] => Array
        (
            [0] => content
            [1] => more content
            [2] => cont
            [3] => More more
            [4] => content
            [5] => content 2.0
            [6] => litl
        )

    [4] => Array
        (
            [0] => content
            [1] => more content
            [2] => cont
            [3] => More more
            [4] => content
            [5] => content 2.0
            [6] => litl
        )

    [5] => Array
        (
            [0] => content
            [1] => more content
            [2] => cont
            [3] => More more
            [4] => content
            [5] => content 2.0
            [6] => litl
        )

    [6] => Array
        (
            [0] => content
            [1] => more content
            [2] => cont
            [3] => More more
            [4] => content
            [5] => content 2.0
            [6] => litl
        )

    [7] => Array
        (
            [0] => content
            [1] => more content
            [2] => cont
            [3] => More more
            [4] => content
            [5] => content 2.0
            [6] => litl
        )

    [8] => Array
        (
            [0] => content
            [1] => more content
            [2] => cont
            [3] => More more
            [4] => content
            [5] => content 2.0
            [6] => litl
        )

    [9] => Array
        (
            [0] => TOTALS        AGENTS:21
            [1] => total
            [2] => total
            [3] => total
            [4] => total
            [5] => total
        )

)
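
For comparison, here is a minimal Ruby sketch of the same trim-then-split idea, showing what happens when the outer pipes are not removed first. The single `row` string is a sample copied from the table above, and the variable names `naive` and `fields` are only illustrative.

```ruby
# One sample row, copied from the table above.
row = '| content   | more content | cont  | More more   | content    | content 2.0  | litl |'

# Splitting directly leaves an empty string where the leading "|" matched.
# (Ruby's String#split already drops trailing empty strings.)
naive = row.split(/\s*\|\s*/)

# Removing the outer pipes and their padding first avoids the empty element.
fields = row.strip.gsub(/\A\|\s*|\s*\|\z/, '').split(/\s*\|\s*/)

p naive.first
p fields
```

The second form mirrors what trim($line, '| ') does in the PHP answer, so no elements need to be discarded afterwards.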

In Ruby, using the builder gem:
require 'builder'

table = '+------------+-------------+-------+-------------+------------+---------------+----------+
| HEADING 1  | HEADING 2   | ETC   | ANOTHER     | HEADING3   | HEADING4     | SML |
+------------+-------------+-------+-------------+------------+---------------+----------+
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
+------------+-------------+-------+-------------+------------+--------------+----------+
| TOTALS        AGENTS:21  |  total|        total|       total|         total| total|
+------------+-------------+-------+-------------+------------+--------------+----------+';

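# Turn the text table into an array of rows, each an array of cell strings.
def parse_table(table)
  rows = []
  table.each_line do |line|
    # Border lines such as +-----+-----+ carry no data.
    next if line.start_with?('+')
    # Split on each | plus surrounding whitespace; drop the empty edge cells.
    rows << line.split(/\s*\|\s*/).reject(&:empty?)
  end
  rows
end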

def html_row(xml, columns)
  xml.tr do
    columns.each do |column|
      xml.td column
    end
  end
end

def html_table(rows)
  head_row = rows.first
  body_rows = rows[1..-1]

  xml = Builder::XmlMarkup.new :indent => 2
  xml.table do
    xml.thead do
      html_row xml, head_row
    end
    xml.tbody do
      body_rows.each do |body_row|
        html_row xml, body_row
      end
    end
  end.to_s
end


rows = parse_table(table)
html = html_table(rows)
puts html

@text = <<END
+------------+-------------+-------+-------------+------------+---------------+----------+
| HEADING 1  | HEADING 2   | ETC   | ANOTHER     | HEADING3   | HEADING4     | SML |
+------------+-------------+-------+-------------+------------+---------------+----------+
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
| content   | more content | cont  | More more   | content    | content 2.0  | litl |
+------------+-------------+-------+-------------+------------+--------------+----------+
| TOTALS        AGENTS:21  |  total|        total|       total|         total| total|
+------------+-------------+-------+-------------+------------+--------------+----------+
END
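# Capture everything between the outer pipes of each data line.
s = @text.scan(/^[|]\W(.*)[|]$/)
puts s
arr = []
s.each do |o|
  # o is a one-element array. Split its string on "|" and remove all
  # whitespace (this also removes spaces inside a cell, e.g. "more content").
  arr << o.first.split('|').map { |cell| cell.gsub(/\s+/, '') }
end
arr.each do |i|
  puts i
end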
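
As a sketch of how the scan approach could keep the spaces inside each cell, the variant below strips only the padding around each cell. It is not the answerer's code: the shortened `text` sample and the names `rows` and `inner` are assumptions made for illustration.

```ruby
# Shortened, hypothetical sample in the same layout as @text above.
text = <<~TABLE
  +-----------+--------------+------+
  | HEADING 1 | HEADING 2    | SML  |
  +-----------+--------------+------+
  | content   | more content | litl |
  +-----------+--------------+------+
TABLE

# scan returns one single-element array per matching line; border lines
# start with "+" and therefore never match.
rows = text.scan(/^\|(.*)\|$/).map do |(inner)|
  # Strip only the padding around each cell, keeping inner spaces.
  inner.split('|').map(&:strip)
end

rows.each { |row| p row }
```

Using strip instead of removing every whitespace character keeps values such as "more content" readable.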

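Hope this helps :)
The Builder-based code above is a complete Ruby solution. You do need to add a | manually in the last (TOTALS) row, though, if you want TOTALS and AGENTS:21 in separate cells. Its output: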

<table>
  <thead>
    <tr>
      <td>HEADING 1</td>
      <td>HEADING 2</td>
      <td>ETC</td>
      <td>ANOTHER</td>
      <td>HEADING3</td>
      <td>HEADING4</td>
      <td>SML</td>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>content</td>
      <td>more content</td>
      <td>cont</td>
      <td>More more</td>
      <td>content</td>
      <td>content 2.0</td>
      <td>litl</td>
    </tr>
    <tr>
      <td>content</td>
      <td>more content</td>
      <td>cont</td>
      <td>More more</td>
      <td>content</td>
      <td>content 2.0</td>
      <td>litl</td>
    </tr>
    <tr>
      <td>content</td>
      <td>more content</td>
      <td>cont</td>
      <td>More more</td>
      <td>content</td>
      <td>content 2.0</td>
      <td>litl</td>
    </tr>
    <tr>
      <td>content</td>
      <td>more content</td>
      <td>cont</td>
      <td>More more</td>
      <td>content</td>
      <td>content 2.0</td>
      <td>litl</td>
    </tr>
    <tr>
      <td>content</td>
      <td>more content</td>
      <td>cont</td>
      <td>More more</td>
      <td>content</td>
      <td>content 2.0</td>
      <td>litl</td>
    </tr>
    <tr>
      <td>content</td>
      <td>more content</td>
      <td>cont</td>
      <td>More more</td>
      <td>content</td>
      <td>content 2.0</td>
      <td>litl</td>
    </tr>
    <tr>
      <td>content</td>
      <td>more content</td>
      <td>cont</td>
      <td>More more</td>
      <td>content</td>
      <td>content 2.0</td>
      <td>litl</td>
    </tr>
    <tr>
      <td>content</td>
      <td>more content</td>
      <td>cont</td>
      <td>More more</td>
      <td>content</td>
      <td>content 2.0</td>
      <td>litl</td>
    </tr>
    <tr>
      <td>TOTALS        AGENTS:21</td>
      <td>total</td>
      <td>total</td>
      <td>total</td>
      <td>total</td>
      <td>total</td>
    </tr>
  </tbody>
</table>

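Output of the final loop in the scan-based Ruby snippet:
HEADING1
HEADING2
ETC
ANOTHER
HEADING3
HEADING4
SML
content
morecontent
cont
Moremore
content
content2.0
litl
content
morecontent
cont
Moremore
content
content2.0
litl
content
morecontent
cont
Moremore
content
content2.0
litl
content
morecontent
cont
Moremore
content
content2.0
litl
content
morecontent
cont
Moremore
content
content2.0
litl
content
morecontent
cont
Moremore
content
content2.0
litl
content
morecontent
cont
Moremore
content
content2.0
litl
content
morecontent
cont
Moremore
content
content2.0
litl
TOTALSAGENTS:21
total
total
total
total
total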
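This might not be as clean as it could be, but it works for this example :)

Use an HTML parser to select the text inside the pre tag, then use substring to extract the data (I assume the columns are at fixed positions). If the column widths are fixed within one table but differ between tables, you can analyze the header to work out the width of each column.
@nhahtdh The columns are not fixed width. I wish they were.
If | does not appear in the content, you can split on |. By fixed width I mean that each column has a fixed width (different columns may have different widths, but every row of a given column must have the same width).
What's with all the global variables? What is the point of using them here?
@paile Wow, awesome. So then I just need to build something like a mini scraper that reads the data from a local file and exports it to CSV? Or is there something ready-made for that?
So the output you want is CSV? Have a look at the CSV class in the Ruby standard library.
@paddle Yes. Thanks for the help. I looked at FasterCSV first, but it seems to be deprecated?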
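
Following the suggestion to use Ruby's CSV class, a minimal sketch of the export step might look like this. The shortened `table` sample and the names `rows` and `csv_text` are assumptions for illustration. In the real case the text would come from the page's pre block.

```ruby
require 'csv'

# Hypothetical shortened sample; one cell contains a comma on purpose.
table = <<~TABLE
  +-----------+---------------+------+
  | HEADING 1 | HEADING 2     | SML  |
  +-----------+---------------+------+
  | content   | more, content | litl |
  +-----------+---------------+------+
TABLE

# Skip border lines, then split the remaining rows on "|" plus padding.
# Note: reject(&:empty?) would also drop cells that are genuinely empty.
rows = table.each_line
            .reject { |line| line.start_with?('+') }
            .map { |line| line.strip.split(/\s*\|\s*/).reject(&:empty?) }

# CSV.generate quotes fields where needed, e.g. the cell with a comma.
csv_text = CSV.generate { |csv| rows.each { |row| csv << row } }
puts csv_text
```

Letting the CSV library build each line is safer than replacing | with , via gsub, because cells that contain commas or quotes are escaped correctly.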



