Ruby: splitting a complex file into a hash


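I am running a command-line program called Primer 3. It takes an input file and returns data to standard output. I am trying to write a Ruby script that accepts that data as input and puts the entries into a hash.

The returned results are shown below. I would like to split the data on the "=" sign, so that the hash looks like this:

{:SEQUENCE_ID => "example", :SEQUENCE_TEMPLATE => "GTAGTCAGTAGACNAT..etc", :SEQUENCE_TARGET => "37,21" etc }
I would also like to lowercase the keys, i.e.:

 {:sequence_id => "example", :sequence_template => "GTAGTCAGTAGACNAT..etc", :sequence_target => "37,21" etc }
This is my current script: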

#!/usr/bin/ruby
puts 'Primer 3 hash'

primer3 = {}
while line = gets do
  name, height = line.split(/\=/)
  primer3[name] = height.to_i
end

puts primer3
It returns:

Primer 3 hash
{"SEQUENCE_ID"=>0, "SEQUENCE_TEMPLATE"=>0, "SEQUENCE_TARGET"=>37, "PRIMER_TASK"=>0,     "PRIMER_PICK_LEFT_PRIMER"=>1, "PRIMER_PICK_INTERNAL_OLIGO"=>1,  "PRIMER_PICK_RIGHT_PRIMER"=>1, "PRIMER_OPT_SIZE"=>18, "PRIMER_MIN_SIZE"=>15, "PRIMER_MAX_SIZE"=>21, "PRIMER_MAX_NS_ACCEPTED"=>1, "PRIMER_PRODUCT_SIZE_RANGE"=>75, "P3_FILE_FLAG"=>1, "SEQUENCE_INTERNAL_EXCLUDED_REGION"=>37, "PRIMER_EXPLAIN_FLAG"=>1, "PRIMER_THERMODYNAMIC_PARAMETERS_PATH"=>0, "PRIMER_LEFT_EXPLAIN"=>0, "PRIMER_RIGHT_EXPLAIN"=>0, "PRIMER_INTERNAL_EXPLAIN"=>0, "PRIMER_PAIR_EXPLAIN"=>0, "PRIMER_LEFT_NUM_RETURNED"=>0, "PRIMER_RIGHT_NUM_RETURNED"=>0, "PRIMER_INTERNAL_NUM_RETURNED"=>0, "PRIMER_PAIR_NUM_RETURNED"=>0, ""=>0}
Data source

SEQUENCE_ID=example
SEQUENCE_TEMPLATE=GTAGTCAGTAGACNATGACNACTGACGATGCAGACNACACACACACACACAGCACACAGGTATTAGTGGGCCATTCGATCCCGACCCAAATCGATAGCTACGATGACG
SEQUENCE_TARGET=37,21
PRIMER_TASK=pick_detection_primers
PRIMER_PICK_LEFT_PRIMER=1
PRIMER_PICK_INTERNAL_OLIGO=1
PRIMER_PICK_RIGHT_PRIMER=1
PRIMER_OPT_SIZE=18
PRIMER_MIN_SIZE=15
PRIMER_MAX_SIZE=21
PRIMER_MAX_NS_ACCEPTED=1
PRIMER_PRODUCT_SIZE_RANGE=75-100
P3_FILE_FLAG=1
SEQUENCE_INTERNAL_EXCLUDED_REGION=37,21
PRIMER_EXPLAIN_FLAG=1
PRIMER_THERMODYNAMIC_PARAMETERS_PATH=/usr/local/Cellar/primer3/2.3.4/bin/primer3_config/
PRIMER_LEFT_EXPLAIN=considered 65, too many Ns 17, low tm 48, ok 0
PRIMER_RIGHT_EXPLAIN=considered 228, low tm 159, high tm 12, high hairpin stability 22, ok 35
PRIMER_INTERNAL_EXPLAIN=considered 0, ok 0
PRIMER_PAIR_EXPLAIN=considered 0, ok 0
PRIMER_LEFT_NUM_RETURNED=0
PRIMER_RIGHT_NUM_RETURNED=0
PRIMER_INTERNAL_NUM_RETURNED=0
PRIMER_PAIR_NUM_RETURNED=0
=
$primer3_core
OK, I've (nearly) got it. The only problem is that a \n is added to the end of every value.

puts 'Primer 3 hash'

primer3 = {}
while line = gets do
  key, value = line.split(/\=/)
  puts key
  puts value
  primer3[key.downcase] = value
end

puts primer3

{"sequence_id"=>"example\n",  "sequence_template"=>"GTAGTCAGTAGACNATGACNACTGACGATGCAGACNACACACACACACACAGCACACAGGTATTAGTGGGCCATTCGATCCCGACCCAAATCGATAGCTACGATGACG\n", "sequence_target"=>"37,21\n", "primer_task"=>"pick_detection_primers\n", "primer_pick_left_primer"=>"1\n", "primer_pick_internal_oligo"=>"1\n", "primer_pick_right_primer"=>"1\n", "primer_opt_size"=>"18\n", "primer_min_size"=>"15\n", "primer_max_size"=>"21\n", "primer_max_ns_accepted"=>"1\n", "primer_product_size_range"=>"75-100\n", "p3_file_flag"=>"1\n", "sequence_internal_excluded_region"=>"37,21\n", "primer_explain_flag"=>"1\n", "primer_thermodynamic_parameters_path"=>"/usr/local/Cellar/primer3/2.3.4/bin/primer3_config/\n", "primer_left_explain"=>"considered 65, too many Ns 17, low tm 48, ok 0\n", "primer_right_explain"=>"considered 228, low tm 159, high tm 12, high hairpin stability 22, ok 35\n", "primer_internal_explain"=>"considered 0, ok 0\n", "primer_pair_explain"=>"considered 0, ok 0\n", "primer_left_num_returned"=>"0\n", "primer_right_num_returned"=>"0\n", "primer_internal_num_returned"=>"0\n", "primer_pair_num_returned"=>"0\n", ""=>"\n"}
#!/usr/bin/ruby
puts 'Primer 3 hash'

primer3 = {}
while line = gets do
  key, value = line.split(/=/, 2)
  primer3[key.downcase.to_sym] = value.chomp
end

puts primer3

For fun, here are a couple of purely functional solutions. Both assume that you have already pulled the data out of the file, e.g.

my_data = ARGF.read # read the file passed on the command line
This one feels a bit gross, but it is a (long) one-liner :)
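
A sketch of what such a one-liner might look like (an assumption for illustration, not necessarily the answerer's exact code). It assumes my_data holds the whole text; the sample value below is made up and shortened:

```ruby
# Hypothetical sample standing in for the Primer 3 output.
my_data = "SEQUENCE_ID=example\nSEQUENCE_TARGET=37,21\n"

# Split each line on the first '=' only, downcase and symbolize the key
# (index 0), and leave the value (index 1) untouched.
hash = Hash[my_data.lines.map { |line| line.chomp.split('=', 2).each_with_index.map { |s, i| i.zero? ? s.downcase.to_sym : s } }]

p hash
```

The index is only used to tell the key from the value within each pair.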

This one is two lines, but it feels cleaner than the index-based version:

keys,values = my_data.lines.map{ |line| line.chomp.split('=',2) }.transpose
hash = Hash[ keys.map(&:downcase).map(&:to_sym).zip(values) ]
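Both of these are probably less efficient than the answer you have already accepted, and certainly more memory-intensive; iterating over the lines and gradually mutating the hash is the best way to go. These non-mutating variations are just a mental exercise.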


Your final answer should use ARGF to allow filenames on the command line or input via STDIN. I would write it like this:

#!/usr/bin/ruby

module Primer3
  def self.parse( file )
    {}.tap do |primer3|
      # Process one line at a time, without reading it all into memory first
      file.each_line do |line|  
        key, value = line.chomp.split('=', 2)
        primer3[key.downcase.to_sym] = value
      end
    end
  end
end

Primer3.parse( ARGF ) if __FILE__==$0

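This way you can either call the file from the command line (with or without STDIN), or require the file and use the module function it defines from other code.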
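
As a rough illustration of calling the module function from other code, the sketch below repeats the module so that it runs on its own, and feeds it a StringIO holding made-up sample data instead of a real Primer 3 run. The `next if` guard is an addition that is not in the answer above: it assumes you want to skip blank lines and the lone "=" terminator rather than store an empty key.

```ruby
require 'stringio'

module Primer3
  def self.parse(file)
    {}.tap do |primer3|
      # Process one line at a time, without reading it all into memory first
      file.each_line do |line|
        key, value = line.chomp.split('=', 2)
        # Assumption (not in the original answer): ignore blank lines and
        # the "=" record terminator, which would otherwise yield an empty key.
        next if key.nil? || key.empty?
        primer3[key.downcase.to_sym] = value
      end
    end
  end
end

# Any object that responds to each_line works; StringIO stands in for a file.
sample = StringIO.new("SEQUENCE_ID=example\nPRIMER_PRODUCT_SIZE_RANGE=75-100\n=\n")
result = Primer3.parse(sample)

puts result[:sequence_id]
puts result[:primer_product_size_range]
```

Values stay as strings; converting numeric fields such as PRIMER_OPT_SIZE to integers would be a separate step.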

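Add .chomp to the value to remove the newline.

@SeanGeneva I didn't downvote, but answers that don't exactly match the criteria of the question often get downvoted. For example, this answer uses gets instead of a file, and uses strings instead of symbols as keys.

Yes, this is perfect! Thank you. One question: what does the 2 do here: line.split(/=/,2)

@SeanGeneva It limits the number of fields returned by split, in case the value part contains an =.

Wow, awesome. Cheers.

+1 for a complete, well-structured question, including what you have already tried.

Brilliant answer. Thank you very much!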
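
To make the point about the limit argument concrete, here is a small sketch using a made-up line whose value itself contains an "=" (PRIMER_NOTE is a hypothetical key, not a real Primer 3 tag):

```ruby
# Hypothetical input line; the value "a=b" contains the separator.
line = "PRIMER_NOTE=a=b\n"

without_limit = line.chomp.split('=')     # every '=' acts as a separator
with_limit    = line.chomp.split('=', 2)  # at most two fields: key, rest of line

p without_limit  # => ["PRIMER_NOTE", "a", "b"]
p with_limit     # => ["PRIMER_NOTE", "a=b"]
```

With key, value = ... destructuring, the version without a limit would silently drop everything after the second "=".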