Algorithm: Select as many rows as possible while guaranteeing item density in each column


Suppose we have a 0-1 matrix such as:

0, 1, 1, 1, 1 
1, 1, 1, 1, 1 
1, 0, 1, 0, 1 
1, 1, 1, 1, 1 
0, 1, 1, 0, 1 
1, 1, 0, 1, 0  
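
The goal is to select as many rows as possible from this matrix to form a new matrix, such that every column of the new matrix contains at least 80% 1s.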

For example, for the matrix above, the result would be:

0, 1, 1, 1, 1 
1, 1, 1, 1, 1 
1, 0, 1, 0, 1 
1, 1, 1, 1, 1 
1, 1, 0, 1, 0  
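
The density condition can be checked mechanically. The following is a minimal sketch, not part of the original post; the helper name `dense_enough?` and its `min_percent` parameter are made up for illustration. It uses integer arithmetic so that a column sitting exactly at 80% is accepted without floating-point rounding concerns.

```ruby
# Hypothetical helper (not from the original post): returns true when every
# column of `rows` holds at least `min_percent` percent 1s.
def dense_enough?(rows, min_percent = 80)
  return true if rows.empty?
  rows.transpose.all? do |column|
    # Integer comparison: ones / size >= min_percent / 100
    column.count(1) * 100 >= min_percent * rows.size
  end
end

full = [
  [0, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 0, 1, 0, 1],
  [1, 1, 1, 1, 1],
  [0, 1, 1, 0, 1],
  [1, 1, 0, 1, 0]
]
result = full.reject { |row| row == [0, 1, 1, 0, 1] }

p dense_enough?(full)   # => false (the first column has only 4 of 6 ones)
p dense_enough?(result) # => true  (every column has exactly 4 of 5 ones)
```

Checking a candidate is cheap; the hard part of the problem is choosing which rows to drop.
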
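A greedy algorithm obviously doesn't work for this problem, and as far as I can see, plain DP doesn't either.

In the real problem, the matrix will have about 7000 rows and 100 columns. Because there will be some all-1 rows, at least one solution will always exist.

Can anyone give me some pointers? Thanks.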

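Implementation of the simple answer:

(Overly simple, because the search is "timid": it takes only one step before evaluating whether a state is acceptable, and it does not reorder the rows to look for ones that would help the columns reach 80% sooner.)

Timing of code.rb with 7000 random rows: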

achieved: 2742
accepted: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1]
achieved: 2743
achieved: 2743
solution.percents = [81.18847976667881, 80.42289464090412, 81.00619759387531, 80.93328472475392, 80.67808968282901, 80.38643820634341, 81.22493620123952, 81.55304411228582, 81.22493620123952, 80.20415603353992, 80.45935107546481, 81.22493620123952, 80.56872037914691, 80.02187386073642, 81.15202333211812, 82.39154210718192, 80.02187386073642, 80.24061246810062, 81.26139263580022, 80.38643820634341, 80.02187386073642, 80.27706890266131, 80.16769959897921, 81.00619759387531, 80.49580751002551, 81.37076193948232, 81.69886985052861, 80.24061246810062, 81.00619759387531, 80.34998177178271, 80.20415603353992, 81.69886985052861, 81.51658767772511, 80.64163324826832, 80.02187386073642, 80.02187386073642, 80.02187386073642, 80.02187386073642, 80.34998177178271, 80.27706890266131, 80.02187386073642, 80.16769959897921, 80.82391542107182, 81.29784907036091, 81.77178271965002, 80.75100255195042, 81.84469558877142, 80.53226394458622, 80.02187386073642, 80.86037185563251, 80.09478672985782, 81.18847976667881, 81.15202333211812, 80.31352533722202, 82.28217280349982, 82.02697776157491, 81.48013124316441, 80.64163324826832, 80.89682829019321, 81.11556689755741, 81.26139263580022, 80.64163324826832, 80.64163324826832, 80.45935107546481, 80.86037185563251, 80.31352533722202, 80.05833029529711, 81.40721837404301, 81.00619759387531, 81.77178271965002, 80.96974115931461, 81.22493620123952, 81.37076193948232, 80.49580751002551, 80.05833029529711, 80.89682829019321, 81.44367480860372, 80.02187386073642, 80.02187386073642, 81.55304411228582, 80.67808968282901, 80.49580751002551, 81.26139263580022, 80.02187386073642, 80.27706890266131, 80.42289464090412, 80.45935107546481, 81.55304411228582, 81.77178271965002, 80.45935107546481, 81.73532628508931, 80.75100255195042, 83.04775792927451, 80.45935107546481, 80.02187386073642, 80.02187386073642, 81.04265402843602, 81.51658767772511, 80.89682829019321, 81.58950054684651]
min solution.percents = 80.02187386073642
solution has 2743 rows
consider has 4267 rows
went through 3 loops, achievment at end of each loop: [2742, 2743, 2743] rows

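real    5m57.637s
user    5m57.446s
sys     0m0.335s

This looks NP-hard to me. I would go for something like the approximation schemes for subset sum and integer linear programming. On the other hand, DP might help; can you show what you have tried with DP?

@G.Bach I have thought about DP, and I have also tried to reduce the real problem to one where DP can be used. But at the moment I am mostly sure, if not certain, that DP won't help :-(

One approach is a bounded search: start with the most likely candidate rows and work down the list to see which other rows can be added. First sort the rows by the number of bits set, then examine each row in turn to see whether it can be added to the solution without breaking the constraint. Keep running down the list of rows not yet added until you have gone through the whole list without adding a new row to the solution. This produces a solution.

If you want to try a second pass, count the number of bits set in each column and weight each row's score so that rows with bits set in the low-count columns rank higher (to balance the percentages). With that as a starting point, you can decide how much extra time to spend searching for a better solution.

Other starting points include starting with all rows selected and trying to find the smallest set of rows to remove, perhaps giving more weight to rows with zeros in the worst-scoring columns.

code.rb:
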
#!/usr/bin/env ruby
data = [
[0, 1, 1, 1, 1 ],
[1, 1, 1, 1, 1 ],
[1, 0, 1, 0, 1 ],
[1, 1, 1, 1, 1 ],
[0, 1, 1, 0, 1 ],
[1, 1, 0, 1, 0 ]
]

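# Random test data: 990 rows, each with its own random density of 1s, plus
# 10 all-1 rows. This reassigns `data`; comment it out to run the small
# example above instead.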
data = 990.times.collect do
         limit = rand(1000)
         100.times.collect do
          rand(1000) <= limit ? 1 : 0
        end
      end + 10.times.collect { 100.times.collect { 1 } }

#puts "data = #{data.inspect}"

def sum(list)
  list.inject(0){|res,v| res + v}
end

def column_percent(array)
  multiplier = 100.0 / array.count
  array.transpose.collect{|column| sum(column) * multiplier}
end

sorted_data = data.sort{|a,b| sum(b) <=> sum(a)}

#puts sorted_data.inspect
puts "Data percentages: #{column_percent(data).inspect}"
puts "Average over data: #{column_percent(data).min.inspect}"


solution = [ ]
consider = sorted_data
discarded = [ ]
loops = 0
done_something = true
achieved = [ ]
while (done_something)
  loops += 1
  done_something = false
  while (!consider.empty?)
    row = consider.shift
    #puts "Considering: #{row.inspect}"
    if column_percent(solution + [ row ]).min >= 80.0
      done_something = true
      solution.push row
    else
      discarded.push row
    end
  end
  achieved << solution.count
  consider = discarded
  discarded = [ ]
end

puts "solution: #{solution.inspect}" if solution.count < 10
puts "solution.percents = #{column_percent(solution).inspect}"
puts "min solution.percents = #{column_percent(solution).min.inspect}"
puts "solution has #{solution.count.inspect} rows"
puts "consider has #{consider.count.inspect} rows"
puts "went through #{loops} loops, achievment at end of each loop: #{achieved.inspect} rows"

exit

Output with the example data (the 6-row matrix at the top of code.rb):

Data percentages: [66.66666666666667, 83.33333333333334, 83.33333333333334, 66.66666666666667, 83.33333333333334]
Average over data: 66.66666666666667
solution: [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
solution.percents = [100.0, 100.0, 100.0, 100.0, 100.0]
min solution.percents = 100.0
solution has 2 rows
consider has 4 rows
went through 2 loops, achievment at end of each loop: [2, 2] rows
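
On the example data the greedy pass stops at 2 rows, although 5 are possible. The answer's other suggested starting point (begin with all rows selected and remove rows, favouring those with zeros in the worst-scoring columns) can be sketched as below. This is only an illustrative heuristic under that reading of the suggestion, not the answerer's code: the name `prune_rows` and the exact scoring rule are assumptions, it is unoptimized (it recounts every column after each removal), and it gives no guarantee of finding the largest possible selection.

```ruby
# Hypothetical sketch: start from all rows and repeatedly drop the row that
# has the most zeros in columns currently below the density threshold.
def prune_rows(rows, min_percent = 80)
  kept = rows.dup
  until kept.empty?
    ones = kept.transpose.map { |column| column.count(1) }
    # Indices of columns that are still below min_percent percent 1s.
    short = ones.each_index.select { |j| ones[j] * 100 < min_percent * kept.size }
    break if short.empty?
    # Pick the row with the most zeros in those deficient columns.
    worst = kept.each_index.max_by { |i| short.count { |j| kept[i][j] == 0 } }
    kept.delete_at(worst)
  end
  kept
end

example = [
  [0, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 0, 1, 0, 1],
  [1, 1, 1, 1, 1],
  [0, 1, 1, 0, 1],
  [1, 1, 0, 1, 0]
]
pruned = prune_rows(example)
puts "kept #{pruned.size} rows"   # prints "kept 5 rows"
```

For the example, columns 1 and 4 start at 4 of 6 ones, the only row with zeros in both is `[0, 1, 1, 0, 1]`, and removing it leaves every column at exactly 80%, which matches the result given in the question.
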


$ time ruby code.rb # 1000 rows

Data percentages: [50.900000000000006, 48.900000000000006, 50.2, 49.7, 47.5, 50.800000000000004, 50.2, 48.900000000000006, 51.300000000000004, 49.1, 49.900000000000006, 48.7, 49.7, 48.900000000000006, 50.300000000000004, 52.400000000000006, 51.0, 49.900000000000006, 50.800000000000004, 49.6, 49.0, 50.1, 49.1, 48.7, 50.800000000000004, 49.0, 49.2, 49.900000000000006, 48.800000000000004, 50.1, 50.2, 49.6, 49.900000000000006, 50.2, 50.900000000000006, 49.2, 51.7, 49.300000000000004, 48.400000000000006, 49.400000000000006, 49.5, 49.6, 47.7, 50.0, 46.900000000000006, 51.0, 50.0, 51.5, 50.5, 49.300000000000004, 49.1, 50.400000000000006, 47.800000000000004, 51.800000000000004, 50.2, 49.400000000000006, 49.400000000000006, 49.0, 51.5, 48.0, 53.7, 49.1, 51.300000000000004, 50.400000000000006, 50.800000000000004, 48.900000000000006, 50.6, 47.0, 50.300000000000004, 49.400000000000006, 50.800000000000004, 51.300000000000004, 52.900000000000006, 50.0, 51.300000000000004, 47.800000000000004, 51.300000000000004, 47.6, 49.900000000000006, 54.5, 49.5, 51.800000000000004, 50.800000000000004, 50.400000000000006, 51.0, 50.1, 47.7, 49.6, 53.300000000000004, 50.2, 49.7, 51.5, 47.900000000000006, 49.7, 48.0, 48.6, 49.6, 48.900000000000006, 50.1, 50.7]
Average over data: 46.900000000000006
solution.percents = [84.02366863905326, 83.13609467455622, 85.50295857988166, 81.65680473372781, 80.17751479289942, 82.54437869822486, 83.13609467455622, 81.95266272189349, 82.54437869822486, 80.4733727810651, 87.27810650887574, 82.84023668639054, 83.4319526627219, 82.54437869822486, 80.76923076923077, 84.31952662721893, 81.36094674556213, 85.79881656804734, 82.24852071005917, 83.72781065088758, 81.65680473372781, 82.24852071005917, 80.76923076923077, 82.54437869822486, 85.20710059171599, 83.72781065088758, 80.17751479289942, 83.72781065088758, 82.84023668639054, 81.95266272189349, 84.61538461538461, 80.17751479289942, 81.95266272189349, 81.36094674556213, 84.02366863905326, 84.61538461538461, 84.02366863905326, 83.72781065088758, 82.24852071005917, 84.31952662721893, 84.02366863905326, 84.02366863905326, 80.17751479289942, 82.84023668639054, 80.4733727810651, 82.84023668639054, 83.13609467455622, 82.84023668639054, 80.17751479289942, 80.17751479289942, 82.84023668639054, 83.72781065088758, 80.17751479289942, 81.95266272189349, 81.95266272189349, 82.84023668639054, 80.76923076923077, 81.95266272189349, 81.95266272189349, 82.84023668639054, 85.20710059171599, 83.4319526627219, 83.72781065088758, 80.17751479289942, 84.31952662721893, 82.54437869822486, 86.09467455621302, 81.95266272189349, 82.54437869822486, 81.95266272189349, 81.95266272189349, 83.72781065088758, 83.4319526627219, 84.61538461538461, 86.68639053254438, 81.06508875739645, 83.4319526627219, 80.76923076923077, 80.76923076923077, 85.79881656804734, 82.84023668639054, 85.79881656804734, 84.31952662721893, 82.24852071005917, 84.02366863905326, 80.76923076923077, 80.17751479289942, 84.9112426035503, 83.72781065088758, 84.61538461538461, 83.13609467455622, 84.61538461538461, 84.61538461538461, 82.54437869822486, 80.76923076923077, 82.84023668639054, 80.4733727810651, 80.17751479289942, 82.84023668639054, 80.17751479289942]
min solution.percents = 80.17751479289942
solution has 338 rows
consider has 662 rows
went through 3 loops, achievment at end of each loop: [337, 338, 338] rows

real    0m7.588s
user    0m7.435s
sys 0m0.142s
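
Most of the run time above likely comes from `column_percent(solution + [ row ])`, which rebuilds and transposes the whole candidate solution for every row considered. A possible variant keeps a running count of 1s per column instead, so each row costs one pass over its own columns. This is a sketch under that assumption, not a measured replacement: `greedy_select` is a made-up name, no timings are claimed for it, and because rows with equal sums may be ordered differently than in code.rb, the selected rows can differ slightly on large inputs.

```ruby
# Hypothetical variant of the loop in code.rb using running column totals.
def greedy_select(rows, min_percent = 80)
  return [] if rows.empty?
  ones = Array.new(rows.first.size, 0)  # running count of 1s per column
  solution = []
  # Densest rows first, as in code.rb.
  consider = rows.sort_by { |row| -row.count(1) }
  loop do
    discarded = []
    added = false
    consider.each do |row|
      size = solution.size + 1
      # Would every column still hold at least min_percent percent 1s?
      fits = row.each_index.all? { |j| (ones[j] + row[j]) * 100 >= min_percent * size }
      if fits
        row.each_index { |j| ones[j] += row[j] }
        solution << row
        added = true
      else
        discarded << row
      end
    end
    consider = discarded
    break unless added  # stop after a full pass that added nothing
  end
  solution
end

example = [
  [0, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 0, 1, 0, 1],
  [1, 1, 1, 1, 1],
  [0, 1, 1, 0, 1],
  [1, 1, 0, 1, 0]
]
p greedy_select(example)  # => [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
```

On the example data this reproduces the 2-row result shown earlier, so it shares the "timid" behaviour of the original search; it only changes how the acceptance test is computed.
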