Working with large data sets and Ruby (MySQL, Ruby on Rails 3)

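I could really use some help. I am struggling to display a dashboard backed by a large amount of data.

With about 2k records, the average time is about 2 seconds.

In the MySQL console the query returns 150k rows in under 3.5 seconds. The same query in Ruby takes more than 4 minutes from executing the query until all the objects are ready.

Goal: optimize the data handling further before adding a cache server. I am using Ruby 1.9.2, Rails 3.0 and MySQL (mysql2 gem).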

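Questions:

  • Does working with hashes hurt performance?
  • Should I first put everything into one primary hash and then manipulate the data I need from there?
  • Is there anything else I can do to help performance?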
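Rows in the DB:

  • GasStations and US Census have about 150,000 records each
  • Person has about 100,000 records
  • Cars has about 200,000 records
  • FillUps has about 2.3 million records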
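Required for the dashboard (queries are based on time periods such as the last 24 hours, the last week, etc.). All data is returned to the JS as JSON:

  • Gas stations, with both gas station and US Census data (zip code, name, city, population)
  • Top 20 cities with the most fill-ups
  • Top 10 cars with the most fill-ups
  • Cars grouped by number of fill-ups
Code (6-month sample, returns about 100k+ records):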


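I don't use RoR, but returning 100k rows for a dashboard is never going to be fast. I strongly suggest that you build or maintain summary tables and run GROUP BYs in the database, so that your data set is already summarized before it reaches the presentation layer.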
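
A minimal sketch of what grouping on the database side could look like for the "top 20 cities" panel. The table and column names (`fill_ups`, `gas_stations`, `uscensus`, `gas_station_id`, `zipcode`, `city`) are assumptions inferred from the models in the question, not the asker's actual schema, and `top_cities_sql` is a hypothetical helper that only builds the SQL string.

```ruby
# Builds a query that lets MySQL do the grouping, so Ruby receives at most
# `limit` rows instead of one row per fill-up.
# Table and column names are assumed, not taken from the real schema.
def top_cities_sql(limit = 20)
  limit = Integer(limit) # raises ArgumentError on non-numeric input
  <<-SQL
    SELECT u.city, COUNT(*) AS fill_ups
    FROM fill_ups f
    JOIN gas_stations g ON g.id = f.gas_station_id
    JOIN uscensus u ON u.zipcode = g.zipcode
    WHERE f.created_at >= ?
    GROUP BY u.city
    ORDER BY fill_ups DESC
    LIMIT #{limit}
  SQL
end
```

The `?` placeholder is left for the caller to bind, for example with `FillUp.find_by_sql([top_cities_sql, 6.months.ago])` in Rails. With only 20 rows coming back, the cost of instantiating the results should no longer matter much.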

A nitpick here: Car, GasStation, FillUp. I find the Ruby convention much better: car, gas_station, fill_up.
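
Is it a requirement to return the entire data set up front, rather than something like …?

Nope, it can be fetched in batches, but the Ruby approach is very slow. I will try it and report the results. As for "better", I agree and it is easier to read, but performance is the key.

OK, a quick test selecting id, city and fipscode from the US Census table, 149,318 rows in total:

  • Standard Ruby: 149318 and took: 6.144613
  • Find in batches: 149318 and took: 6.642666
  • Mysql gem raw query: 149318 and took: 1.851102

We tried "summary" tables, but that started to get messy because we write a lot of new data every 15 minutes, so all the summaries would also need to be updated every 15 minutes.

Still, every 15 minutes is probably better than every time the dashboard refreshes. Whatever you do, I would try to keep the summarizing on the database side.
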
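
Timings like the ones quoted in the comments can be collected with Ruby's standard `Benchmark` module. The sketch below is a hypothetical harness: the `time_candidates` helper and the stand-in lambdas are assumptions for illustration only. In the real app, each lambda would wrap one of the loading strategies being compared.

```ruby
require 'benchmark'

# Runs each named candidate once and returns { label => [row_count, seconds] }.
# Each candidate is a callable that returns a collection of rows.
def time_candidates(candidates)
  results = {}
  candidates.each do |label, loader|
    rows = nil
    seconds = Benchmark.realtime { rows = loader.call }
    results[label] = [rows.size, seconds]
  end
  results
end

# Stand-in loaders; replace the bodies with the real queries.
candidates = {
  'array of hashes' => lambda { (1..1000).map { |i| { 'id' => i } } },
  'array of arrays' => lambda { (1..1000).map { |i| [i] } }
}

time_candidates(candidates).each do |label, (count, seconds)|
  puts format('%s: %d rows and took: %.6f', label, count, seconds)
end
```

A single run per candidate is noisy, so repeating each measurement a few times before drawing conclusions is probably worthwhile.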
# For simplicity, the select clause is removed here. The real query drops columns I don't need (updated_at, gas_station.created_at, etc.) instead of returning every column of each table.
@primary_data = FillUp.includes(:car, :gas_station => :uscensus).where('fill_ups.created_at >= ?', 6.months.ago) # this takes about 4+ minutes

# then tried

@primary_data = FillUp.find_by_sql('some long sql query...') # took longer than before
# Note for others: the SQL query did some preprocessing for me, which added attributes to the returned rows. The query took < 4 seconds in the DB console. With these extra attributes it took longer in Ruby, as if Ruby were checking each row to map the attributes.

# then tried

# As seen at http://stackoverflow.com/questions/4456834/ruby-on-rails-storing-and-accessing-large-data-sets
MY_MAP = Hash[ActiveRecord::Base.connection.select_all('SELECT thingone, thingtwo from table').map { |one| [one['thingone'], one['thingtwo']] }]
# That took 23 seconds and also gained a mapping of additional data that was being processed later, so it is much faster.

# Currently using the code below, which takes about 10 seconds.
# Although this is faster, the query itself still takes only 3.5 seconds; parsing the rows into the hashes adds the overhead.
cars = {}
gasstations = {}
cities = {}
filled = {}

client = Mysql2::Client.new(:host => "localhost", :username => "root")
# mysql2 returns each row as a hash (:as => :hash) or an array (:as => :array)
client.query("SELECT sum(fill_ups_grouped_by_car_id) as filled, fillups.car_id, cars.make as make, gasstations.name as name,  ....", :stream => true, :as => :hash).each do |row|
  # this returns fill-ups grouped by car: fill_ups.car_id, car make, gas station name, gas station zip, gas station city, city population
  if cities[row['city']]
    cities[row['city']]['fill_ups']  = (cities[row['city']]['fill_ups']  + row['filled'])
  else
    cities[row['city']] = {'fill_ups' => row['filled'], 'population' => row['population']}
  end
  if gasstations[row['name']]
    gasstations[row['name']]['fill_ups'] = (gasstations[row['name']]['fill_ups'] + row['filled'])
  else
    gasstations[row['name']] = {'city' => row['city'], 'zip' => row['zip'], 'fill_ups' => row['filled']} # assumes the query aliases the zip column as zip
  end
  if cars[row['make']]
    cars[row['make']] = (cars[row['make']] + row['filled'])
  else
    cars[row['make']] = row['filled']
  end
  if filled[row['filled']] # check the counter hash, not the row, so a new key starts at 1
    filled[row['filled']] = (filled[row['filled']] + 1)
  else
    filled[row['filled']] = 1
  end
end

# Models
class Person < ActiveRecord::Base
  has_many :cars
end

class Car < ActiveRecord::Base
  belongs_to :person
  belongs_to :uscensus, :foreign_key => :zipcode, :primary_key => :zipcode
  has_many :fill_ups
  has_many :gas_stations, :through => :fill_ups
end

class GasStation < ActiveRecord::Base
  belongs_to :uscensus, :foreign_key => :zipcode, :primary_key => :zipcode
  has_many :fill_ups
  has_many :cars, :through => :fill_ups
end

class FillUp < ActiveRecord::Base
  # log of every time a person fills up their gas
  belongs_to :car
  belongs_to :gas_station
end

class Uscensus < ActiveRecord::Base
  # Basic data about an area, based on zip code
end
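
On the question about hashes: the accumulation loop from the question can be written more compactly with default values, which also removes the nil checks that are easy to get wrong. This is a minimal sketch over in-memory rows. The `aggregate` helper name and the `zip` key are assumptions, and the sample rows are made up purely to exercise the code.

```ruby
# Accumulates per-city, per-station, per-make and per-count totals from
# rows shaped like the ones the streaming query yields.
def aggregate(rows)
  cities   = {}
  stations = {}
  cars     = Hash.new(0) # make => total fill-ups
  filled   = Hash.new(0) # fill-up count => number of rows with that count

  rows.each do |row|
    n = row['filled']

    # ||= creates the entry the first time a city or station is seen
    city = (cities[row['city']] ||= { 'fill_ups' => 0, 'population' => row['population'] })
    city['fill_ups'] += n

    station = (stations[row['name']] ||= { 'city' => row['city'], 'zip' => row['zip'], 'fill_ups' => 0 })
    station['fill_ups'] += n

    cars[row['make']] += n
    filled[n] += 1
  end

  { 'cities' => cities, 'gas_stations' => stations, 'cars' => cars, 'filled' => filled }
end

# Made-up sample rows, only to show the shape of the input and result.
rows = [
  { 'filled' => 3, 'make' => 'Ford',   'name' => 'A', 'city' => 'X', 'zip' => '1', 'population' => 10 },
  { 'filled' => 2, 'make' => 'Ford',   'name' => 'B', 'city' => 'X', 'zip' => '1', 'population' => 10 },
  { 'filled' => 3, 'make' => 'Toyota', 'name' => 'A', 'city' => 'X', 'zip' => '1', 'population' => 10 }
]
totals = aggregate(rows)
```

A "top 20" list can then be taken with something like `totals['cities'].sort_by { |_, v| -v['fill_ups'] }.first(20)`, although doing the grouping in SQL keeps far fewer rows flowing into Ruby in the first place.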