SQL: How do I join against this statistical data? (tags: sql, postgresql, join, postgresql-9.1)

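First, I apologize for the question's title. I don't know the statistical terminology, or what this kind of join is called.

I have a query* that essentially generates three things: a random_sex, a random_first, and a random_last. I'm now trying to join these against census data.

Basically, the census data sits in a table like this: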

    name    | freq  | cumfreq | rank | name_type 
------------+-------+---------+------+-----------
 SMITH      | 1.006 |   1.006 |    1 | LAST
 JOHNSON    |  0.81 |   1.816 |    2 | LAST
 WILLIAMS   | 0.699 |   2.515 |    3 | LAST
 JONES      | 0.621 |   3.136 |    4 | LAST
 BROWN      | 0.621 |   3.757 |    5 | LAST
 DAVIS      |  0.48 |   4.237 |    6 | LAST
 MILLER     | 0.424 |    4.66 |    7 | LAST
 WILSON     | 0.339 |       5 |    8 | LAST
 MOORE      | 0.312 |   5.312 |    9 | LAST
 TAYLOR     | 0.311 |   5.623 |   10 | LAST
 ANDERSON   | 0.311 |   5.934 |   11 | LAST
 THOMAS     | 0.311 |   6.245 |   12 | LAST
 JACKSON    |  0.31 |   6.554 |   13 | LAST
 WHITE      | 0.279 |   6.834 |   14 | LAST
 HARRIS     | 0.275 |   7.109 |   15 | LAST
 MARTIN     | 0.273 |   7.382 |   16 | LAST
 THOMPSON   | 0.269 |   7.651 |   17 | LAST
 GARCIA     | 0.254 |   7.905 |   18 | LAST
 MARTINEZ   | 0.234 |    8.14 |   19 | LAST
In this case, given a row like this:

 random_sex |   random_first   |    random_last    
------------+------------------+-------------------
 male       | 47.7101715711225 | 24.3833348881337
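I want it to join like this (procedurally): for each random number, take the first row, in cumfreq order, whose cumfreq exceeds that number, restricted to the matching name_type.

So this gentleman's name would be Silver Harper. I've never met one in my life, but still.

Instead of the random numbers, I want my query to return "Silver" and "Harper". How can I make it work like that?


Footnote

*: Simplified, the query is: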

SELECT
   CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
   , RANDOM() * 90.020 AS random_first -- dataset is 90% of most popular
   , RANDOM() * 90.483 AS random_last
FROM generate_series(1,10,1);

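Honestly, I don't know statistics either, but I think this is what you want.

Let's call the derived table that returns the random columns RANDOMS: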

WITH RANDOMS AS
(
   SELECT
   CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
   , RANDOM() * 90.020 AS random_first 
   , RANDOM() * 90.483 AS random_last
   FROM generate_series(1,10,1)
)
SELECT R.random_sex,
       (
        SELECT A.name
        FROM census.names A
        WHERE A.cumfreq > R.random_first
        -- pick MALE_FIRST or FEMALE_FIRST to match the generated sex
        AND A.name_type = UPPER(R.random_sex) || '_FIRST'
        ORDER BY A.cumfreq ASC LIMIT 1
       ) AS first_name,
       (
        SELECT A.name
        FROM census.names A
        WHERE A.cumfreq > R.random_last
        AND A.name_type = 'LAST'
        ORDER BY A.cumfreq ASC LIMIT 1
       ) AS last_name
FROM RANDOMS R;
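
To try this lookup without the census table, here is a minimal sketch that stands on its own. It is an illustration under assumptions, not the asker's real data: the inline `names` rows stand in for census.names (the LAST rows come from the table in the question, the first-name rows are illustrative), fixed numbers replace RANDOM() so the result is reproducible, and the first-name name_type is derived from random_sex on the assumption that the values are MALE_FIRST and FEMALE_FIRST.

```sql
WITH names (name, cumfreq, name_type) AS (
   -- Stand-in for census.names; first-name rows are illustrative only.
   VALUES
     ('SMITH'::text, 1.006, 'LAST'::text),
     ('JOHNSON',     1.816, 'LAST'),
     ('WILLIAMS',    2.515, 'LAST'),
     ('JAMES',       3.318, 'MALE_FIRST'),
     ('JOHN',        6.589, 'MALE_FIRST'),
     ('MARY',        2.629, 'FEMALE_FIRST'),
     ('PATRICIA',    3.702, 'FEMALE_FIRST')
),
randoms (random_sex, random_first, random_last) AS (
   -- Fixed values in place of RANDOM() so the output is deterministic.
   VALUES
     ('male'::text, 4.2, 1.5),
     ('female',     0.7, 2.0)
)
SELECT r.random_sex,
       -- Smallest cumfreq above the random number, within the matching type.
       (SELECT n.name
        FROM names n
        WHERE n.name_type = upper(r.random_sex) || '_FIRST'
          AND n.cumfreq > r.random_first
        ORDER BY n.cumfreq
        LIMIT 1) AS first_name,
       (SELECT n.name
        FROM names n
        WHERE n.name_type = 'LAST'
          AND n.cumfreq > r.random_last
        ORDER BY n.cumfreq
        LIMIT 1) AS last_name
FROM randoms r;
```

With these inputs the male row resolves to JOHN JOHNSON and the female row to MARY WILLIAMS. Correlated scalar subqueries are used rather than LATERAL because LATERAL is not available in PostgreSQL 9.1.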

Use correlated subqueries:

SELECT
  *
FROM
  yourRandomTable
INNER JOIN
  census.names         AS first_name
    -- PostgreSQL concatenates with ||, and name_type values are upper case
    ON  first_name.cumfreq   = (SELECT MIN(cumfreq)
                                FROM   census.names
                                WHERE  cumfreq   > yourRandomTable.random_first
                                  AND  name_type = UPPER(yourRandomTable.random_sex) || '_FIRST')
    AND first_name.name_type = UPPER(yourRandomTable.random_sex) || '_FIRST'
INNER JOIN
  census.names         AS last_name
    ON  last_name.cumfreq    = (SELECT MIN(cumfreq)
                                FROM   census.names
                                WHERE  cumfreq   > yourRandomTable.random_last
                                  AND  name_type = 'LAST')
    AND last_name.name_type  = 'LAST'
You can vary this pattern a lot. Which variant to choose depends on how your indexes are set up.
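
As one possible setup, a composite B-tree index on the type column followed by the cumulative frequency may let each lookup (equality on name_type, range on cumfreq, smallest match first) be answered by a short index scan. The index name below is hypothetical, and whether the planner uses the index depends on the table size and statistics, so it is worth checking with EXPLAIN ANALYZE.

```sql
-- Hypothetical index name; the column order (equality column first,
-- then the range/sort column) is an assumption about the typical lookup.
CREATE INDEX names_type_cumfreq_idx
    ON census.names (name_type, cumfreq);
```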

EXPLAIN ANALYZE SELECT
  r.sex
  , COALESCE(
    (SELECT name FROM census.names AS mf WHERE r.sex = 'male' AND mf.name_type = 'MALE_FIRST' AND mf.cumfreq > r.first ORDER BY cumfreq LIMIT 1)
    , (SELECT name FROM census.names AS ff WHERE r.sex = 'female' AND ff.name_type = 'FEMALE_FIRST' AND ff.cumfreq > r.first ORDER BY cumfreq LIMIT 1)
  ) AS first
  , (SELECT name FROM census.names AS l WHERE l.name_type = 'LAST' AND l.cumfreq > r.last ORDER BY cumfreq LIMIT 1) AS last
FROM (
  SELECT
    RANDOM() * 90.020 AS first
    , RANDOM() * 90.483 AS last
    , CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS sex
  FROM generate_series(1,10,1)
) AS r;
The query above is what I ended up with.
