Optimizing a MySQL GROUP BY with a large link table
I have read a lot about this, but every query still takes more than 30 seconds, and I am convinced it could run much faster. The problem is:
A large link table (40 million rows, about 650 MB of data and 1.8 GB of indexes), defined as follows:
CREATE TABLE IF NOT EXISTS `glossary_entry_wordList_1` (
`idTerm` mediumint(8) unsigned NOT NULL,
`idKeyword` mediumint(8) unsigned NOT NULL,
`termLength` smallint(6) NOT NULL,
`termNumberWords` tinyint(4) NOT NULL,
`termTransliteralRFC` mediumint(9) NOT NULL,
`keywordLength` tinyint(3) unsigned NOT NULL,
`termLanguage` tinyint(4) NOT NULL,
PRIMARY KEY (`idKeyword`,`idTerm`),
KEY `termTransliteralRFC` (`termTransliteralRFC`),
KEY `termLength` (`termLength`),
KEY `secondPrimary` (`idTerm`,`idKeyword`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;
And a small temporary table, defined as follows:
CREATE TEMPORARY TABLE IF NOT EXISTS `foundIDs` (
`searchId` int(11) NOT NULL,
`searchedKeywordId` int(11) NOT NULL,
`similarKeywordId` mediumint(8) unsigned NOT NULL,
`partsMatched` tinyint(4) NOT NULL,
`sumSimliarParts` int(11) NOT NULL,
`keywordLength` int(11) NOT NULL,
`fuzzyMark` float NOT NULL,
`keywordDjb2` bigint(20) NOT NULL,
`smallKeyword` tinyint(4) NOT NULL,
PRIMARY KEY (`similarKeywordId`),
KEY `searchId` (`searchId`),
KEY `searchedKeywordId` (`searchedKeywordId`),
KEY `partsMatched` (`partsMatched`),
KEY `keywordLength` (`keywordLength`),
KEY `smallKeyword` (`smallKeyword`),
KEY `keywordDjb2` (`keywordDjb2`)
) ENGINE=MEMORY DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
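I need to retrieve from glossary_entry_wordList_1 all idTerm values that are associated with at least 50% of the idKeyword values in table foundIDs.
In effect, I need to find all the sentences that contain x words.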
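To make the "at least 50%" requirement concrete, here is a minimal sketch of one way to express it with a join, GROUP BY and HAVING. It is not the query from the question. It runs on an in-memory SQLite database as a stand-in for MySQL, keeps only the two link-table columns the requirement needs, assumes that foundIDs.similarKeywordId is the column matched against idKeyword, and uses invented sample rows. The helper name terms_matching_half is hypothetical.

```python
import sqlite3

# SQLite in-memory stand-in for the MySQL tables (assumption: the real
# tables are MyISAM/MEMORY; only the columns needed here are kept).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE glossary_entry_wordList_1 (
        idTerm    INTEGER NOT NULL,
        idKeyword INTEGER NOT NULL,
        PRIMARY KEY (idKeyword, idTerm)
    );
    CREATE TABLE foundIDs (
        similarKeywordId INTEGER PRIMARY KEY
    );
""")

# Invented data: four searched keywords.
conn.executemany("INSERT INTO foundIDs VALUES (?)", [(1,), (2,), (3,), (4,)])

# Invented (idTerm, idKeyword) associations: term 10 has 3 of the 4
# keywords, term 20 has 2, term 30 has 1 (plus an unrelated keyword 9).
links = [(10, 1), (10, 2), (10, 3),
         (20, 1), (20, 4),
         (30, 2), (30, 9)]
conn.executemany(
    "INSERT INTO glossary_entry_wordList_1 (idTerm, idKeyword) VALUES (?, ?)",
    links,
)

def terms_matching_half(connection):
    """Return (idTerm, matched) for terms linked to >= 50% of foundIDs.

    Assumption: similarKeywordId is the join column. Both primary keys
    are unique, so COUNT(*) per idTerm is the number of distinct
    searched keywords that the term contains.
    """
    return connection.execute("""
        SELECT wl.idTerm, COUNT(*) AS matched
        FROM foundIDs AS f
        JOIN glossary_entry_wordList_1 AS wl
          ON wl.idKeyword = f.similarKeywordId
        GROUP BY wl.idTerm
        HAVING COUNT(*) * 2 >= (SELECT COUNT(*) FROM foundIDs)
        ORDER BY wl.idTerm
    """).fetchall()

result = terms_matching_half(conn)
print(result)  # [(10, 3), (20, 2)]
```

With four searched keywords the threshold is two matches, so terms 10 and 20 qualify and term 30 does not. Writing the threshold as COUNT(*) * 2 >= total avoids integer-division surprises. How MySQL would execute an equivalent statement on the real 40-million-row table is a separate matter that this sketch does not address.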
For this I use a query like the following (note that the condition values here are only examples):
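The engine behaves as follows:
- The shorter the words (1-2 letters), the longer the query takes (obviously, since they have more associations).
- The more words there are in the search table (foundIDs), the longer the query takes.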
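Any ideas on how to improve the query response time?
Thanks.

Can you confirm that the EXPLAIN output matches the query and table specifications, as well as the problem, that you describe in your question? According to it, MySQL only has to examine 8x146 rows. Or am I having trouble interpreting what I see?

I believe you should try a different approach for your project; maybe MongoDB would help you solve the problem.

Sylvain, that is indeed the result in this case. I forgot to mention that foundIDs has only 4 rows for this explicit case (search).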