How to calculate tf-idf in PHP?


I have a problem: I can't calculate tf-idf with my actual code.

This is an example of tf-idf:

$tfidf = $term_frequency *  // tf
        log( $total_document_count / $documents_with_term, 2); // idf
I have the total document count, but I need $documents_with_term and $term_frequency.

This is my actual code:

$frase = htmlspecialchars($_GET['frase'], ENT_NOQUOTES);

$sssql = $server_link->query("SELECT uDR.webTitulo, uDR.webDescripcion, uDR.webkeywords, uDR.weburl, SUM(uDR.priority) as SPriority
FROM (

(SELECT s1.webTitulo, s1.webDescripcion, s1.weburl, s1.webkeywords, $a as priority FROM webs s1 WHERE MATCH (webTitulo) AGAINST ('$frase'))

UNION

(SELECT s2.webTitulo, s2.webDescripcion, s2.weburl, s2.webkeywords, $b as priority FROM webs s2 WHERE MATCH (webkeywords) AGAINST ('$frase*' IN BOOLEAN MODE))

UNION

(SELECT s3.webTitulo, s3.webDescripcion, s3.weburl, s3.webkeywords, $c as priority FROM webs s3 WHERE MATCH (webDescripcion) AGAINST ('$frase'))) uDR

GROUP BY uDR.webTitulo, uDR.weburl, uDR.webDescripcion, uDR.webkeywords

ORDER BY SPriority DESC ");

$totalRows = $sssql->num_rows; //This is the $total_document_count
I have $total_document_count, but I don't know how to extract the TF and $documents_with_term.

How can I extract them?

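  • In your example, $totalRows is the number of documents that contain the term, i.e. $documents_with_term, not $total_document_count.
  • The total document count is the result of SELECT COUNT(*) FROM webs.
  • Counting the term frequency in SQL is a bit harder. See the following answer: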
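
Independently of any linked answer, the three points above can be sketched in plain PHP. This is only a minimal illustration under stated assumptions, not the asker's final code: termFrequency() and tfIdf() are hypothetical helper names, the term frequency is counted in PHP on the fetched text rather than in SQL, and the two document counts are hard-coded here where they would normally come from SELECT COUNT(*) FROM webs and from $sssql->num_rows.

```php
<?php
// Hypothetical helper: how many times $term occurs as a whole word in $text.
// Assumption: case-insensitive matching for ASCII letters only (strtolower).
function termFrequency(string $text, string $term): int
{
    // Split on every run of characters that is neither a letter nor a digit.
    $words = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $counts = array_count_values($words);
    return $counts[strtolower($term)] ?? 0;
}

// Hypothetical helper: tf * log2(N / df), the same formula as in the question.
function tfIdf(int $termFrequency, int $totalDocumentCount, int $documentsWithTerm): float
{
    if ($documentsWithTerm === 0) {
        return 0.0; // The term occurs nowhere, so avoid dividing by zero.
    }
    return $termFrequency * log($totalDocumentCount / $documentsWithTerm, 2);
}

// Illustrative values only (not taken from the question's database):
$total_document_count = 8; // would be the result of SELECT COUNT(*) FROM webs
$documents_with_term  = 2; // would be $sssql->num_rows for the MATCH query
$description          = 'PHP search engine: a search script written in PHP';

$term_frequency = termFrequency($description, 'search'); // "search" appears twice
$tfidf = tfIdf($term_frequency, $total_document_count, $documents_with_term);
// Here: 2 * log2(8 / 2) = 2 * 2 = 4
```

In a real page, termFrequency() would be called once per fetched row, for example on the concatenation of webTitulo, webkeywords and webDescripcion, so each result gets its own tf-idf score.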