Python search engine: ranking output with a weighting mechanism


python, tensorflow, elasticsearch, word-embedding, sentence-similarity


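I am trying to build a semantic search FAQ system using Elastic 7.7.0 and Universal Sentence Encoder (USE4) embeddings. So far I have indexed a set of questions and answers, and I am able to search them. Whenever there is an input, I run 2 searches:

1. A keyword search on the indexed data with Elasticsearch

2. A semantic search using the USE4 embeddings

Now I want to combine the two to give a more robust output, because the results of the individual algorithms sometimes differ. Are there any good recommendations on how to combine them? I am thinking of a weighting mechanism that gives more weight to the semantic search, and/or a way to match the two result lists against each other. The problem is how to get the best of both. Please advise.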

import time
import sys
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk
import csv
import tensorflow as tf
import tensorflow_hub as hub

def connect2ES():
    # Connect to ES on localhost, port 9200; exit if it is unreachable.
    es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
    if es.ping():
        print('Connected to ES!')
    else:
        print('Could not connect!')
        sys.exit()

    print("*********************************************************************************")
    return es

def keywordSearch(es, q):
    # Search by keywords: a plain match query on the 'title' field.
    b = {
        'query': {
            'match': {
                "title": q
            }
        }
    }

    res = es.search(index='questions-index_quora2', body=b)
    print("Keyword Search:\n")
    for hit in res['hits']['hits']:
        print(str(hit['_score']) + "\t" + hit['_source']['title'])

    print("*********************************************************************************")


# Search by vector similarity
def sentenceSimilaritybyNN(embed, es, sent):
    # Embed the query sentence and convert the result to a plain list of floats.
    query_vector = embed([sent]).numpy()[0].tolist()
    b = {
        "query": {
            "script_score": {
                # Score every document in the index.
                "query": {
                    "match_all": {}
                },
                "script": {
                    # Add 1.0 so the score is never negative.
                    "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
                    "params": {"query_vector": query_vector}
                }
            }
        }
    }

    res = es.search(index='questions-index_quora2', body=b)

    print("Semantic Similarity Search:\n")
    for hit in res['hits']['hits']:
        print(str(hit['_score']) + "\t" + hit['_source']['title'])

    print("*********************************************************************************")



if __name__ == "__main__":

    es = connect2ES()
    embed = hub.load("./data/USE4/")  # this is where my USE4 model is saved

    while True:
        query = input("Enter a Query:")
        if query == "END":
            break

        # Time both searches for this query.
        start = time.time()

        print("Query: " + query)
        keywordSearch(es, query)
        sentenceSimilaritybyNN(embed, es, query)

        end = time.time()
        print(end - start)

My output looks like this:

Enter a Query:what can i watch this weekend
Query: what can i watch this weekend
Keyword Search:

9.6698  Where can I watch gonulcelen with english subtitles?
7.114256    What are some good movies to watch?
6.3105774   What kind of animal did this?
6.2754908   What are some must watch TV shows before you die?
6.0294256   What is the painting on this image?
6.0294256   What the meaning of this all life?
6.0294256   What are your comments on this picture?
5.9638205   Which is better GTA5 or Watch Dogs?
5.9269657   Can somebody explain to me how to do this problem with steps?
*********************************************************************************
Semantic Similarity Search:

1.6078881   What are some good movies to watch?
1.5065247   What are some must watch TV shows before you die?
1.502714    What are some movies that everyone needs to watch at least once in life?
1.4787409   Where can I watch gonulcelen with english subtitles?
1.4713362   What are the best things to do on Halloween?
1.4669418   Which are the best movies of 2016?
1.4554278   What are some interesting things to do when bored?
1.4307204   How can I improve my skills?
1.4261798   What are the best films that take place in one room?
1.4175651   What are the best things to learn in life?
*********************************************************************************
0.05920886993408203
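
The two lists above are on different scales: the keyword scores run from roughly 5.9 to 9.7, while the semantic scores are cosine similarity plus 1.0 and so stay between 0 and 2. A plain weighted sum of the raw values would let the keyword score dominate. Below is a minimal sketch of one possible client-side merge, assuming each search returns a mapping from a document key to its score. The names `min_max_normalize`, `fuse_scores`, and `semantic_weight` are hypothetical, and min-max normalization is only one of several reasonable choices. The sample scores are taken from the output above.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map each one to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_scores(keyword: Dict[str, float],
                semantic: Dict[str, float],
                semantic_weight: float = 0.7) -> List[Tuple[str, float]]:
    """Weighted sum of the normalized scores, sorted from best to worst.

    A document missing from one list contributes 0.0 for that list.
    semantic_weight is an assumed tuning knob, not a recommended value.
    """
    kw = min_max_normalize(keyword)
    sem = min_max_normalize(semantic)
    fused = {
        doc: (1.0 - semantic_weight) * kw.get(doc, 0.0)
        + semantic_weight * sem.get(doc, 0.0)
        for doc in set(kw) | set(sem)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Scores copied from the two result lists above (top three of each).
    keyword_hits = {
        "Where can I watch gonulcelen with english subtitles?": 9.6698,
        "What are some good movies to watch?": 7.114256,
        "What kind of animal did this?": 6.3105774,
    }
    semantic_hits = {
        "What are some good movies to watch?": 1.6078881,
        "What are some must watch TV shows before you die?": 1.5065247,
        "Where can I watch gonulcelen with english subtitles?": 1.4787409,
    }
    for title, score in fuse_scores(keyword_hits, semantic_hits):
        print(f"{score:.4f}\t{title}")
```

In a real index the document `_id` would be a safer key than the title, since titles are not guaranteed to be unique.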

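I would like a single output based on both of these, one that gives more accurate results and ranks them accordingly. Please suggest an approach, or point me to some good practices I can refer to. Thanks in advance.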
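
One possible way to get a combined ranking from a single request is to let `script_score` wrap the keyword query and add a weighted cosine term to its `_score`. The sketch below only builds the request body. It reuses the `title` and `title_vector` fields from the code above, while the function name `build_hybrid_query` and the two weights are hypothetical and would need tuning. The `match` clause sits in `should` next to a `match_all` filter, on the assumption that documents with no keyword overlap should still be scored on similarity alone.

```python
from typing import Any, Dict, List


def build_hybrid_query(text: str,
                       query_vector: List[float],
                       keyword_weight: float = 0.3,
                       semantic_weight: float = 0.7) -> Dict[str, Any]:
    """Build a script_score body that mixes the keyword score with cosine similarity.

    keyword_weight and semantic_weight are assumed values for illustration.
    """
    return {
        "query": {
            "script_score": {
                "query": {
                    "bool": {
                        # The filter keeps every document as a candidate.
                        "filter": [{"match_all": {}}],
                        # The optional match clause supplies _score (0 if no terms match).
                        "should": [{"match": {"title": text}}],
                    }
                },
                "script": {
                    # _score is the relevance score of the wrapped query.
                    "source": (
                        "params.kw * _score + params.sem * "
                        "(cosineSimilarity(params.query_vector, 'title_vector') + 1.0)"
                    ),
                    "params": {
                        "kw": keyword_weight,
                        "sem": semantic_weight,
                        "query_vector": query_vector,
                    },
                },
            }
        }
    }
```

The body could be passed to `es.search(index='questions-index_quora2', body=...)` in the same way as the two existing queries. The keyword score has no fixed upper bound while the cosine term stays between 0 and 2, so the weights here only balance the two terms roughly and would have to be tuned on real queries.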

This seems too broad/vague.