C#: Optimizing a value-based search algorithm with LINQ [c#, sql, algorithm, linq] - Fatal编程技术网


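I want to build a value-based search algorithm. That is, given a list of words, I want to search the database for entries using those words. However, depending on which column/property the words match, I want to change the value of the returned result.

Here is a lazy algorithm that does this, but it is very slow: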

// search only active entries
var query = (from a in db.Jobs where a.StatusId == 7 select a);
List<SearchResult> baseResult = new List<SearchResult>();
foreach (var item in search)
{
    // if the company title is matched, results are worth 5 points
    var companyMatches = (from a in query where a.Company.Name.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 5 });

    // if the title is matched, results are worth 3 points
    var titleMatches = (from a in query where a.Title.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 3 });

    // if text within the body is matched, results are worth 2 points
    var bodyMatches = (from a in query where a.FullDescription.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 2 });

    // all results are then added
    baseResult = baseResult.Concat(companyMatches.Concat(titleMatches).Concat(bodyMatches)).ToList();
}

// the value gained for each entry is then added and sorted by highest to lowest
List<SearchResult> result = baseResult.GroupBy(x => x.ID).Select(p => new SearchResult() { ID = p.First().ID, Value = p.Sum(i => i.Value) }).OrderByDescending(a => a.Value).ToList<SearchResult>();

// the query for the complete result set is built based on the sorted id value of result
query = (from id in result join jbs in db.Jobs on id.ID equals jbs.ID select jbs).AsQueryable();

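I'm looking for ways to optimize this. I'm new to LINQ queries, so I was hoping to get some help. It would be great if I could create a single LINQ query that does all of this in one go, rather than checking the company name, title and body text separately, combining them into a sorted list, and then running against the database again to get the full listing.

I should have studied the question first. My previous answer optimized the wrong thing. The main problem here is going over the result list multiple times. We can change that: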

foreach (var a in query)
{
    foreach (var item in search)
    {
        // lower-case the term once per iteration instead of once per comparison
        var itemLower = item.ToLower();
        if (a.Company.Name.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 5 });
        if (a.Title.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 3 });
        if (a.FullDescription.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 2 });
    }
}
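After that, you have your baseResult and can continue with the rest of the processing.

This reduces the work to a single query, rather than three queries for each search term.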

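I'm not sure whether you want unique items in baseResult, or whether there is some reason you allow duplicates and then order them by the sum of their values. If you want unique items, you could make baseResult a Dictionary, with the ID as the key.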
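The dictionary idea above could look roughly like the following. This is only a sketch: the class name `ScoreTable` and its two helper methods are made up for illustration and are not part of the original code. Each job ID maps to its running total, so no grouping step is needed afterwards.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not from the original thread): keeps one running
// score per job ID instead of a list that may contain duplicate IDs.
public static class ScoreTable
{
    // Adds 'value' to the score stored for 'id', starting from 0 if absent.
    public static void AddScore(Dictionary<int, int> scores, int id, int value)
    {
        scores.TryGetValue(id, out int current); // current stays 0 when the key is missing
        scores[id] = current + value;
    }

    // Returns the (ID, score) pairs ordered from highest to lowest score.
    public static List<KeyValuePair<int, int>> Ranked(Dictionary<int, int> scores)
    {
        return scores.OrderByDescending(kv => kv.Value).ToList();
    }
}
```

Inside the loops, each `baseResult.Add(new SearchResult { ID = a.ID, Value = 5 })` would then become something like `ScoreTable.AddScore(scores, a.ID, 5)`.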

Edit after comments: you can reduce the number of items in the list by doing the following:

// accumulate one score per (job, search term) pair
int val = 0;
if (a.Company.Name.ToLower().Contains(itemLower))
    val += 5;
if (a.Title.ToLower().Contains(itemLower))
    val += 3;
if (a.FullDescription.ToLower().Contains(itemLower))
    val += 2;
if (val > 0)
{
    baseResult.Add(new SearchResult { ID = a.ID, Value = val });
}

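This doesn't eliminate duplicates entirely, though, because the company name could match one search term and the title could match another. But it will shrink the list somewhat.

Thanks to Jim's answer and some work of my own, I managed to reduce the time it takes to complete the search by 80%.

Here is the final solution: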

// establish initial query
var nquery = (from a in db.Jobs where a.StatusId == 7 select a);

// Instead of running the search against all of the entities, I first take the
// ones that are possible candidates: jobs where every search term appears in
// at least one of their columns. This is the one and only query that will be
// run against the database.
if (search.Count > 0)
{
    nquery = nquery.Where(job => search.All(y => (job.Title.ToLower() + " " + job.FullDescription.ToLower() + " " + job.Company.Name.ToLower() + " " + job.NormalLocation.ToLower() + " " + job.MainCategory.Name.ToLower() + " " + job.JobType.Type.ToLower()).Contains(y)));
}

// run the query and grab a list of baseJobs
List<Job> baseJobs = nquery.ToList<Job>();

// a list of SearchResult objects (these objects act as containers for job ids and their search values)
List<SearchResult> baseResult = new List<SearchResult>();

// from here on Jim's algorithm comes into play: it assigns points depending on
// where the search term is found and adds them to the id/value pair list
foreach (var a in baseJobs)
{
    foreach (var item in search)
    {
        var itemLower = item.ToLower();

        if (a.Company.Name.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 5 });
        if (a.Title.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 3 });
        if (a.FullDescription.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 2 });
    }
}

List<SearchResult> result = baseResult.GroupBy(x => x.ID).Select(p => new SearchResult() { ID = p.First().ID, Value = p.Sum(i => i.Value) }).OrderByDescending(a => a.Value).ToList<SearchResult>();

// the data generated through the id/value pair list is then used to reorder the initial jobs
var NewQuery = (from id in result join jbs in baseJobs on id.ID equals jbs.ID select jbs).AsQueryable();
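Once the candidate jobs are in memory, the scoring, summing, and reordering steps can probably be folded into one LINQ-to-Objects pipeline. The sketch below assumes a simplified, flattened row type (`JobRow`, hypothetical; the real `Job` entity reaches the company name through `Company.Name`) and uses the same 5/3/2 weights as the code above. It is not meant to be translated to SQL.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened stand-in for the Job entity used in the thread.
public sealed class JobRow
{
    public int ID { get; set; }
    public string Company { get; set; } = "";
    public string Title { get; set; } = "";
    public string FullDescription { get; set; } = "";
}

public static class JobRanker
{
    // Scores every job against every search term in a single pass and
    // returns the matching jobs ordered from highest to lowest score.
    public static List<JobRow> Rank(IEnumerable<JobRow> jobs, IEnumerable<string> search)
    {
        // Lower-case the terms once, up front.
        var terms = search.Select(t => t.ToLower()).ToList();

        return jobs
            .Select(job => new
            {
                Job = job,
                // Sum the points over all terms: 5 for company, 3 for title, 2 for body.
                Score = terms.Sum(t =>
                    (job.Company.ToLower().Contains(t) ? 5 : 0) +
                    (job.Title.ToLower().Contains(t) ? 3 : 0) +
                    (job.FullDescription.ToLower().Contains(t) ? 2 : 0))
            })
            .Where(x => x.Score > 0)           // drop jobs that matched nothing
            .OrderByDescending(x => x.Score)   // best match first
            .Select(x => x.Job)
            .ToList();
    }
}
```

Because each job yields exactly one score, this avoids both the duplicate entries in the intermediate list and the final join back onto the job list.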

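I suspect your main time sink is making three different queries against the database. Is your data small enough that you could load the whole dataset into memory and then query it in memory?

3000 entries, with the full description section between 1000 and 5000 characters. It's not super heavy, but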