JavaScript: searching one million pairs in sublinear time
I have a JSON object that contains one million pairs:
var student = [
  {
    name: "govi",
    score: "65"
  },
  {
    name: "dharti",
    score: "80"
  },
  {
    name: "Akash",
    score: "75"
  }
  // ............. till 1 million
];
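My concern is the following.
I want to build a server program that accepts user queries so that, for each query s, it responds with the top 10 names (ranked by score) that start with s or contain "_s". For example, both revenue and Year_revenue match the prefix rev. With plain jQuery and JSON this is too easy, but there is one condition.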
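For reference, the matching rule can be sketched as a straightforward linear scan. This is only a baseline under assumptions: the helper names matchesQuery and top10Linear are hypothetical, and case-insensitive matching is assumed rather than stated in the question.

```javascript
// Hypothetical helper: true if `name` starts with `s` or contains "_" + s.
// Assumption: matching is case-insensitive.
function matchesQuery(name, s) {
  const n = name.toLowerCase();
  const q = s.toLowerCase();
  return n.startsWith(q) || n.includes("_" + q);
}

// Hypothetical baseline: scans every entry, so it is linear in the number of names.
function top10Linear(students, s) {
  return students
    .filter(st => matchesQuery(st.name, s))
    // Scores are strings in the sample data, so convert before comparing.
    .sort((a, b) => Number(b.score) - Number(a.score))
    .slice(0, 10);
}
```

Every query touches all entries here, which is exactly what the condition rules out.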
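Condition
Query answering should run in sublinear time with respect to the number of names in the input.
1. Add the Newtonsoft library from the NuGet repository.
2. Add the following references: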
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
3. Use this code:
// Also requires: using System; using System.Collections.Generic; using System.Linq;
// JObjectString is your string that contains the values
JArray ValuesArray = JArray.Parse(JObjectString);
Dictionary<string, int> SearchDict = new Dictionary<string, int>();
// Here the search term is the query from the user input
string searchTerm = "govi";
foreach (var rec in ValuesArray)
{
    // Note: Add throws if the same name appears twice
    SearchDict.Add(rec["name"].ToString(), Int32.Parse(rec["score"].ToString()));
}
// Here is the result in JavaScript array format; return it.
// Note: Where examines every entry, so this lookup is linear in the number of names.
string ResultString = JsonConvert.SerializeObject(
    SearchDict.Where(o => o.Key == searchTerm || o.Key.Contains(searchTerm))
        .OrderByDescending(o => o.Value)
        .Take(10));
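Sort the array by score in descending order. This ensures that the subsets created in the next step are already sorted, which is faster than sorting the (often overlapping) subsets individually.
Build a dictionary keyed by character. Go through every character of each name and add the item to that character's subset.
When searching for a substring like rev, you can look up the array for each of its letters, take the shortest one, and iterate over that array instead of the whole data set.
For example: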
// building the map
let map = Object.create(null);
data
  //.slice() // if you can't rearrange the items in the original Array
  .sort((a, b) => b.score - a.score)
  .forEach(item => {
    // a Set visits each distinct character of the name only once
    for (let char of new Set(item.name)) {
      let arr = map[char] || (map[char] = []);
      arr.push(item);
    }
  });

// returns the shortest per-character subset for the search string
function findSubset(str) {
  let best;
  for (let char of new Set(str)) {
    // there can be no entry for "rev" if there is no subset for "v", for example
    if (!(char in map)) return [];
    let arr = map[char];
    if (!best || arr.length < best.length)
      best = arr;
  }
  return best || [];
}

// scans the subset in score order and stops once `limit` matches are found
function contains(str, limit = Infinity) {
  let subset = findSubset(str);
  let results = [];
  for (let i = 0; i < subset.length && results.length < limit; ++i) {
    let item = subset[i];
    if (item.name.includes(str))
      results.push(item);
  }
  return results;
}
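Calling StudentIndex.GetTop10(string s) will be O(n), where n is the length of the string s. This will consume a lot of memory, but there are many ways to reduce that. For example, do a linear search on Node.Top10 when it holds fewer than 10 values. I leave writing the tree to you.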
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public string Name { get; set; }
    public double Score { get; set; }
}

public class Node
{
    public Dictionary<char, Node> Children { get; set; }
    public List<Student> Top10 { get; set; }
}

public class StudentIndex
{
    private Node _root;

    public StudentIndex(IEnumerable<Student> students)
    {
        Node root = new Node();
        foreach (var student in students)
        {
            var parts = student.Name.Split(new[] { '_' });
            foreach (var part in parts)
            {
                // you'll add each student to the tree using each part of the name
            }
        }
        // set _root
        _root = root;
    }

    public IEnumerable<Student> GetTop10(string s)
    {
        return GetTop10(s.ToLower(), _root);
    }

    // Walks one character of s per call, so the depth is bounded by s.Length
    private IEnumerable<Student> GetTop10(string s, Node node)
    {
        if (node.Children == null) return node.Top10;
        if (s.Length == 0) return node.Top10;
        var c = s[0];
        Node n;
        if (node.Children.TryGetValue(c, out n))
        {
            return GetTop10(s.Substring(1), n);
        }
        else
        {
            return Enumerable.Empty<Student>();
        }
    }
}
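The tree that this answer leaves unwritten could look roughly like the following JavaScript sketch. It is one possible reading of the idea, not the answerer's implementation: the names createNode, remember, buildIndex, getTop10 and LIMIT are hypothetical, names are lower-cased as in GetTop10 above, and the query is assumed not to contain an underscore itself.

```javascript
const LIMIT = 10; // hypothetical constant: how many results each node keeps

function createNode() {
  return { children: new Map(), top: [] };
}

// Students arrive in descending score order, so appending keeps `top` sorted.
// The last-element check avoids storing a student twice when several parts
// of the same name share a prefix (e.g. "rev_revenue").
function remember(node, student) {
  if (node.top.length < LIMIT && node.top[node.top.length - 1] !== student) {
    node.top.push(student);
  }
}

// Builds a prefix tree over every "_"-separated part of every name.
function buildIndex(students) {
  const root = createNode();
  const sorted = [...students].sort((a, b) => Number(b.score) - Number(a.score));
  for (const student of sorted) {
    for (const part of student.name.toLowerCase().split("_")) {
      let node = root;
      remember(node, student);
      for (const ch of part) {
        let child = node.children.get(ch);
        if (!child) {
          child = createNode();
          node.children.set(ch, child);
        }
        node = child;
        remember(node, student);
      }
    }
  }
  return root;
}

// Follows one child per query character; cost depends on query length only.
function getTop10(root, query) {
  let node = root;
  for (const ch of query.toLowerCase()) {
    node = node.children.get(ch);
    if (!node) return [];
  }
  return node.top;
}
```

With an index built once via buildIndex(student), a call such as getTop10(index, "rev") only walks three nodes, independent of how many names were indexed. The price is memory: every prefix of every name part gets its own node holding up to ten references.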
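Search by name or by score? You need to index the array by the most likely search pattern.
How much memory are you willing to use?
@gurvinder372 Sort by score, search by name.
@MineR Memory is not a problem; I just want to implement this with a sublinear-time algorithm.
Why the downvote? It solved my problem.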