C#: converting a query to a list is very slow

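I am querying a single view to retrieve a list of indicators. All of an indicator's properties can be retrieved from that one view.

The code is as follows: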

data = new DataContext();

var core = from item in data.Items
           where countryIDs.Contains(item.CountryId)
           && indicatorIDs.Contains(item.IndicatorId)
           orderby item.Indicator
           select item;

var x = from item in core.Distinct()
        group item by new { item.IndicatorId, item.Indicator }
            into indicator
            select new
            {
                IndicatorID = indicator.Key.IndicatorId,
                IndicatorDescription = indicator.Key.Indicator,
                Genders = from g in core
                          where g.Gender != null
                          && g.IndicatorId == indicator.Key.IndicatorId
                          select new Gender
                          {
                              GenderID = g.GenderId,
                              GenderDescription = g.Gender
                          },
                HasGender = (from g in core
                             where g.Gender != null
                             && g.IndicatorId == indicator.Key.IndicatorId
                             select g.GenderId).Count() > 0,
                AreaTypes = from rat in core
                                     where rat.IndicatorId == indicator.Key.IndicatorId
                                     && rat.AreaType != null
                                     select new AreaType
                                     {
                                         AreaTypeId = rat.AreaTypeId,
                                         AreaDescription = rat.AreaType
                                     },
                HasAreaType = (from rat in core
                                        where rat.IndicatorId == indicator.Key.IndicatorId
                                        && rat.AreaType != null
                                        select rat.AreaTypeId).Count() > 0,
                Sectors = from s in core
                          where s.IndicatorId == indicator.Key.IndicatorId
                          && s.Sector != null
                          select new Sector
                          {
                              SectorID = s.SectorId,
                              Title = s.Sector
                          },
                HasSector = (from s in core
                             where s.IndicatorId == indicator.Key.IndicatorId
                             && s.Sector != null
                             select s.SectorId).Count() > 0
            };

List<Indicator> indicators = new List<Indicator>();
Indicator i = new Indicator();
foreach (var item in x)
{
    i = new Indicator()
    {
        IndicatorID = item.IndicatorID,
        IndicatorDescription = item.IndicatorDescription,
        Genders = item.Genders.ToList(),
        AreaTypes = item.AreaTypes.ToList(),
        Sectors = item.Sectors.ToList(),
        HasGender = item.HasGender,
        HasAreaType = item.HasAreaType,
        HasSector = item.HasSector
    };
    indicators.Add(i);
}
return indicators;
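The slowdown happens when x is enumerated, that is, when execution reaches the foreach loop.
Is there a way to make this query convert to a list faster? Thanks.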

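For starters, change every Count() > 0 to the Any() method; Count() forces a full scan of the table being queried.

If that does not give you the performance gain you want, try rewriting your query. I think performance would be better if you first projected your data into an anonymous type and then grouped by that anonymous type.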
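
To illustrate the difference between Count() > 0 and Any(), here is a minimal sketch using LINQ to Objects. The Source helper and the pulled counter are hypothetical names introduced only for this example. Against a database provider such as the asker's DataContext, the real cost depends on the SQL that gets generated, so treat this purely as an in-memory illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCount
{
    // Counts how many elements the lazy source actually had to produce.
    private static int pulled;

    // Hypothetical lazy source: yields 0..n-1 and records every element produced.
    private static IEnumerable<int> Source(int n)
    {
        for (var i = 0; i < n; i++)
        {
            pulled++;
            yield return i;
        }
    }

    // Count() must walk the whole filtered sequence before it can be compared to zero.
    public static int PulledByCount(int n)
    {
        pulled = 0;
        _ = Source(n).Where(v => v >= 0).Count() > 0;
        return pulled;
    }

    // Any() stops as soon as the first matching element is found.
    public static int PulledByAny(int n)
    {
        pulled = 0;
        _ = Source(n).Where(v => v >= 0).Any();
        return pulled;
    }

    public static void Main()
    {
        Console.WriteLine(PulledByCount(1000)); // 1000
        Console.WriteLine(PulledByAny(1000));   // 1
    }
}
```

Both expressions answer the same yes/no question; only the amount of work differs.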

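Your query has some where clauses, for example where s.IndicatorId == indicator.Key.IndicatorId

Try using join syntax here instead, which should make it faster, i.e. core joined to indicator in your case. Something like this:

Your version

from g in core
where g.Gender != null && g.IndicatorId == indicator.Key.IndicatorId
would become something like

from g in core join indi in indicator
on g.IndicatorId equals indi.Key.IndicatorId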

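It looks like you are doing a whole lot of unnecessary nested queries.

Your core query does some relatively expensive filtering and sorting before returning items. It is best to execute this query only once.

You are, however, performing six unnecessary joins back onto this query.

Your Genders query, for example, re-queries core and keeps only the items that have the same IndicatorId you have already grouped by. If I can assume that item.Indicator is one-to-one with item.IndicatorId, then your indicator group already contains this subset.

You are querying for AreaTypes and Sectors in the same way.

Each of HasGender, HasAreaType and HasSector then repeats the queries above and forces a .Count() on it just to check whether the value is greater than zero. This is wasteful, because .Any() checks for at least one value far more cheaply.

Now, to test just how many times the core query is being accessed, I created this test code: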

var countryIDs = Enumerable.Range(0, 100).ToArray();
var indicatorIDs = Enumerable.Range(0, 100).ToArray();

data.Items.AddRange(
    Enumerable
        .Range(0, 100)
        .Select(n =>
            new Item()
            {
                CountryId = n,
                IndicatorId = n,
                Indicator = "Indicator",
                GenderId = n,
                Gender = "Gender",
                AreaTypeId = n,
                AreaType = "Area",
                SectorId = n,
                Sector = "Sector",
            }));
I modified core to look like this:

var counter = 0;
var core =
    (from item in data.Items
    where countryIDs.Contains(item.CountryId)
        && indicatorIDs.Contains(item.IndicatorId)
    orderby item.Indicator
    select item).Do(_ => counter++);
The Do operator is from the Reactive Extensions System.Interactive assembly.

Running your code, I got the following result:

counter == 60100
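Since I put 100 items in the collection, this tells me that your query is invoking a new execution of core 601 times: once for core.Distinct(), plus six nested queries for each of the 100 groups.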
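
The same pattern can be reproduced in miniature without System.Interactive. The sketch below is only an illustration under assumptions: a Select with a side effect stands in for Do, plain integers stand in for the view's rows, and there are two nested queries per group instead of six. With n items, all with distinct keys, the source is read n × (1 + 2n) times.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReEnumerationDemo
{
    // Returns how many elements were read from the lazy source in total.
    public static int CountSourceReads(int itemCount)
    {
        var reads = 0;

        // Lazy query: every enumeration of core re-runs this Select from scratch.
        IEnumerable<int> core = Enumerable.Range(0, itemCount)
            .Select(id => { reads++; return id; });

        var x = from id in core.Distinct()          // 1 enumeration of core
                group id by id into g
                select new
                {
                    Id = g.Key,
                    // Still lazy here; core is enumerated again when ToList() runs below.
                    Matches = from other in core where other == g.Key select other,
                    // Count() enumerates core immediately, once per group.
                    HasMatch = (from other in core where other == g.Key select other).Count() > 0
                };

        foreach (var item in x)
        {
            _ = item.Matches.ToList();              // one more enumeration per group
        }

        return reads;
    }

    public static void Main()
    {
        Console.WriteLine(CountSourceReads(100));   // 100 * (1 + 2 * 100) = 20100
    }
}
```

With six nested queries instead of two, the same arithmetic gives 100 × (1 + 6 × 100) = 60100, which matches the counter value reported above.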

This can easily be changed so that core executes only once.

First, I modified core to look like this:

var query =
    from item in data.Items
    where countryIDs.Contains(item.CountryId)
        && indicatorIDs.Contains(item.IndicatorId)
    orderby item.Indicator
    select item;

var core = query.ToArray();
.ToArray() brings the results of the query into memory.

The x query was then modified to look like this:

var x =
    from item in core.Distinct()
    group item by new
    {
        item.IndicatorId,
        item.Indicator
    } into indicator
    let Genders = (
        from g in indicator
        where g.Gender != null
        select new Gender
        {
            GenderID = g.GenderId,
            GenderDescription = g.Gender,
        }).ToList()
    let AreaTypes = (
        from rat in indicator
        where rat.AreaType != null
        select new AreaType
        {
            AreaTypeId = rat.AreaTypeId,
            AreaDescription = rat.AreaType,
        }).ToList()
    let Sectors = (
        from s in indicator
        where s.Sector != null
        select new Sector
        {
            SectorID = s.SectorId,
            Title = s.Sector,
        }).ToList()
    select new Indicator()
    {
        IndicatorID = indicator.Key.IndicatorId,
        IndicatorDescription = indicator.Key.Indicator,
        Genders = Genders,
        AreaTypes = AreaTypes,
        Sectors = Sectors,
        HasGender = Genders.Any(),
        HasAreaType = AreaTypes.Any(),
        HasSector = Sectors.Any(),
    };
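Notice that I compute Genders, AreaTypes and Sectors only once each, and create each as a list. This allows me to change x to produce instances of Indicator directly.

Now the final creation of the indicators list is simple: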

var indicators = x.ToList();
When I used my sample data with this approach, my result was:

counter == 100
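This means that this query only hit the original core query once.

I then checked how the nesting behaves when the original sample data is increased to 1,000 items: I got a single hit with the new code and 6,001 hits with the original code, and it got much slower.

Remember that LINQ is lazily evaluated, so execution does not occur where you define your query, but where you enumerate it.

So the advice here is: memory permitting, execute your query as soon as possible to bring the data into memory, and then do your calculations once and only once.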
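
As a compact illustration of that advice, the sketch below materializes a lazy source once with ToArray() and then does all per-group work on the in-memory array. The tuple rows, the Run method and the n % 10 key are assumptions made for this example and are not the asker's schema; a Select with a side effect again stands in for the Do operator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeOnceDemo
{
    // Returns how many elements were read from the source and how many groups were built.
    public static (int Reads, int Groups) Run(int itemCount)
    {
        var reads = 0;

        // Lazy source that records every element it produces.
        IEnumerable<(int IndicatorId, int GenderId)> source = Enumerable.Range(0, itemCount)
            .Select(n => { reads++; return (IndicatorId: n % 10, GenderId: n); });

        // The only enumeration of the source: bring everything into memory.
        var core = source.ToArray();

        var result = core
            .GroupBy(row => row.IndicatorId)
            .Select(g =>
            {
                // Computed once per group, from the group itself rather than from core.
                var genders = g.Select(row => row.GenderId).ToList();
                return new { IndicatorId = g.Key, Genders = genders, HasGender = genders.Any() };
            })
            .ToList();

        return (reads, result.Count);
    }

    public static void Main()
    {
        var (reads, groups) = Run(100);
        Console.WriteLine($"{reads} reads, {groups} groups"); // 100 reads, 10 groups
    }
}
```

The trade-off is memory: the whole filtered result set is held in the array, so this only pays off when that set fits comfortably in memory.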

Hey mare. I tried using join instead of where, but it was slower: the processing time was 6750 ms with join versus 1765.6476 ms with where.

How big is the dataset you are running the LINQ query against? What does your join LINQ look like?

I'm in awe! This may be the most professional answer I've ever read. +1. I'm still in awe of your answer.