Reading data from SQL Server using C# takes a lot of time

c# sql sql-server-2008 sql-server-2012

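I have a table with 4 million rows, about 300 GB in size, and I want to read every row of that table from a SQL Server database. I used the following C# code. It takes a long time. Please suggest some improvements.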

            List<int> hit_block_index = new List<int>();

            /* Here I do some other processing and populate hit_block_index with integers */

            string _query = "SELECT TraceName,BlockVector FROM Trace";

            SqlConnection _connection = new SqlConnection(connection_string);
            _connection.Open();

            SqlCommand _command = new SqlCommand(_query, _connection);
            SqlDataReader data_reader = _command.ExecuteReader();

            Byte[] block_vector = null;
            string trace_name = null;
            BitArray trace = null;

            // Best trace seen so far
            string most_covered_trace = null;
            BitArray most_covered_array = null;
            int max_coverage = 0;

            while (data_reader.Read())
            {
                int coverage = 0;

                block_vector = (byte[])data_reader["BlockVector"];
                trace_name = (string)data_reader["TraceName"];
                trace = new BitArray(block_vector);

                // Count how many of the requested block indexes are set in this trace
                foreach (int x in hit_block_index)
                {
                    if (trace[x])
                    {
                        coverage++;
                    }
                }

                Console.WriteLine("hit count is:" + coverage);

                if (coverage > max_coverage)
                {
                    most_covered_trace = trace_name;
                    most_covered_array = trace;
                    max_coverage = coverage;
                }
            }

Something like this might work. I'm not sure about the efficiency yet; it may depend on how many hits you are looking for:

create type HitBlocks as table (
    HitIndex int not null
)
go
create procedure FindMaxCover
    @Hits HitBlocks readonly
as
    ;With DecomposedBlocks as (
        select (HitIndex/8)+1 as ByteIndex,POWER(2,(HitIndex%8)) as BitMask
        from @Hits
    ), Coverage as (
        select
            t.TraceName,SUM(CASE WHEN SUBSTRING(t.BlockVector,db.ByteIndex,1) & BitMask != 0 THEN 1 ELSE 0 END) as Coverage
        from
            Trace t
                cross join
            DecomposedBlocks db
        group by
            t.TraceName
    ), Ranked as (
        select *,RANK() OVER (ORDER BY Coverage desc) as rk
        from Coverage
    )
    select
        t.TraceName,
        t.BlockVector,
        r.Coverage
    from
        Ranked r
            inner join
        Trace t
            on
                r.TraceName = t.TraceName
    where rk = 1

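Currently, this returns multiple rows if several results share the same coverage level. You may also need to adjust for (a) some off-by-one errors between my expectations and yours, and (b) possible endianness issues in computing the correct bitmask value.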
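One way to check the off-by-one and bit-order concerns is to reproduce the SQL arithmetic in C# and compare it with what BitArray reports. The sketch below is only an illustration: the class name, method names and sample bytes are made up. It relies on the documented behavior of the BitArray(byte[]) constructor, where index i maps to bit (i % 8) of byte (i / 8), least significant bit first.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BitLayoutCheck
{
    // Counts set bits using the same arithmetic as the SQL above:
    // byte position = index / 8 (SUBSTRING is 1-based, hence the +1 in the SQL),
    // mask = 2^(index % 8), i.e. least significant bit first.
    public static int CountHits(byte[] blockVector, IEnumerable<int> hitIndexes)
    {
        int coverage = 0;
        foreach (int x in hitIndexes)
        {
            int byteIndex = x / 8;
            int bitMask = 1 << (x % 8);
            if ((blockVector[byteIndex] & bitMask) != 0)
            {
                coverage++;
            }
        }
        return coverage;
    }

    // Same count via BitArray, as in the question's loop.
    public static int CountHitsWithBitArray(byte[] blockVector, IEnumerable<int> hitIndexes)
    {
        BitArray trace = new BitArray(blockVector);
        int coverage = 0;
        foreach (int x in hitIndexes)
        {
            if (trace[x])
            {
                coverage++;
            }
        }
        return coverage;
    }

    public static void Main()
    {
        // Made-up sample data: bits 0, 7 and 9 are set.
        byte[] sample = { 0x81, 0x02 };
        int[] hits = { 0, 1, 7, 9, 15 };

        Console.WriteLine(CountHits(sample, hits));             // prints 3
        Console.WriteLine(CountHitsWithBitArray(sample, hits)); // prints 3
    }
}
```

If the two counts agree on your real data, the byte index and mask used in the procedure match the way the C# code interprets BlockVector. Counting directly on the byte array also avoids allocating a BitArray per row.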

From your code, you can populate a DataTable with the values currently stored in hit_block_index and pass it as @Hits.
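A minimal sketch of that call, assuming the HitBlocks type and the FindMaxCover procedure were created in the dbo schema and that System.Data.SqlClient is available. The class and method names are hypothetical, and the zero timeout is an assumption for a scan that may run long.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class FindMaxCoverClient
{
    // Builds a DataTable whose single column matches the HitBlocks table type.
    public static DataTable BuildHitTable(IEnumerable<int> hitBlockIndex)
    {
        DataTable table = new DataTable();
        table.Columns.Add("HitIndex", typeof(int));
        foreach (int index in hitBlockIndex)
        {
            table.Rows.Add(index);
        }
        return table;
    }

    // Calls FindMaxCover and prints each returned trace name with its coverage.
    public static void PrintMaxCover(string connectionString, IEnumerable<int> hitBlockIndex)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand("FindMaxCover", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.CommandTimeout = 0; // assumption: no timeout, the scan may run long

            // Pass the DataTable as a table-valued parameter.
            SqlParameter hits = command.Parameters.AddWithValue("@Hits", BuildHitTable(hitBlockIndex));
            hits.SqlDbType = SqlDbType.Structured;
            hits.TypeName = "dbo.HitBlocks"; // assumption: type lives in the dbo schema

            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    string traceName = (string)reader["TraceName"];
                    int coverage = (int)reader["Coverage"];
                    Console.WriteLine(traceName + ": " + coverage);
                }
            }
        }
    }
}
```

Calling FindMaxCoverClient.PrintMaxCover(connection_string, hit_block_index) would then transfer only the hit indexes to the server and only the winning rows back, instead of the whole table.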

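If you really do need to read all the data, put the code into the database, via a stored procedure or whatever other means the engine allows. Transferring the entire database makes no sense.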

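Beyond that, you should really consider a different strategy. Example 1: you could create a trigger on insert. When values are inserted, you could recompute the coverage where possible, without reading all the data.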

Example 2: you could use SQL Azure Federations or Azure Worker Roles to scale out your problem.

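300 GB should take time; I would be worried if it didn't. Improvement 1: lower your expectations. Even on a gigabit link with no transfer overhead, 300 GB of raw data is still 300...

Oh, in that case, run a ProcessByteArray SQL function on the BlockVector column.

If you need help, you need to tell us specifically what you are doing with the data.

There isn't even a where condition to filter the data, so I think 3 hours is quite reasonable for your scenario.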
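To put rough numbers on the gigabit-link remark, here is a back-of-the-envelope calculation. It assumes decimal gigabytes and an ideal 1 Gbit/s link, and it ignores protocol overhead, disk speed and processing time; the class and method names are hypothetical.

```csharp
using System;

public static class TransferEstimate
{
    // Seconds needed to move the given number of bytes at the given bit rate,
    // ignoring protocol overhead, disk speed and processing time.
    public static double MinimumSeconds(double bytes, double bitsPerSecond)
    {
        return bytes * 8.0 / bitsPerSecond;
    }

    public static void Main()
    {
        double tableBytes = 300e9;       // about 300 GB, from the question
        double rows = 4e6;               // 4 million rows, from the question
        double linkBitsPerSecond = 1e9;  // assumption: ideal gigabit link

        double seconds = MinimumSeconds(tableBytes, linkBitsPerSecond);
        Console.WriteLine(seconds / 60.0);    // prints 40 (minutes)
        Console.WriteLine(tableBytes / rows); // prints 75000 (average bytes per row)
    }
}
```

So the transfer alone has a floor of about 40 minutes on such a link, and each row carries roughly 75 KB on average. That is why moving the computation to the server, rather than moving the data to the client, is the more promising direction.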