C#: Forcing a clean run in a long-running SQL reader loop?

c# sql

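I have a SQL data reader that reads 2 columns from a SQL database table. Once it has finished, it starts again, selecting another two columns.

I would pull the whole lot in one go, but that presents a whole other set of challenges.

My problem is that the table contains a large amount of data (around 3 million rows), which makes working with the entire set difficult.

I'm trying to validate the field values, so I pull the ID column plus one of the other columns, then run each value in that column through a validation pipeline. The results are stored in another database.

When the reader reaches the end of handling one column, I need to force it to immediately clean up every little block of memory it used, because this process uses about 700 MB of memory and there are about 200 columns to go through.

Without a full garbage collection I will definitely run out of memory.

Does anyone know how I can do this?

I'm using lots of small reusable objects. My thinking was that I could call GC.Collect() at the end of each read cycle and that would flush everything out. Unfortunately, for some reason that isn't happening.

OK, I hope this fits, but here is the method in question: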

void AnalyseTable(string ObjectName, string TableName)
{
    Console.WriteLine("Initialising analysis process for SF object \"" + ObjectName + "\"");
    Console.WriteLine("   The data being used is in table [" + TableName + "]");
    // get some helpful stuff from the databases
    SQLcols = Target.GetData("SELECT Column_Name, Is_Nullable, Data_Type, Character_Maximum_Length FROM information_schema.columns WHERE table_name = '" + TableName + "'");
    SFcols = SchemaSource.GetData("SELECT * FROM [" + ObjectName + "Fields]");
    PickLists = SchemaSource.GetData("SELECT * FROM [" + ObjectName + "PickLists]");

    // get the table definition
    DataTable resultBatch = new DataTable();
    resultBatch.TableName = TableName;
    int counter = 0;

    foreach (DataRow Column in SQLcols.Rows)
    {
        if (Column["Column_Name"].ToString().ToLower() != "id")
            resultBatch.Columns.Add(new DataColumn(Column["Column_Name"].ToString(), typeof(bool)));
        else
            resultBatch.Columns.Add(new DataColumn("ID", typeof(string)));
    }
    // create the validation results table
    //SchemaSource.CreateTable(resultBatch, "ValidationResults_");
    // cache the id's from the source table in the validation table
    //CacheIDColumn(TableName);

    // validate the source table
    // iterate through each sql column
    foreach (DataRow Column in SQLcols.Rows)
    {
        // we do this here to save making this call a lot more later
        string colName = Column["Column_Name"].ToString().ToLower();
        // id col is only used to identify records not in validation
        if (colName != "id")
        {
            // prepare to process
            counter = 0;
            resultBatch.Rows.Clear();
            resultBatch.Columns.Clear();
            resultBatch.Columns.Add(new DataColumn("ID", typeof(string)));
            resultBatch.Columns.Add(new DataColumn(colName, typeof(bool)));

            // identify matching SF col
            foreach (DataRow SFDefinition in SFcols.Rows)
            {
                // case insensitive compare on the col name to ensure we have a match ...
                if (SFDefinition["Name"].ToString().ToLower() == colName)
                {
                    // select the id column and the column data to validate (current column data)
                    using (SqlCommand com = new SqlCommand("SELECT ID, [" + colName + "] FROM [" + TableName + "]", new SqlConnection(ConfigurationManager.ConnectionStrings["AnalysisTarget"].ConnectionString)))
                    {
                        com.Connection.Open();
                        SqlDataReader reader = com.ExecuteReader();

                        Console.WriteLine("   Validating column \"" + colName + "\"");
                        // foreach row in the given object dataset 
                        while (reader.Read())
                        {
                            // create a new validation result row
                            DataRow result = resultBatch.NewRow();
                            bool hasFailed = false;
                            // validate it
                            object vResult = ValidateFieldValue(SFDefinition, reader[Column["Column_Name"].ToString()]);
                            // if we have the relevant col definition lets decide how to validate this value ...
                            result[colName] = vResult;

                            if (vResult is bool)
                            {
                                // if it's deemed to have failed validation mark it as such
                                if (!(bool)vResult)
                                    hasFailed = true;
                            }

                            // no point in adding rows we can't trace
                            if (reader["id"] != DBNull.Value && reader["id"] != null)
                            {
                                // add the failed row to the result set
                                if (hasFailed)
                                {
                                    result["id"] = reader["id"];
                                    resultBatch.Rows.Add(result);
                                }
                            }

                            // submit to db in batches of 200
                            if (resultBatch.Rows.Count > 199)
                            {
                                counter += resultBatch.Rows.Count;
                                Console.Write("   Result batch completed,");
                                SchemaSource.Update(resultBatch, "ValidationResults_");
                                Console.WriteLine("      committed " + counter.ToString() + " fails to the database so far.");
                                Console.SetCursorPosition(0, Console.CursorTop-1);
                                resultBatch.Rows.Clear();
                            }
                        }
                        // get rid of these likely very heavy objects
                        reader.Close();
                        reader.Dispose();
                        com.Connection.Close();
                        com.Dispose();
                        // ensure .Net does a full cleanup because we will need the resources.
                        GC.Collect();

                        if (resultBatch.Rows.Count > 0)
                        {
                            counter += resultBatch.Rows.Count;
                            Console.WriteLine("   All batches for column complete,");
                            SchemaSource.Update(resultBatch, "ValidationResults_");
                            Console.WriteLine("      committed " + counter.ToString() + " fails to the database.");
                        }
                    }
                }
            }
        }

        Console.WriteLine("   Completed processing column \"" + colName + "\"");
        Console.WriteLine("");
    }

    Console.WriteLine("Object processing complete.");
}

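Try opening the reader with the sequential access option (CommandBehavior.SequentialAccess); it might give you the behavior you need. Also, assuming the column is a BLOB, you could read it in chunks.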

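Provides a way for the DataReader to handle rows that contain columns with large binary values. Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. You can then use the GetBytes or GetChars method to specify a byte location to start the read operation, and a limited buffer size for the data being returned.

When you specify SequentialAccess, you are required to read from the columns in the order they are returned, although you are not required to read each column. Once you have read past a location in the returned stream of data, data at or before that location can no longer be read from the DataReader. When using the OleDbDataReader, you can reread the current column value until reading past it. When using the SqlDataReader, you can read a column value only once.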
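
The chunked read described above could look roughly like the sketch below. It is only an illustration: the class name SequentialReadSketch, the helpers CopyColumn and Process, the 8192-byte default chunk size, and the assumption that the query returns the ID in column 0 and a binary value in column 1 are all made up for this example and are not from the question. Main exercises CopyColumn against an in-memory DataTable so the loop can be tried without a database; SequentialAccess itself only comes into play in Process, when a real command such as a SqlCommand is executed.

```csharp
using System;
using System.Data;
using System.Data.Common;
using System.IO;

public static class SequentialReadSketch
{
    // Copies one binary column of the current row to 'destination' in
    // fixed-size chunks, so the whole value is never held in one array.
    public static long CopyColumn(DbDataReader reader, int ordinal,
                                  Stream destination, int chunkSize = 8192)
    {
        var buffer = new byte[chunkSize];
        long offset = 0;
        long read;
        // GetBytes returns the number of bytes copied; 0 means the value is exhausted.
        while ((read = reader.GetBytes(ordinal, offset, buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, (int)read);
            offset += read;
        }
        return offset;
    }

    // Hypothetical usage: 'command' is any DbCommand (for example a SqlCommand)
    // whose connection is already open and whose SELECT returns ID, then the blob.
    public static void Process(DbCommand command)
    {
        using (DbDataReader reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
        {
            while (reader.Read())
            {
                // With SequentialAccess the columns must be read in select order.
                object id = reader.GetValue(0);
                if (reader.IsDBNull(1)) continue;
                using (var sink = new MemoryStream())
                {
                    long bytes = CopyColumn(reader, 1, sink);
                    Console.WriteLine(id + ": " + bytes + " bytes");
                }
            }
        }
    }

    public static void Main()
    {
        // In-memory stand-in for a query result, used only to exercise CopyColumn.
        var table = new DataTable();
        table.Columns.Add("ID", typeof(string));
        table.Columns.Add("Payload", typeof(byte[]));
        table.Rows.Add("a1", new byte[10]);

        using (DbDataReader reader = table.CreateDataReader())
        using (var sink = new MemoryStream())
        {
            reader.Read();
            long copied = CopyColumn(reader, 1, sink, chunkSize: 4);
            Console.WriteLine("copied " + copied + " bytes");
        }
    }
}
```

Whether this helps depends on the individual values actually being large; for small scalar columns the chunked read adds nothing.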

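Can you post some code? .NET's data reader is supposed to be a "fire hose" that is stingy with RAM unless, as Freddy suggests, your column data values are large. How long does this validation + DB write take?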
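
To illustrate the "fire hose" point: if each row is consumed as it is read and only a bounded batch of results is kept, memory use should not grow with the number of rows. The sketch below is an assumption-laden illustration, not the asker's code. BatchingDemo, ValidateColumn, the isValid delegate and the flush callback are hypothetical names, and an in-memory DataTable stands in for the real table so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class BatchingDemo
{
    // Streams rows from any IDataReader (column 0 = ID, column 1 = value),
    // keeps only the IDs that fail validation, and hands them to 'flush'
    // in batches so at most 'batchSize' results are held at any time.
    public static int ValidateColumn(
        IDataReader reader,
        Func<object, bool> isValid,
        int batchSize,
        Action<IReadOnlyList<string>> flush)
    {
        var failed = new List<string>(batchSize);
        int total = 0;
        while (reader.Read())
        {
            object id = reader.GetValue(0);
            if (id == DBNull.Value) continue;           // rows we cannot trace are skipped
            if (isValid(reader.GetValue(1))) continue;  // only failures are recorded

            failed.Add(id.ToString());
            if (failed.Count >= batchSize)
            {
                flush(failed);
                total += failed.Count;
                failed.Clear();                         // reuse the same list for the next batch
            }
        }
        if (failed.Count > 0)                           // flush the final partial batch
        {
            flush(failed);
            total += failed.Count;
        }
        return total;
    }

    public static void Main()
    {
        // Stand-in data: 1000 rows where odd values count as "invalid".
        var table = new DataTable();
        table.Columns.Add("ID", typeof(string));
        table.Columns.Add("Value", typeof(int));
        for (int i = 0; i < 1000; i++) table.Rows.Add(i.ToString(), i);

        int batches = 0;
        using (IDataReader reader = table.CreateDataReader())
        {
            int failures = ValidateColumn(reader, v => (int)v % 2 == 0, 200, b => batches++);
            Console.WriteLine(failures + " failures in " + batches + " batches");
        }
    }
}
```

With a real SqlDataReader the same loop applies; the point is that nothing outside the current batch stays referenced once it has been flushed.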

In general, if a GC is needed and can be done, it will be done. I may sound like a broken record, but if you have to use GC.Collect(), something else is wrong.
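
One way to see why a forced collection may appear to "do nothing": GC.Collect() can only reclaim objects that are no longer reachable. The following is a minimal sketch of that behavior, with hypothetical names (GcReachabilityDemo, Allocate, Cache); it is not a diagnosis of the code in the question. On a typical runtime the unreferenced block is reported as collected while the cached one stays alive until the cache is cleared, though the exact timing of collections is up to the runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class GcReachabilityDemo
{
    // Anything still referenced from here survives every collection.
    static readonly List<byte[]> Cache = new List<byte[]>();

    // Allocates a 10 MB block in a separate, non-inlined method so that no
    // local variable in Main keeps it alive; optionally roots it in Cache.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Allocate(bool keep)
    {
        var block = new byte[10 * 1024 * 1024];
        if (keep) Cache.Add(block);
        return new WeakReference(block);
    }

    public static void Main()
    {
        WeakReference dropped = Allocate(keep: false);
        WeakReference kept = Allocate(keep: true);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine("unreferenced block alive: " + dropped.IsAlive);
        Console.WriteLine("cached block alive:       " + kept.IsAlive);

        // Only after the last reference is gone can the block be reclaimed.
        Cache.Clear();
        GC.Collect();
        Console.WriteLine("cached block alive after Clear: " + kept.IsAlive);
    }
}
```

If memory stays high after a forced collection, the usual next step is to look for what is still holding references (long-lived collections, undisposed readers or connections) rather than to call the collector more often.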

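Why can't you run the validation in SQL Server?

The validation process involves some complex rules that cannot be implemented in SQL. Values may depend on values in other databases, possibly even on other networks. This doesn't seem to be an easy problem to solve... I'm now running this code on a machine with more memory and I do get some cleanup, but it still pushed my memory usage to 4.5 GB, and I haven't tried it on a big table yet (I have one with about 90 million records).

+1 I agree, forcing garbage collection seems completely wrong in this case. In addition, without sequential access the reader will retain internal references to the instances the OP intends to release.

I don't have a problem with the size of the data in individual fields; there are just too many fields to read.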