
Correct collation in a format file for SQL BULK INSERT


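I am trying to import a .CSV file into a SQL Server table using BULK INSERT with a format file. I can get it to import, but any Latin characters come in as strange characters. I am pretty proud of myself for getting this far on this personal project, but I have reached the point where I need help. After importing the data I can fix the characters with some messy UPDATE and REPLACE statements, but I would really like to import the Latin characters exactly as they appear in the .CSV file, in one step. Here are the database and table I created: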

CREATE DATABASE Test;
GO

-- Switch to the new database so the table is created in Test
USE Test;
GO

CREATE TABLE dbo.rawData
    ([Position] nvarchar(500) NULL,
    [Const] nvarchar(500) NULL,
    [Created] nvarchar(500) NULL,
    [Modified] nvarchar(500) NULL,
    [Description] nvarchar(500) NULL,
    [Title] nvarchar(500) NOT NULL,
    [TitleType] nvarchar(500) NULL,
    [Directors] nvarchar(500) NULL,
    [YouRated] nvarchar(500) NULL,
    [IMDbRating] nvarchar(500) NULL,
    [Runtime] nvarchar(500) NULL,
    [Year] nvarchar(500) NULL,
    [Genres] nvarchar(500) NULL,
    [NumVotes] nvarchar(500) NULL,
    [ReleaseDate] nvarchar(500) NULL,
    [URL] nvarchar(500) NULL
    )
GO
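Below is some of the data I am working with, taken from the .CSV file (saved as ratings.csv). I use Notepad++, and the file is encoded as UTF-8. Note that the last row, "Dallas Buyers Club", has a director with a Latin character in his name: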

"position","const","created","modified","description","Title","Title type","Directors","You rated","IMDb Rating","Runtime (mins)","Year","Genres","Num. Votes","Release Date (month/day/year)","URL"
"1","tt0437863","Tue Feb 16 00:00:00 2016","","","The Benchwarmers","Feature Film","Dennis Dugan","5","5.6","80","2006","comedy, romance, sport","39413","2006-04-07","http://www.imdb.com/title/tt0437863/"
"2","tt0085334","Tue Feb 16 00:00:00 2016","","","A Christmas Story","Feature Film","Bob Clark","6","8.1","94","1983","comedy, family","103770","1983-11-18","http://www.imdb.com/title/tt0085334/"
"3","tt2403029","Tue Feb 16 00:00:00 2016","","","The Starving Games","Feature Film","Jason Friedberg, Aaron Seltzer","2","3.3","83","2013","comedy","13719","2013-10-31","http://www.imdb.com/title/tt2403029/"
"4","tt0316465","Tue Feb 16 00:00:00 2016","","","Radio","Feature Film","Michael Tollin","6","6.9","109","2003","biography, drama, sport","31692","2003-10-24","http://www.imdb.com/title/tt0316465/"
"5","tt0141369","Tue Feb 16 00:00:00 2016","","","Inspector Gadget","Feature Film","David Kellogg","4","4.1","78","1999","action, adventure, comedy, family, sci_fi","35340","1999-07-18","http://www.imdb.com/title/tt0141369/"
"6","tt0033563","Tue Feb 16 00:00:00 2016","","","Dumbo","Feature Film","Sam Armstrong, Norman Ferguson","6","7.3","64","1941","animation, family, musical","80737","1941-10-23","http://www.imdb.com/title/tt0033563/"
"7","tt0384642","Tue Feb 16 00:00:00 2016","","","Kicking & Screaming","Feature Film","Jesse Dylan","5","5.5","95","2005","comedy, family, romance, sport","29539","2005-05-01","http://www.imdb.com/title/tt0384642/"
"8","tt0116705","Tue Feb 16 00:00:00 2016","","","Jingle All the Way","Feature Film","Brian Levant","7","5.4","89","1996","comedy, family","66879","1996-11-16","http://www.imdb.com/title/tt0116705/"
"9","tt1981677","Tue Feb 16 00:00:00 2016","","","Pitch Perfect","Feature Film","Jason Moore","7","7.2","112","2012","comedy, music, romance","203205","2012-09-28","http://www.imdb.com/title/tt1981677/"
"10","tt0409459","Tue Feb 16 00:00:00 2016","","","Watchmen","Feature Film","Zack Snyder","7","7.6","162","2009","action, mystery, sci_fi","368137","2009-02-23","http://www.imdb.com/title/tt0409459/"
"11","tt1343092","Tue Feb 16 00:00:00 2016","","","The Great Gatsby","Feature Film","Baz Luhrmann","5","7.3","143","2013","drama, romance","345664","2013-05-01","http://www.imdb.com/title/tt1343092/"
"12","tt0332379","Tue Feb 16 00:00:00 2016","","","School of Rock","Feature Film","Richard Linklater","5","7.1","108","2003","comedy, music","202083","2003-09-09","http://www.imdb.com/title/tt0332379/"
"13","tt0120783","Tue Feb 16 00:00:00 2016","","","The Parent Trap","Feature Film","Nancy Meyers","6","6.4","128","1998","adventure, comedy, drama, family, romance","82087","1998-07-20","http://www.imdb.com/title/tt0120783/"
"14","tt0790636","Tue Feb 16 00:00:00 2016","","","Dallas Buyers Club","Feature Film","Jean-Marc Vallée","7","8.0","117","2013","biography, drama","308118","2013-09-07","http://www.imdb.com/title/tt0790636/"
I have a format file (saved as format.fmt) that looks like this when opened in Notepad++:

11.0
16
1       SQLCHAR             0       1000    "\",\""    1     Position                   SQL_Latin1_General_CP1_CI_AS
2       SQLCHAR             0       1000    "\",\""    2     Const                      SQL_Latin1_General_CP1_CI_AS
3       SQLCHAR             0       1000    "\",\""    3     Created                    SQL_Latin1_General_CP1_CI_AS
4       SQLCHAR             0       1000    "\",\""    4     Modified                   SQL_Latin1_General_CP1_CI_AS
5       SQLCHAR             0       1000    "\",\""    5     Description                SQL_Latin1_General_CP1_CI_AS
6       SQLCHAR             0       1000    "\",\""    6     Title                      SQL_Latin1_General_CP1_CI_AS
7       SQLCHAR             0       1000    "\",\""    7     TitleType                  SQL_Latin1_General_CP1_CI_AS
8       SQLCHAR             0       1000    "\",\""    8     Directors                  SQL_Latin1_General_CP1_CI_AS
9       SQLCHAR             0       1000    "\",\""    9     YouRated                   SQL_Latin1_General_CP1_CI_AS
10      SQLCHAR             0       1000    "\",\""    10    IMDbRating                 SQL_Latin1_General_CP1_CI_AS
11      SQLCHAR             0       1000    "\",\""    11    Runtime                    SQL_Latin1_General_CP1_CI_AS
12      SQLCHAR             0       1000    "\",\""    12    Year                       SQL_Latin1_General_CP1_CI_AS
13      SQLCHAR             0       1000    "\",\""    13    Genres                     SQL_Latin1_General_CP1_CI_AS
14      SQLCHAR             0       1000    "\",\""    14    NumVotes                   SQL_Latin1_General_CP1_CI_AS
15      SQLCHAR             0       1000    "\",\""    15    ReleaseDate                SQL_Latin1_General_CP1_CI_AS
16      SQLCHAR             0       1000    "\""     16    URL                        SQL_Latin1_General_CP1_CI_AS
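When I run the code below, everything imports, but the Latin characters are replaced with a series of strange characters. Here is the code I am running: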

BULK INSERT [Test].[dbo].[rawData]
FROM 'C:\IMDbRatings\Files\ratings.csv' WITH (FIRSTROW = 2, FORMATFILE= 'C:\IMDbRatings\format.fmt');

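I have tried a few things: changing the .CSV file to UCS-2 BE, adding different options to the WITH clause of the BULK INSERT, and changing the data type in the format file from SQLCHAR to SQLNCHAR, but nothing has worked. In those cases the usual result is "0 rows affected" rather than an error. Any help would be greatly appreciated.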
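The "strange characters" described above are consistent with UTF-8 bytes being decoded one byte at a time under code page 1252. The following is a minimal sketch of that effect, not a diagnosis of the actual import. It assumes the current database's default collation uses code page 1252 (for example SQL_Latin1_General_CP1_CI_AS, the collation named in the format file), and the column aliases are made up for illustration:

```sql
-- Assumption: the current database's default collation uses code page 1252.
-- 0xC3A9 is the two-byte UTF-8 encoding of the character é.
-- Casting the raw bytes to varchar decodes each byte separately,
-- so under code page 1252 the result is two characters (Ã and ©), not é.
SELECT CAST(0xC3A9 AS varchar(2)) AS Utf8BytesReadAs1252;

-- For comparison: the intended character, built from its Unicode code point.
SELECT NCHAR(233) AS IntendedCharacter;
```

If the imported Directors value shows the same two-character pattern in place of é, the file's UTF-8 bytes were most likely read with a single-byte code page.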

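@Walker I admit I have never used BULK INSERT. I tried to set up your test case, but I kept getting errors saying that the format file I had saved was incomplete or could not be read. In any case, try changing the encoding in Notepad++ via Encoding --> Character sets --> Western European --> Windows-1252, save the file, and try the import again.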
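As a rough sketch of that suggestion: once ratings.csv has been re-saved as Windows-1252, the import could also state the code page explicitly. This assumes the same paths as in the question. CODEPAGE also accepts 'ACP', which refers to the ANSI code page (1252 on typical Western installations).

```sql
-- Sketch only: assumes ratings.csv has been re-saved as Windows-1252
-- and that the paths from the question are unchanged.
BULK INSERT [Test].[dbo].[rawData]
FROM 'C:\IMDbRatings\Files\ratings.csv'
WITH (
    CODEPAGE   = '1252',   -- read character data as Windows-1252
    FIRSTROW   = 2,        -- skip the header row
    FORMATFILE = 'C:\IMDbRatings\format.fmt'
);
```

This only helps when every character in the file exists in Windows-1252; é does, but text outside that code page would be lost when the file is re-saved.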


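Also, I just came across this interesting post, which suggests that UTF-8 was a problem before SQL Server 2016. One answer that caught my attention was about SQLNCHAR vs SQLCHAR: since I think you are storing Unicode data, you would need to change the data types in both the format file you created and the table.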

I am answering this old question in the hope that it saves someone the trouble I recently went through.

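In short: when inserting from a UTF-8 encoded file using code page 65001, you should use the "" collation in your format file. You must have SQL Server 2016 to be able to use code page 65001.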


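Do the following:

  • In the BULK INSERT statement, specify that the file being inserted is UTF-8 encoded by using CODEPAGE = 65001
  • In the format file, specify the character column types as SQLCHAR
  • In the format file, use the "" collation for all columns

BULK INSERT statement: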

    BULK INSERT [Test].[dbo].[rawData]
    FROM 'C:\IMDbRatings\Files\ratings.csv'
    WITH (CODEPAGE = 65001, FIRSTROW = 2, FORMATFILE= 'C:\IMDbRatings\format.fmt');
    
Format file:

    13.0
    16
    1       SQLCHAR             0       1000    "\",\""    1     Position                   ""
    2       SQLCHAR             0       1000    "\",\""    2     Const                      ""
    3       SQLCHAR             0       1000    "\",\""    3     Created                    ""
    4       SQLCHAR             0       1000    "\",\""    4     Modified                   ""
    5       SQLCHAR             0       1000    "\",\""    5     Description                ""
    6       SQLCHAR             0       1000    "\",\""    6     Title                      ""
    7       SQLCHAR             0       1000    "\",\""    7     TitleType                  ""
    8       SQLCHAR             0       1000    "\",\""    8     Directors                  ""
    9       SQLCHAR             0       1000    "\",\""    9     YouRated                   ""
    10      SQLCHAR             0       1000    "\",\""    10    IMDbRating                 ""
    11      SQLCHAR             0       1000    "\",\""    11    Runtime                    ""
    12      SQLCHAR             0       1000    "\",\""    12    Year                       ""
    13      SQLCHAR             0       1000    "\",\""    13    Genres                     ""
    14      SQLCHAR             0       1000    "\",\""    14    NumVotes                   ""
    15      SQLCHAR             0       1000    "\",\""    15    ReleaseDate                ""
    16      SQLCHAR             0       1000    "\""     16    URL                        ""
    

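From the description of the RAW collation:

"Specifies that the data is stored in the code page specified in the CODEPAGE option of the command or the bcp_control BCPFILECP hint. If neither is specified, the collation of the data file is that of the OEM code page of the client computer."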
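One way to check the result after re-running the import with CODEPAGE = 65001 is to look at the row that contains the accented name. This is only a sketch, assuming the table and sample data from the question:

```sql
-- Sketch: inspect the sample row whose director name contains é.
-- Assumes the rawData table and the ratings.csv sample from the question.
SELECT [Title], [Directors]
FROM [Test].[dbo].[rawData]
WHERE [Title] = N'Dallas Buyers Club';
```

If the code page was applied, Directors should read Jean-Marc Vallée rather than a multi-character substitute for é.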


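What collation do your database, table, and column use for the "Jean-Marc Vallée" name?

All three use SQL_Latin1_General_CP1_CI_AS.

Didn't you post this exact same question a few days ago?

Does the import method have to be BULK INSERT? Do you have SSIS available?

Have you tried saving the file in Notepad++ as code page 1252 and importing it?

This suggests using CODEPAGE = 'ACP' in the BULK INSERT statement.

For the format file or for ratings.csv?

I really appreciate your help here.

ratings.csv. I don't think the format file will make a difference. I have also added a bit more to my answer.

I think you need to change the data types to nchar so that you can handle Unicode characters, because char will not store them.

It is certainly odd, but something is definitely going on with the encoding. Have you tried setting the encoding to Windows-1252 in Notepad and importing?

Let us take a look.

Thanks. Simply adding CODEPAGE = 65001 to the BULK INSERT fixed all my issues.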