Parsing an address: T-SQL code to emulate these 8 lines of Perl?
I just got an unparsed address from another program, and I need to store it as its components in the receiving system. I need some help! I'll wash your cat. Anything.

The good news is that I can count on the line breaks. I can count on comma + space to mark the end of the city, and I can count on a two-character abbreviation for the state or province, followed by a space. So I quickly wrote this in Perl (not golfed) to provide some working code.

The key point: if we split the input on \n, I only need the second line/element (address 1), the last line/element (country), and the second-to-last element (City, ST zip). Then I need to split that element into its components. The Perl code below works, but how do I recreate it in T-SQL?
$_ = "Company\n".
"Address 1\n".
"Address 2 (opt)\n".
"Address 3 (opt)\n".
"City, ST zip\n".
"Country";
# also works for "City, PV zip zip\n"
@add = split('\n');
$address = $add[1]; # who cares about addy2 and addy3
$country = pop(@add);
$ctz = pop(@add);
if ($ctz =~ /(.*), (..) (.*)/) {
# Yes a $ctz line like "City of Angels, II, MO 65423" would break it
$city = $1;
$state = $2;
$zip = $3;
} else {
$city = $state = $zip = '';
}
print "Address: $address\n".
"City: $city\n".
"State Code: $state\n".
"Zip: $zip\n".
"Country: $country\n";
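Start with a string splitter. This one is heavily borrowed, but it handles multi-character delimiters. It returns the delimited items in order, with an index column: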
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter VARCHAR(16))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
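-- DATALENGTH rather than LEN: LEN ignores trailing spaces, so a delimiter such as ', ' would be measured one character short
SELECT t.N + DATALENGTH(@pDelimiter) FROM cteTally t WHERE SUBSTRING(@pString, t.N, DATALENGTH(@pDelimiter)) = @pDelimiter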
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1 ,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l;
Then turn it loose on your data:
declare @Newline as Char(2) = Char(13) + Char(10); -- This may need work to match your newlines.
declare @Sample as VarChar(1024) =
'Company' + @Newline +
'Address 1' + @Newline +
'Address 2 (opt)' + @Newline +
'Address 3 (opt)' + @Newline +
'City, ST zip' + @Newline +
'Country';
select *
from dbo.DelimitedSplit8K( @Sample, @Newline );
Working out how to handle the optional lines is left as an exercise.
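One possible way to finish that exercise is sketched below. It is not from the answer above. It assumes dbo.DelimitedSplit8K has been created as shown and that the input uses CR+LF line breaks. The names Lines, Picked, LastNumber, Address1, Ctz and StateCode are made up for illustration. Note that PATINDEX finds the first ", ST " in the line, whereas the greedy Perl regex effectively finds the last one, and that how [A-Z] matches depends on the database collation.

```sql
-- Sketch only: assumes dbo.DelimitedSplit8K exists and CR+LF line breaks.
DECLARE @Newline CHAR(2) = CHAR(13) + CHAR(10);
DECLARE @Sample VARCHAR(1024) =
    'Company' + @Newline +
    'Address 1' + @Newline +
    'Address 2 (opt)' + @Newline +
    'City, ST zip' + @Newline +
    'Country';

WITH Lines AS (
    -- Number every line and carry the total line count on each row.
    SELECT ItemNumber, Item,
           LastNumber = MAX(ItemNumber) OVER ()
    FROM dbo.DelimitedSplit8K(@Sample, @Newline)
),
Picked AS (
    -- Pivot the three lines of interest into one row:
    -- line 2, the second-to-last line, and the last line.
    SELECT Address1 = MAX(CASE WHEN ItemNumber = 2 THEN Item END),
           Ctz      = MAX(CASE WHEN ItemNumber = LastNumber - 1 THEN Item END),
           Country  = MAX(CASE WHEN ItemNumber = LastNumber THEN Item END)
    FROM Lines
)
SELECT p.Address1,
       -- Pos is the position of the comma in ", ST "; 0 means no match,
       -- in which case all three parts are returned empty, as in the Perl.
       City      = CASE WHEN x.Pos > 0 THEN LEFT(p.Ctz, x.Pos - 1) ELSE '' END,
       StateCode = CASE WHEN x.Pos > 0 THEN SUBSTRING(p.Ctz, x.Pos + 2, 2) ELSE '' END,
       Zip       = CASE WHEN x.Pos > 0 THEN SUBSTRING(p.Ctz, x.Pos + 5, 8000) ELSE '' END,
       p.Country
FROM Picked p
CROSS APPLY (SELECT Pos = PATINDEX('%, [A-Z][A-Z] %', p.Ctz)) x;
```

Because the lines are picked by position relative to the first and last rows, the optional Address 2 and Address 3 lines can be present or absent without changing the query.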
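Out of curiosity. Sorry, my mistake: I assumed you needed Perl code, but you asked for T-SQL code. I am leaving the code here for any stranger who may be interested. See whether the following code suits your task.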
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my @data = <DATA>;
my $address;
chomp @data;
$address->{company} = $data[0];
push @{$address->{street}}, @data[1..$#data-2];
# Note: splitting on runs of commas/spaces assumes a one-word city and a one-part zip
$address->@{qw/city state zip/} = split '[, ]+', $data[-2];
$address->{country} = $data[-1];
say Dumper($address);
say '--- Print the address ' . '-' x 25;
my @fields = qw/street city state zip country/; # fixed order; keys() returns hash keys in no particular order
for my $field ( @fields ) {
say ucfirst $field . ": " .
(
ref $address->{$field} eq 'ARRAY'
? join "\n\t", @{ $address->{$field} }
: $address->{$field}
);
}
__DATA__
Company
Address 1
Address 2 (opt)
Address 3 (opt)
City, ST zip
Country
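Output:
$VAR1 = {
          'city' => 'City',
          'country' => 'Country',
          'company' => 'Company',
          'state' => 'ST',
          'zip' => 'zip',
          'street' => [
                        'Address 1',
                        'Address 2 (opt)',
                        'Address 3 (opt)'
                      ]
        };
--- Print the address -------------------------
Street: Address 1
	Address 2 (opt)
	Address 3 (opt)
City: City
State: ST
Zip: zip
Country: Country

Comments:
Actually, this regex will parse "City of Angels, II, MO 65423" just fine. I'm sure there is some weird city name out there that would trip us up. But man, just saying "take the last line, split the second-to-last line, then take the second line" in T-SQL is driving me crazy. The splitter is about 50 lines of code, and it fails. I paid someone $5, but his code didn't work either, and it's not worth arguing with him over $20. Anybody?

Have you tried regular expressions? I found plenty of examples, and I'm sure there is more documentation. If so, what in particular gave you trouble? (I didn't follow the description in the previous comment about "last line", then "second to last", then "second"... can you show the exact lines? Or better yet, your SQL attempt?)

This may be a bit ugly, but I taught myself T-SQL today: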
declare @string varchar(2000), @ctz varchar(100), @delim varchar(1), @idx integer;
set @delim = CHAR(10); -- What we get from BC
set @string = 'Company'+@delim+'Address1'+@delim+'Address2'+@delim+'City, ST Zip'+@delim+'Country'; -- What we get from BC
--set @string = 'Company'+@delim+'Address1'+@delim+'City, ST Zip'+@delim+'Country'; -- What we get from BC
--set @string = 'Company'+@delim+'Address1'+@delim+'Address2'+@delim+'Address3'+@delim+'City, PR zip zip'+@delim+'Country'; -- What we get from BC
-- Start from the bottom
select @idx = LEN(@string) - CHARINDEX(@delim,REVERSE(@string)) + 1; -- last occurrence of our delim
select SUBSTRING(@string,@idx+1,2000) as country;
select @string = SUBSTRING(@string,1,@idx-1); -- shorten our string, not including the delim
select @idx = LEN(@string) - CHARINDEX(@delim,REVERSE(@string)) + 1;
select @ctz = SUBSTRING(@string,@idx+1,2000); -- deal with this later
-- select @ctz as ctz;
select @string = SUBSTRING(@string,1,@idx-1); -- shorten it again, not including the delim
-- Now start at the top to remove company
select @idx = CHARINDEX(@delim, @string); -- first occurrence of delim
select @string = SUBSTRING(@string,@idx+1,2000); -- just remove everything up to that point (Company)
select @idx = CHARINDEX(@delim, @string); -- first occurrence, at the end of address1
if @idx = 0
    select @string as address1;
else
BEGIN
    select SUBSTRING(@string,1,@idx-1) as address1;
    select @string = SUBSTRING(@string,@idx+1,2000); -- keep shortening
    select replace(@string, @delim, ',') as address2; -- if there is anything else
END
select @idx = PATINDEX('%, [A-Z][A-Z] %',@ctz); -- A wildcard pattern (not a true regex) to find ", ST "
if @idx > 0
BEGIN
    select SUBSTRING(@ctz,1,@idx-1) as city;
    select SUBSTRING(@ctz,@idx+2,2) as st;
    select SUBSTRING(@ctz,@idx+5,100) as zip; -- skip comma + space + state (2 chars) + space
END
else
    select '' as city, '' as st, '' as zip; -- no match: mirror the else branch of the Perl
GO
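One difference from the Perl is worth noting: PATINDEX returns the first ", ST " in @ctz, while the greedy (.*) in the Perl regex ends up splitting at the last one. The sketch below shows one way to locate the last occurrence by searching the reversed string. It is an illustration, not part of the answer above. The sample value and the @rev and @r variables are hypothetical, the value is assumed to have no trailing spaces (LEN ignores them), and how [A-Z] matches depends on the collation.

```sql
-- Hypothetical sample value; @ctz mirrors the variable name used above.
DECLARE @ctz VARCHAR(100) = 'City of Angels, II, MO 65423';
DECLARE @rev VARCHAR(100) = REVERSE(@ctz);

-- In the reversed string, ", ST " reads " TS ,". Its first occurrence there
-- corresponds to the last occurrence in the original string.
DECLARE @r INT = PATINDEX('% [A-Z][A-Z] ,%', @rev);

-- The reversed match spans positions @r .. @r + 4, with the comma at @r + 4.
-- Map that comma back to its position in the original string.
DECLARE @idx INT = CASE WHEN @r > 0 THEN LEN(@ctz) - (@r + 4) + 1 ELSE 0 END;

SELECT City      = CASE WHEN @idx > 0 THEN LEFT(@ctz, @idx - 1) ELSE '' END,
       StateCode = CASE WHEN @idx > 0 THEN SUBSTRING(@ctz, @idx + 2, 2) ELSE '' END,
       Zip       = CASE WHEN @idx > 0 THEN SUBSTRING(@ctz, @idx + 5, 100) ELSE '' END;
```

With the sample value, the split point is the comma before "MO", so the "II" stays with the city, which is the same split the Perl regex makes.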