Unserializing PHP-serialized data in T-SQL
I'm trying to extract gift card codes from Magento orders. Some other code uses the Magento API to retrieve the order information from Magento as XML and inserts that XML into an MS SQL Server record. In T-SQL I can parse the XML with the XML functions and get almost everything I need, but the only place the actual gift card codes are stored is the gift cards field, which happens to be a PHP-serialized string.

Examples:
a:1:{i:0;a:5:{s:1:"i";s:1:"1";s:1:"c";s:12:"00XCY8S3ZXCU";s:1:"a";d:119;s:2:"ba";d:119;s:10:"authorized";d:119;}}
a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}
The gift card codes are the values stored under the key "c" in each inner array. For the two examples above that is:
00XCY8S3ZXCU
045EMJJWRCF1, 06DUQ7Z5GVT7, 07A6MRYW511J
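For comparison, outside T-SQL the codes can be pulled out with a single pattern match. The following is only a rough sketch in Python, not part of the original question; the function name `extract_gift_card_codes` is made up for illustration, and it assumes the codes never contain a double quote and that `s:1:"c"` only ever appears as a key.

```python
import re
from typing import List

# Matches a string value stored under the one-character key "c":
#   s:1:"c";s:<length>:"<value>";
# Assumption: the value itself contains no double quote.
CODE_PATTERN = re.compile(r's:1:"c";s:\d+:"([^"]*)";')


def extract_gift_card_codes(serialized: str) -> List[str]:
    """Return every value stored under the key "c", in document order."""
    return CODE_PATTERN.findall(serialized)


single = ('a:1:{i:0;a:5:{s:1:"i";s:1:"1";s:1:"c";s:12:"00XCY8S3ZXCU";'
          's:1:"a";d:119;s:2:"ba";d:119;s:10:"authorized";d:119;}}')
multi = ('a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";'
         's:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}'
         'i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";'
         's:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}'
         'i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";'
         's:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}')

print(extract_gift_card_codes(single))
print(extract_gift_card_codes(multi))
```

This ignores the structure of the data entirely, so it only suits cases where the key name is unambiguous.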
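I'm currently trying to solve this by parsing the value with T-SQL string functions, which feels like hammering in a nail with a screwdriver. Apparently this has been asked before, and the only suggestion was to build a parser from scratch in T-SQL, with the note that using PHP to unserialize the data would be the better option.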
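It would be nice if Magento didn't store PHP-serialized data in its database and then hand it out, still serialized, through its web service, but that is what I have to work with. I would consider using C# to convert it and store it as a separate field in the database, but parsing the data in T-SQL would be more convenient. If I did parse and unserialize the PHP object with C#, I would probably store it back in the database as XML, since that is a better data interchange format.

This is what I was able to come up with on my own. I was inspired by a nice solution and decided to solve the problem for serialized PHP objects, but I took a completely different approach. An updated code sample is below.

Serialized PHP string: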
a:3:{
i:0;
a:5:{
s:1:"i";
s:2:"10";
s:1:"c";
s:12:"045EMJJWRCF1";
s:1:"a";
d:100;
s:2:"ba";
d:100;
s:10:"authorized";
d:100;
}
i:1;
a:5:{
s:1:"i";
s:2:"11";
s:1:"c";
s:12:"06DUQ7Z5GVT7";
s:1:"a";
d:101;
s:2:"ba";
d:101;
s:10:"authorized";
d:101;
}
i:2;
a:5:{
s:1:"i";
s:2:"12";
s:1:"c";
s:12:"07A6MRYW511J";
s:1:"a";
d:102;
s:2:"ba";
d:102;
s:10:"authorized";
d:102;
}
}
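The layout above follows a small grammar: `i:<int>;`, `d:<number>;`, `s:<length>:"<text>";` and `a:<count>:{<key><value>...}`. As a rough illustration of that grammar (in Python, not the T-SQL approach of this answer), a recursive parser could look like the sketch below. The name `parse_value` is hypothetical, only the four types seen here are handled, and it assumes ASCII data, because PHP counts string lengths in bytes rather than characters.

```python
from typing import Any, Tuple


def parse_value(data: str, pos: int = 0) -> Tuple[Any, int]:
    """Parse one serialized value at data[pos]; return (value, next position)."""
    kind = data[pos]
    if kind in "id":
        # i:<int>;  or  d:<number>;
        end = data.index(";", pos)
        text = data[pos + 2:end]
        return (int(text) if kind == "i" else float(text)), end + 1
    if kind == "s":
        # s:<length>:"<text>";  (the text is read by length, not by searching for a quote)
        colon = data.index(":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                     # skip ':' and the opening quote
        return data[start:start + length], start + length + 2  # skip '"' and ';'
    if kind == "a":
        # a:<count>:{<key><value>...}
        colon = data.index(":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                       # skip ':' and '{'
        result = {}
        for _ in range(count):
            key, pos = parse_value(data, pos)
            result[key], pos = parse_value(data, pos)
        return result, pos + 1                # skip '}'
    raise ValueError(f"unsupported type {kind!r} at position {pos}")


sample = ('a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";'
          's:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}'
          'i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";'
          's:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}'
          'i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";'
          's:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}')

parsed, end = parse_value(sample)
print([order["c"] for order in parsed.values()])
```

Because strings are read by their declared length, a value containing `;`, `{` or `}` does not confuse this sketch.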
Query to get the results:
select *
from parsePhpSerializedString('a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}')
The query returns:
element_id  parent_id  var_name    var_type  var_length  value_int  value_string  value_decimal
----------  ---------  ----------  --------  ----------  ---------  ------------  -------------
1           0          NULL        a         3           NULL       NULL          NULL
2           1          0           a         5           NULL       NULL          NULL
3           1          1           a         5           NULL       NULL          NULL
4           1          2           a         5           NULL       NULL          NULL
5           2          i           s         2           NULL       10            NULL
6           2          c           s         12          NULL       045EMJJWRCF1  NULL
7           2          a           d         NULL        NULL       NULL          100
8           2          ba          d         NULL        NULL       NULL          100
9           2          authorized  d         NULL        NULL       NULL          100
10          3          i           s         2           NULL       11            NULL
11          3          c           s         12          NULL       06DUQ7Z5GVT7  NULL
12          3          a           d         NULL        NULL       NULL          101
13          3          ba          d         NULL        NULL       NULL          101
14          3          authorized  d         NULL        NULL       NULL          101
15          4          i           s         2           NULL       12            NULL
16          4          c           s         12          NULL       07A6MRYW511J  NULL
17          4          a           d         NULL        NULL       NULL          102
18          4          ba          d         NULL        NULL       NULL          102
19          4          authorized  d         NULL        NULL       NULL          102
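The result is an adjacency list: every row points at its parent array through parent_id, and arrays are expanded level by level, which is why the three inner arrays get ids 2 to 4 before any of their members appear. A rough Python sketch of that numbering, assuming the data is already available as nested dictionaries, might look as follows. The name `flatten` is made up for illustration, and a single value column stands in for the three typed value columns.

```python
from collections import deque
from typing import Any, List, Optional, Tuple

# (element_id, parent_id, var_name, var_type, var_length, value)
Row = Tuple[int, int, Optional[str], str, Optional[int], Any]


def flatten(value: Any) -> List[Row]:
    """Number the elements breadth-first, one row per element."""
    rows: List[Row] = []
    queue = deque([(0, None, value)])          # (parent_id, var_name, item)
    while queue:
        parent_id, var_name, item = queue.popleft()
        element_id = len(rows) + 1
        if isinstance(item, dict):
            rows.append((element_id, parent_id, var_name, "a", len(item), None))
            for key, child in item.items():
                queue.append((element_id, str(key), child))
        elif isinstance(item, str):
            rows.append((element_id, parent_id, var_name, "s", len(item), item))
        elif isinstance(item, int):
            rows.append((element_id, parent_id, var_name, "i", None, item))
        else:
            rows.append((element_id, parent_id, var_name, "d", None, item))
    return rows


# The same three orders as in the example, written as nested dictionaries.
orders = {
    n: {"i": str(10 + n), "c": code, "a": 100.0 + n,
        "ba": 100.0 + n, "authorized": 100.0 + n}
    for n, code in enumerate(["045EMJJWRCF1", "06DUQ7Z5GVT7", "07A6MRYW511J"])
}

rows = flatten(orders)
for row in rows:
    print(row)
```

Filtering these rows on `var_name == "c"` gives the gift card codes, in the same way as a WHERE clause on the table above.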
If I only need the gift card codes, I can write a query like this:
select value_string
from parsePhpSerializedString('a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}')
where parent_id != 0 and
var_name = 'c'
Results:
value_string
-------------
045EMJJWRCF1
06DUQ7Z5GVT7
07A6MRYW511J
Here is the T-SQL function that parses the serialized PHP string:
IF OBJECT_ID (N'dbo.parsePhpSerializedString') IS NOT NULL
DROP FUNCTION dbo.parsePhpSerializedString
GO
CREATE FUNCTION dbo.parsePhpSerializedString( @phpSerialized VARCHAR(MAX))
RETURNS @results table
(
element_id int identity(1,1) not null, /* internal surrogate primary key gives the order of parsing and the list order */
parent_id int, /* if the element has a parent then it is in this column. */
var_name varchar(50), /* the name or key of the element in a key/value array list */
var_type varchar(50),
var_length int,
value_int int,
value_string varchar(max),
value_decimal numeric
)
AS
BEGIN
/*
Built by Matt Johnson (matt@evdat.com) 2012-08-14
*/
-- we use this table later for collecting auto generated
-- identity values when inserting records into @results
declare @insertedIds table (
element_id int
)
-- define variables
declare @element_start int
declare @var_type_end int
declare @var_type varchar(50)
declare @element_end int
declare @chunk varchar(max)
declare @var_length_start int
declare @var_length_end int
declare @var_length_string varchar(max)
declare @var_length int
declare @value_start int
declare @value_end int
declare @value_string varchar(max)
declare @value_int int
declare @value_decimal numeric
declare @array_level int
declare @value_string_position int
declare @next_open int
declare @next_close int
declare @parent_id int
declare @element_id int
declare @key_element_id int
declare @inserted_element_id int
declare @var_name varchar(50)
--initialize variables
set @parent_id = 0
--loop through the supplied @phpSerialized string until it's empty
while 1=1 begin
set @element_start = null
set @var_type_end = null
set @var_type = null
set @element_end = null
set @chunk = null
set @var_length_start = null
set @var_length_end = null
set @var_length_string = null
set @var_length = null
set @value_start = null
set @value_end = null
set @value_string = null
set @value_int = null
set @value_decimal = null
set @array_level = null
set @value_string_position = null
set @next_open = null
set @next_close = null
set @var_name = null
--confirm that there is an element to parse and define its starting point
--patindex will return a value of 1 if the pattern is found and this pattern
--will only match if the element starting point is the first character in the
--supplied string. If it is encapsulated in quotes or anything else it will not match
set @element_start = patindex('[asid]:%[;}]', @phpSerialized)
if @element_start <= 0 begin
--if the supplied string is now empty check the existing results table
--for any nested elements in any array elements
--reset the value of @element_id to be safe
set @element_id = null
--only retrieve the first element found containing sub elements to parse
select top 1
@phpSerialized = value_string,
@element_id = element_id
from @results
where var_type = 'a' and
value_string is not null
--set the parent_id to the array's element_id
set @parent_id = @element_id
--if there were no results found then that means there either
--were no arrays to parse, or all arrays have already been parsed
--so break the continuous loop because we are completely done now
if @element_id is null break
--set the @element_start again now that we
--have a new string to parse for elements
set @element_start = patindex('[asid]:%[;}]', @phpSerialized)
end
--find the end of the type of the element then extract the variable type from the string
set @var_type_end = patindex('%:%', @phpSerialized)
set @var_type = substring(@phpSerialized, @element_start, @var_type_end-@element_start)
--generate an error if a variable type is supplied that hasn't been coded for.
if @var_type not like '[asid]' begin
/*
print @var_type
RAISERROR (N'Error parsing php serialized string. Variable type found that has not been defined to parse for.', -- Message text.
16, -- Severity,
1 -- State
)
*/
--apparently errors can't be raised within a function so skip the element
break
end
--array elements contain sub elements so we use different methods for parsing
--sub elements than we do for parsing individual elements.
if @var_type != 'a' begin
--element has no sub elements
--determine the end of this individual element and then extract
--only this individual element from the string
set @element_end = patindex('%;%', @phpSerialized)+1
set @chunk = substring(@phpSerialized, @element_start, @element_end-@element_start)
--strings are serialized differently than numeric elements
if @var_type = 's' begin
--element has var length
--find the starting and ending positions for the var_length and then extract the length
set @var_length_start = @var_type_end+1
set @var_length_end = patindex('%:%', substring(@chunk, @var_length_start, len(@chunk))) + @var_length_start - 1
set @var_length_string = substring(@chunk, @var_length_start, @var_length_end-@var_length_start)
if @var_length_string not like '%[^0-9]%' begin
--it's nice to verify this is actually a number (no non-digit characters anywhere) before casting it as such
set @var_length = cast(@var_length_string as int)
end
--find the starting and ending positions for the value and then extract the value
set @value_start = @var_length_end+1
set @value_end = patindex('%;%', @chunk)
--a string value is quoted so remove quotes in start and end of substring for value
--we set the substring starting position +1 just past the start of the quote and then
--set the length of the extracted string -2 to account for both the starting quote and
--ending quote to be removed from the extracted string.
set @value_string = substring(@chunk, @value_start+1, @value_end-@value_start-2)
end else begin
--element does not have a var length
--find the starting and ending positions for the value and then extract the value as a string
set @value_start = @var_type_end+1
set @value_end = patindex('%;%', @chunk)
set @value_string = substring(@chunk, @value_start, @value_end-@value_start)
--determine what value type the string should be converted to
if @var_type = 'i' begin
if @value_string not like '%[^0-9-]%' begin
set @value_int = cast(@value_string as int)
--clear the value_string because the element's value has been converted to its appropriate type
set @value_string = null
end
end else if @var_type = 'd' begin
if @value_string not like '%[^0-9.-]%' begin
set @value_decimal = cast(@value_string as numeric)
--clear the value_string because the element's value has been converted to its appropriate type
set @value_string = null
end
end
end
end else begin
--element is array and has sub elements
--we are going to chop up the string to try and determine its end so we'll
--first set the string to a variable we can destroy in this process
set @chunk = @phpSerialized
--find the starting and ending positions for the var_length and then extract the length
--arrays use this to state how many elements this array contains
set @var_length_start = @var_type_end+1
set @var_length_end = patindex('%:%', substring(@chunk, @var_length_start, len(@chunk))) + @var_length_start - 1
set @var_length_string = substring(@chunk, @var_length_start, @var_length_end-@var_length_start)
if @var_length_string not like '%[^0-9]%' begin
set @var_length = cast(@var_length_string as int)
end
--find the value starting position
--later we will find the true end of the value
set @value_start = @var_length_end+1
-- to determine the ending position we have to dig through the sub elements and track the
-- nested level to identify the ending brace for this level
set @array_level = 0
--we start the string position at 1 for the beginning of the serialized string
set @value_string_position = 1
-- loop through the value chopping up the chunk while trying to find the ending brace for this array
while 1=1 begin
--find the next open and close braces in the chunk
set @next_open = patindex('%{%', @chunk)
set @next_close = patindex('%}%', @chunk)
--check to see which brace is the next in the chunk
if @next_open > 0 and @next_open < @next_close begin
--found an opening brace
--since this is an opening brace we need to increment the level and strip off
--everything from the chunk before the brace so that we can search for additional braces
--we also note the position in the string for use in finding the end of the value later
--we track the previous position and add to it because we keep chopping off the beginning of
--the chunk as we parse through the string, and later we will need to reference the position
--relative to the entire serialized string.
set @value_string_position = @value_string_position + patindex('%{%', @chunk)-1
set @chunk = substring(@chunk, patindex('%{%', @chunk)+1, len(@chunk))
set @array_level = @array_level + 1
end else if @next_close > 0 begin
--found a closing brace
--print 'found close at level: ' + cast(@array_level as varchar(10)) + '(' + cast(patindex('%}%', @chunk) as varchar(10)) + ')'
--since this is a closing brace we need to decrement the level and strip off
--everything from the chunk before the brace so that we can search for additional braces
--we also note the position in the string for use in finding the end of the value later
--we track the previous position and add to it because we keep chopping off the beginning of
--the chunk as we parse through the string, and later we will need to reference the position
--relative to the entire serialized string.
set @value_string_position = @value_string_position + patindex('%}%', @chunk)+1
set @chunk = substring(@chunk, patindex('%}%', @chunk)+1, len(@chunk))
set @array_level = @array_level - 1
end else break
--once we get back to level 0 we know we've found the end of this array element
--so break the continuous loop now that we have the ending position
if @array_level <= 0 break
end
--set the ending position of the element and the value since the value is the last part of the element
set @element_end = @value_string_position
set @value_end = @element_end
--an array value is surrounded by braces so remove the braces in start and end of the substring value
--we set the substring starting position +1 just past the start of the opening brace and then
--set the length of the extracted string -2 to account for both the opening brace and
--closing brace to be removed from the extracted string.
set @value_string = substring(@phpSerialized, @value_start+1, @value_end-@value_start-2)
set @chunk = substring(@phpSerialized, @element_start, @element_end-@element_start)
-- if the array is empty just set it to null so that
-- we don't try and parse the contents of the array value later.
if @value_string = '' set @value_string = null
end
--we populate the results table differently depending on the element that is being parsed.
--Any element contained in an array has a key element and a value element. Though we parse
--all key elements in an array all we do with them in the results table is set their value
--as the var_name (key) for the element, and store the key_element_id for setting the value
--in the next pass since key/value pairs are listed sequentially in a serialized string.
if @parent_id > 0 and @key_element_id > 0 begin
--parent_id > 0 indicates this is a sub element inside an array
--and the array contents currently being parsed contain key elements and value
--elements sequentially in the serialized string. Because key_element_id is not 0
--we know this must be the value part of the element contained in the array as the
--key was just previously defined.
--update the existing keyed element with its type, length and value
update @results
set var_type = @var_type,
var_length = @var_length,
value_string = @value_string,
value_int = @value_int,
value_decimal = @value_decimal
where element_id = @key_element_id
set @key_element_id = null
end else if @parent_id > 0 begin
--this element is a part of the array which contains key/value pairs and since
--the @key_element_id is 0 or not defined we can assume this is the key of the pair
--determine what the key element type is and cast it as a string to the var_name
if @var_type = 'i' set @var_name = cast(@value_int as varchar(50))
if @var_type = 'd' set @var_name = cast(@value_decimal as varchar(50))
if @var_type = 's' set @var_name = @value_string
--here we are assuming that all keys should be of type s, d, or i... so if we get anything
--other than that throw an error since this code is not setup to handle that.
if @var_type not like '[sid]' begin
/*
print @var_type
RAISERROR (N'Error parsing php serialized string. Variable type found that should be a key but is not of type s, d, or i.', -- Message text.
16, -- Severity,
1 -- State
)
*/
--apparently errors can't be raised within a function so skip the element
break
end
--insert a new record into the results table defining the parent_id and var_name
insert @results
(
parent_id,
var_name
)
output inserted.element_id into @insertedIds
values (
@parent_id,
@var_name
)
--since we stored the identity value in the output set that value to @inserted_element_id
select top 1 @inserted_element_id = element_id from @insertedIds
delete from @insertedIds
--set the key_element_id so that the next pass catches the value and assigns it to this result record
set @key_element_id = @inserted_element_id
end else begin
--this will be executed for parsed strings that are not part of an array
--in which case the elements do not contain value_name keys
--insert the entire element details into the results table
insert into @results
(
parent_id,
var_type,
var_length,
value_string,
value_int,
value_decimal
)
output inserted.element_id into @insertedIds
values (
@parent_id,
@var_type,
@var_length,
@value_string,
@value_int,
@value_decimal
)
--here we capture the identity value for the inserted record
select top 1 @inserted_element_id = element_id from @insertedIds
delete from @insertedIds
end
-- if the current php serialized string was an array then it would have
-- been parsed and elements added to the results, so we should remove
-- the value_string from the array element in the results table so that
-- it doesn't get parsed again in the loop.
if @element_id is not null begin
--the only strings that get parsed where the @element_id has
--a value is from an array element
update @results
set value_string = null
where element_id = @element_id
end
--since we have parsed this element from the serialized string chop off this element
--from the string and run the rest of it through the loop again to ensure all
--elements have been parsed from the supplied serialized string.
set @phpSerialized = substring(@phpSerialized, @element_end, len(@phpSerialized))
end
return
end
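The part of the function that finds where an array ends works by counting brace depth: each `{` raises the level, each `}` lowers it, and the array is complete when the level returns to zero. A minimal Python sketch of that idea is shown below. The name `find_matching_brace` is made up for illustration, and it shares a limitation with any purely brace-counting approach: a `{` or `}` inside a string value would be miscounted.

```python
def find_matching_brace(text: str, open_pos: int) -> int:
    """Return the index of the '}' that closes the '{' at open_pos."""
    if text[open_pos] != "{":
        raise ValueError("open_pos must point at '{'")
    depth = 0
    for index in range(open_pos, len(text)):
        if text[index] == "{":
            depth += 1                # entering a nested array
        elif text[index] == "}":
            depth -= 1                # leaving an array
            if depth == 0:
                return index          # back at the starting level
    raise ValueError("unbalanced braces")


sample = 'a:2:{i:0;a:1:{s:1:"c";s:3:"abc";}i:1;a:0:{}}'

outer_open = sample.index("{")
outer_close = find_matching_brace(sample, outer_open)
inner_open = sample.index("{", outer_open + 1)
inner_close = find_matching_brace(sample, inner_open)

print(sample[outer_open + 1:outer_close])   # contents of the outer array
print(sample[inner_open + 1:inner_close])   # contents of the first inner array
```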
A simpler alternative, if you only need a single string value looked up by its key name:
CREATE FUNCTION extractValueFromSerializedPhpString(
@attribute_name nvarchar(100), @serialized_value nvarchar(max)
) RETURNS nvarchar(1024)
BEGIN
DECLARE @attribute_value_length int, @attribute_value_length_string nvarchar(5)
-- the key name is followed by ";s: (4 characters); take the next 3 characters and strip ':' and '"' to get the declared length
SELECT @attribute_value_length_string = REPLACE(REPLACE(SUBSTRING(@serialized_value,CHARINDEX(@attribute_name,@serialized_value)+LEN(@attribute_name)+4,3),':',''),'"','')
IF ISNUMERIC(@attribute_value_length_string) = 1 BEGIN
SELECT @attribute_value_length = CAST(@attribute_value_length_string AS int)
-- skip ";s:<length>:" (7, 8 or 9 characters, depending on the number of digits in the length) and return <length> characters
RETURN SUBSTRING(@serialized_value,CHARINDEX(@attribute_name,@serialized_value)+LEN(@attribute_name)+CASE WHEN @attribute_value_length > 99 THEN 9 WHEN @attribute_value_length > 9 THEN 8 ELSE 7 END,@attribute_value_length)
END
RETURN NULL
END