Handling a fixed-width source file with C#
Tags: c#, sql, ssis
Question
Current data:
........Column 1....Column 2.......Column3....Column 4
Row1...........0...........0.............0...........Y
Row2.......3142.56...........500............0...........N
Row3.......3142.56...........500............0...........N
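The source file has fixed-width columns.
The program that exports the fixed-width columns does not count the digits after the decimal point as part of the reserved fixed width.
- Row 1 is normal output and works fine.
- Rows 2 and 3 have two decimal places, so columns 2, 3, 4, ... are all pushed over by two positions.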
// Resolve Decimal Issues
foreach (object Column in splitLine)
{
String CurrentColumn = Column.ToString();
if (Regex.Match(CurrentColumn, @"^[0-9]+(\.[0-9]+)?$").Success == true)
{
// Count how many numbers AFTER a decimal
int decimalLength = CurrentColumn.Substring(CurrentColumn.IndexOf(".")).Length;
if (decimalLength >= 1)
{
// Remove this amount of places from the start of the string
CurrentColumn = CurrentColumn.Substring(CurrentColumn.Length - decimalLength);
}
}
//Start re-joining the string
newLine = newLine + CurrentColumn + "\t";
}
The problem is that IndexOf returns -1 when no match is found, which causes an error.
Error stack:
Error: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
---> System.ArgumentOutOfRangeException: StartIndex cannot be less than zero.
Parameter name: startIndex
at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy)
at ST_dd38f3d289db4495bf07257723356ed3.csproj.ScriptMain.Main()
--- End of inner exception stack trace ---
at System.RuntimeMethodHandle._InvokeMethodFast(Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner)
at System.RuntimeMethodHandle.InvokeMethodFast(Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
at System.RuntimeType.InvokeMember(String name, BindingFlags bindingFlags, Binder binder, Object target, Object[] providedArgs, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParams)
at System.Type.InvokeMember(String name, BindingFlags invokeAttr, Binder binder, Object target, Object[] args, CultureInfo culture)
at Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTATaskScriptingEngine.ExecuteScript()
So I'm a bit confused about what I can do to fix this. I think I'm on the right path, but this last error has me stumped.

Try this:
// Resolve Decimal Issues
foreach (object Column in splitLine)
{
String CurrentColumn = Column.ToString();
char[] s = {'.'};
if (Regex.Match(CurrentColumn, @"^[0-9]+(\.[0-9]+)?$").Success && CurrentColumn.Contains("."))
{
// Count how many digits come AFTER the decimal point
int decimalLength = CurrentColumn.Split(s, StringSplitOptions.None)[1].Length;
if (decimalLength >= 1)
{
// Remove this amount of places from the start of the string
CurrentColumn = CurrentColumn.Substring(CurrentColumn.Length - decimalLength);
}
}
//Start re-joining the string
newLine = newLine + CurrentColumn + "\t";
}
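Here is a short, dense, and limited approach. There is no need to search for anything: just split, trim, pad, and rebuild. This actually (I just noticed) works for any text file that is supposed to become fixed width.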
// "inputData" is assumed to contain the whole source file
const int desiredFixedWidth = 12; // How wide do you want your columns ?
const char paddingChar = ' '; // What char do you want to pad your columns with?
// Step 1: Split the lines
var srcLines = inputData.Split(new string[]{Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries);
// Step 2: Split up each line, ditch extra chars, pad the values, rebuild the file
var outLines = srcLines.Select(s =>
string.Join(paddingChar.ToString(),
s.Split(new string[] { paddingChar.ToString() }, StringSplitOptions.RemoveEmptyEntries)
.Select(l => l.PadLeft(desiredFixedWidth, paddingChar))));
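On the other hand, the "generator" that produces the broken file really needs to be fixed so that it honors the widths you want...

I think your logic is flawed. Given bbbb123.45 (where b is a space), your logic gives decimalLength a value of 3. CurrentColumn.Substring(CurrentColumn.Length - decimalLength) then returns .45. What you really need is CurrentColumn.Substring(decimalLength), which starts at index 3 and returns b123.45.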
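To make the arithmetic concrete, here is a small sketch that compares the two Substring calls on the example value. The class SubstringComparison and its method names are made up for illustration, and the sketch assumes the column string still carries its leading space padding.

```csharp
using System;

public static class SubstringComparison
{
    // Length of the decimal portion, counting the '.' itself; 0 when there is none.
    public static int DecimalLength(string column)
    {
        int pos = column.IndexOf('.');
        return pos == -1 ? 0 : column.Length - pos;
    }

    // What the question's code does: keeps only the tail of the string.
    public static string KeepTail(string column)
    {
        return column.Substring(column.Length - DecimalLength(column));
    }

    // What this answer suggests: drops that many characters from the front.
    public static string DropFront(string column)
    {
        return column.Substring(DecimalLength(column));
    }
}
```

For the value "    123.45" (four leading spaces), DecimalLength gives 3, KeepTail gives ".45", and DropFront gives " 123.45".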
The overall approach stays much the same as yours:
// Resolve Decimal Issues
foreach (object Column in splitLine)
{
String CurrentColumn = Column.ToString();
if (Regex.IsMatch(CurrentColumn, @"^[0-9]+(\.[0-9]+)?$"))
{
// If there's a decimal point, remove characters from the front
// of the string to compensate for the decimal portion.
int decimalPos = CurrentColumn.IndexOf(".");
if (decimalPos != -1)
{
CurrentColumn = CurrentColumn.Substring(CurrentColumn.Length - decimalPos);
}
}
//Start re-joining the string
newLine = newLine + CurrentColumn + "\t";
}
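As a hedged variation on the loop above, the compensation step could be isolated in a helper that removes only leading spaces, so it cannot cut into the digits when the padding is shorter than the decimal portion. ColumnFixer and CompensateDecimal are hypothetical names, and the sketch assumes each column string still includes its leading space padding.

```csharp
using System;

public static class ColumnFixer
{
    // Hypothetical helper: shifts a column back by the width of its decimal portion.
    public static string CompensateDecimal(string column)
    {
        int decimalPos = column.IndexOf('.');
        if (decimalPos == -1)
        {
            return column; // no decimal point, so the column was not widened
        }

        // Length of the decimal portion, counting the '.' (as in the loop above).
        int decimalLength = column.Length - decimalPos;

        // Number of space characters at the front of the column.
        int leadingSpaces = column.Length - column.TrimStart(' ').Length;

        // Never remove more than the padding that is actually there.
        int toRemove = Math.Min(decimalLength, leadingSpaces);
        return column.Substring(toRemove);
    }
}
```

Whether the '.' itself should be counted depends on how the exporter reserves its widths, so the decimalLength calculation may need adjusting by one for a given file.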
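By the way, the Substring call in the loop above fails badly if the decimal portion is longer than the number of spaces at the front of the string. From your description, I don't think that is a problem here, but keep it in mind.

Well, when there is no decimal point in your column, how many digits are there after the decimal point? So add code that calls IndexOf and bails out if the result is -1. For example: int pos = CurrentColumn.IndexOf("."); if (pos != -1) { /* do the rest */ }, and then use pos when calculating decimalLength.

That would allow strings like "123.abc", whereas the OP's code would not. I assumed these strings were guaranteed to be numeric. Code modified.

Thanks Jim! After a day off and with a clear head, your answer got me to a solution. My regex wasn't right, and your code is cleaner. I think I need to learn more about regular expressions. Yes, this is not an ideal solution, but it is a quick win while we get the system problem fixed (which sounds like it will take them longer to resolve).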