SAS: Breaking up multiple left joins into multiple steps in Proc SQL

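I was given some code that left joins a lot of tables. When I run it, it takes more than an hour and finally fails with a sort execution failure error. So I am thinking of breaking the left joins up into multiple steps, but I don't know how to do that and need your help.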

The code is as follows:

Proc sql;
create table newlib.Final_test as 
SELECT 
POpener.Name as Client,
Popener.PartyId as Account_Number,
Case
  When BalLoc.ConvertedRefNo NE '' then BalLoc.ConvertedRefNo
else BalLoc.Ourreferencenum
End as LC_Number,
BalLoc.OurReferenceNum ,
BalLoc.CnvLiabilityCode as Liability_Code,
POfficer.PartyID as Officer_Num,
POfficer.Name as Officer_Name,
POpener.ExpenseCode,
BalLoc.IssueDate as Issue_Date format=mmddyy10.,
BalLoc.ExpirationDate AS Expiry format=mmddyy10.,
BalLoc.LiabilityAmountBase as Total_LC_Balance,
Case
  When BalLoc.Syndicated = 0 Then BalLoc.LiabilityAmountBase
    else 0
End as SunTrust_Non_Syndicated_Exposure,
Case 
  When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey NE 0 Then    
BalLoc.LiabilityAmountBase
    else 0
  End as SunTrust_Syndicated_Exposure,
Case 
  When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey NE 0 Then   
(BalLoc.LiabilityAmountBase - (BalLoc.LiabilityAmountBase *   
(PParty.ParticipationPercent/100)))
  Else BalLoc.LiabilityAmountBase 
End as SunTrust_Exposure,
Case
  When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey <> 0 Then   
(BalLoc.LiabilityAmountBase  * PParty.ParticipationPercent/100)
  Else 0
End as Exposure_Held_By_Other_Banks,
PBene.Name as Beneficiary_Trustee,
cat(put(input(POpener.ObligorNumber,best10.),z10.),
    put(input(BalLoc.CommitmentNumber,best10.),Z10.)) as Key,
case
when BalLoc.BeneCusip2 NE ' ' then catx 
('|',Balloc.BeneCusip,Balloc.BeneCusip2)
else BalLoc.BeneCusip
End as Cusip,
Case 
  when balLoc.OKtoExpire = 1 then '0' 
  when balLOc.OKtoExpire=0 and BalLoc.AutoExtTermDays NE 0 then put  
(Balloc.AutoExtTermDays,z3.)
  when balLoc.OKtoExpire=0 and BalLoc.AutoExtTermsMonth NE 0 then put  
(balloc.AutoExtTermsMonth,z3.)
  else '000'
End as Evergreen,
Case 
when blf.AnnualRate NE 0 then put(blf.AnnualRate,z7.)
when blf.Amount NE 0 then cats('F',put(blf.amount,z7.))
else 'WAIVE'
End as Pricing

FROM BalLocPrimary BalLoc
Left JOIN Party POpener on POpener.Pkey = BalLoc.OpenerPkey
Left join PartGroup PGroup on BallOC.PartOutGroupPkey = PGroup.pKey
Left join PartParties PParty ON PGroup.pKey = PParty.PartGroupPkey and   
PParty.ParticipationPercent > 0 and
PParty.combined in
(select PPartParties.All_combined  
from PPartParties /*group by PartGroupPkey, PartyPkey*/)

Left Join MemExpenseCodes ExpCodes on POpener.ExpenseCode = ExpCodes.Code
Left JOIN Party PBene on PBene.Pkey = BalLoc.BenePkey
Left join Party POfficer on POfficer.Pkey = BalLoc.AccountOfficerPkey 
left join maxfee on maxfee.LocPrimaryPkey = BalLoc.LocPrimaryPkey
left join BalLocFee BLF on BLF.Pkey = maxfee.pkey
Where BalLoc.LetterType not in ('STBA','EXPA', 'FEE',' ') and 
 BalLoc.LiabilityAmountBase > 0 and BalLoc.irdb = 1
;
quit;
Thanks,
Shankar

I have a few suggestions:

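1. For every dataset you reference, keep only the variables that are needed for the join or used in the SELECT statement. For example, from your Party dataset it looks like you need only the Pkey and Name fields. So when you join that dataset, you should use: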

Left JOIN Party(keep=Pkey Name) PBene on PBene.Pkey = BalLoc.BenePkey
2. Push the WHERE clause into the FROM clause as a dataset option, like this:

FROM BalLocPrimary(where=(LetterType not in ('STBA','EXPA', 'FEE',' ') and 
 LiabilityAmountBase > 0 and irdb = 1)) BalLoc
Also make sure the conditions are ordered from most restrictive to least restrictive, barring any indexes on these 3 fields.

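3. You are driving off the BalLocPrimary dataset and left joining every other dataset to it. Is that really your intent? Is it acceptable for your result set to come back with rows that have no client or account number? Left joins can be computationally very expensive, so the more of them you can eliminate, the better.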

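4. Joe asked about indexes on the join fields. You should have some. I refer to an article on this often enough that I have bookmarked it. Similarly, you can look at the explain plan for your query to see where it might be hitting a bottleneck. That would be a good place to start.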


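5. You are right that this can probably be broken into multiple steps. That is a good instinct. However, the best places to break it depend heavily on the underlying data, the indexes, and the join paths, so it is hard to say from this side of the screen. I think the second article I linked can give you some good optimization advice for your specific case.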
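
To make point 5 a little more concrete, here is a minimal sketch of one way to stage the joins. It is not a tested replacement for the original query: the intermediate table names (work.balloc_f, work.step_party, work.part, work.step_part) are made up for illustration, only a subset of the SELECT columns is carried through, and step 3 assumes PartGroup.pKey is unique, so that joining PartGroup to PartParties first returns the same rows as the original chain of left joins. The maxfee and BalLocFee lookups would follow the same pattern in a further step.

```sas
proc sql;
  /* Step 1: filter the driving table once and keep only the columns used later. */
  create table work.balloc_f as
  select LocPrimaryPkey, OpenerPkey, BenePkey, AccountOfficerPkey,
         PartOutGroupPkey, OurReferenceNum, LiabilityAmountBase, Syndicated
  from BalLocPrimary
  where LetterType not in ('STBA','EXPA','FEE',' ')
    and LiabilityAmountBase > 0
    and irdb = 1;

  /* Step 2: attach the three Party lookups to the reduced table. */
  create table work.step_party as
  select b.*,
         po.Name    as Client,
         po.PartyId as Account_Number,
         po.ExpenseCode,
         pb.Name    as Beneficiary_Trustee,
         pf.PartyId as Officer_Num,
         pf.Name    as Officer_Name
  from work.balloc_f b
  left join Party(keep=Pkey Name PartyId ExpenseCode) po on po.Pkey = b.OpenerPkey
  left join Party(keep=Pkey Name)                     pb on pb.Pkey = b.BenePkey
  left join Party(keep=Pkey Name PartyId)             pf on pf.Pkey = b.AccountOfficerPkey;

  /* Step 3: resolve the participation lookup on its own.
     Assumption: PartGroup.pKey is unique. */
  create table work.part as
  select pg.pKey as PartGroupPkey, pp.ParticipationPercent
  from PartGroup pg
  inner join PartParties pp on pg.pKey = pp.PartGroupPkey
  where pp.ParticipationPercent > 0
    and pp.combined in (select All_combined from PPartParties);

  /* Step 4: attach the participation percentage to the staged result. */
  create table work.step_part as
  select s.*, p.ParticipationPercent
  from work.step_party s
  left join work.part p on s.PartOutGroupPkey = p.PartGroupPkey;
quit;
```

Checking the row count of each intermediate table against what you expect is a cheap way to find the step where the result grows unexpectedly, which is often where the sort space runs out.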

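It is hard to say how to improve this without some statistics. How big are the tables? Are they indexed on the join keys? Where is the SELECT part of the query?

@Joe: I have just added the whole code, including the SELECT statement. The tables have between 75,000 and 650,000 rows and between 10 and 40 columns.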
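
On the question of indexes on the join keys, a minimal sketch of how one might add simple indexes and then check which join method PROC SQL picks. It assumes the tables are in the WORK library and that you are allowed to modify them; the index choices and the table name work.check_plan are only illustrative.

```sas
/* Simple indexes on lookup-side join keys.
   In PROC SQL a simple index must have the same name as its column. */
proc sql;
  create index Pkey on work.Party(Pkey);
  create index PartGroupPkey on work.PartParties(PartGroupPkey);
quit;

/* The _method option writes the chosen join methods to the log,
   for example sqxjm (merge join), sqxjhsh (hash join) or sqxjndx (index join). */
proc sql _method;
  create table work.check_plan as
  select b.LocPrimaryPkey, p.Name
  from work.BalLocPrimary b
  left join work.Party p on p.Pkey = b.OpenerPkey;
quit;
```

If the log shows merge joins on the large tables, each of those joins needs a sort, which is consistent with the sort execution failure described in the question.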