U-SQL compile error: equijoin has different types


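I am trying to create a U-SQL job. I have defined the columns to extract from my CSV files, but the job keeps failing at the join because the columns I am matching on are reported as having different types. This is strange, because I declared them with the same type. See the screenshot showing where the problem occurs: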

Here is the full U-SQL:

@guestCheck = 
    EXTRACT GuestCheckID int,
            POSCheckGUID Guid,
            POSCheckNumber int?,
            OwnerEmployeeID int,
            CreatedDateTime DateTime?,
            ClosedDateTime DateTime?,
            TicketReference string,
            CheckAmount decimal?,
            POSTerminalID int,
            CheckState string,
            LocationID int?,
            TableID int?,
            Covers int?,
            PostedDateTime DateTime?,
            OrderChannelID int?,
            MealPeriodID int?,
            RVCLocationID int?,
            ReopenedTerminalID int?,
            ReopenedEmployeeID int?,
            ReopenedDateTime DateTime?,
            ClosedBusDate int?,
            PostedBusDate int?,
            BusHour byte?,
            TaxExempt bool?,
            TaxExemptReference string
    FROM "/GuestCheck/GuestCheck-incomplete.csv"
    USING Extractors.Csv();

@guestCheckAncillaryAmount =
    EXTRACT CheckAncillaryAmountID int,
            GuestCheckID int,
            GuestCheckItemID int?,
            AncillaryAmountTypeID int,
            Amount decimal,
            FirstDetail int?,
            LastDetail int?,
            IsReturn bool?,
            ReturnReasonID int?,
            AncillaryReasonID int?,
            AncillaryNote string,
            ClosedBusDate int?,
            PostedBusDate int?,
            BusHour byte?,
            LocationID int?,
            RVCLocationID int?,
            IsDelisted bool?,
            Exempted bool?
    FROM "/GuestCheck/GuestCheckAncillaryAmount.csv"
    USING Extractors.Csv();

@ancillaryAmountType = 
    EXTRACT AncillaryAmountTypeID int,
            AncillaryAmountCategoryID int,
            CustomerID int,
            CheckTitle string,
            ReportTitle string,
            Percentage decimal,
            FixedAmount decimal,
            IncludeOnCheck bool,
            AutoCalculate bool,
            StoreAtCheckLevel bool?,
            DateTimeModified DateTime?,
            CheckTitleToken Guid?,
            ReportTitleToken Guid?,
            DeletedFlag bool,
            MaxUsageQty int?,
            ApplyToBasePriceOnly bool?,
            Exclusive bool,
            IsItem bool,
            MinValue decimal,
            MaxValue decimal,
            ItemGroupID int?,
            LocationID int,
            ApplicationOrder int?,
            RequiresReason bool,
            Exemptable bool?
    FROM "/GuestCheck/AncillaryAmountType.csv"
    USING Extractors.Csv();

@read =
    SELECT t.POSCheckGUID,
           t.POSCheckNumber,
           t.CheckAmount,
           aat.AncillaryAmountTypeID,
           aat.CheckTitle,
           gcd.Amount
    FROM @guestCheck AS t         
         LEFT JOIN
             @guestCheckAncillaryAmount AS gcd
         ON t.GuestCheckID == gcd.GuestCheckID
         LEFT JOIN
             @ancillaryAmountType AS aat
         ON gcd.AncillaryAmountTypeID == aat.AncillaryAmountTypeID
    WHERE aat.AncillaryAmountCategoryID IN(2, 4, 8);

OUTPUT @read
TO "/GuestCheckOutput/output.csv"
USING Outputters.Csv();

Actually, U-SQL is strongly typed, and int and int? are different types. You need to cast in an intermediate rowset:

@ancillaryAmountType2 =
SELECT (int?) aat.AncillaryAmountTypeID AS AncillaryAmountTypeID,
       aat.AncillaryAmountCategoryID,
       aat.CheckTitle
FROM @ancillaryAmountType AS aat;

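Or, better still, follow dimensional modeling best practices and avoid nullable "dimensions", for the reasons described in the source linked from the original answer.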

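This has nothing to do with the nullability of the columns as specified in the EXTRACT or table definitions. As the OP's code shows, neither join column is declared nullable (i.e. with ?) in the EXTRACT definitions. It has to do with multiple outer joins and so-called null-supplying tables.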

Think about it logically. Say you have three tables: TableA with three records, TableB with two records, and TableC with one record, something like this:

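If you start with tableA and LEFT OUTER JOIN it to tableB, you instinctively know you will get three records, but column x from tableB will be null for the record that has no match. tableB is your null-supplying table, and that is where the nullability comes from.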
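To make the null-supplying table concrete, here is a minimal sketch. The rowsets @a and @b, their values, and the output path are assumptions made up for illustration; only the shape of the join follows the description above.

```usql
// Hypothetical stand-ins for tableA (three rows) and tableB (two rows).
@a = SELECT * FROM (VALUES (1), (2), (3)) AS T(x);
@b = SELECT * FROM (VALUES (1), (2)) AS T(x);

// Every row of @a survives the LEFT OUTER JOIN. The row with a.x == 3 has no
// partner in @b, so b.x is null there, and bx is typed int? rather than int.
@ab =
    SELECT a.x AS ax,
           b.x AS bx
    FROM @a AS a
         LEFT OUTER JOIN
             @b AS b
         ON a.x == b.x;

// Hypothetical output path.
OUTPUT @ab
TO "/output/ab.csv"
USING Outputters.Csv();
```

Joining a third rowset on b.x then compares an int? with whatever type that rowset declares for its key, which is where the equijoin type mismatch appears.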

Thankfully, the fix is the same: change the nullability of the column earlier on, or supply a substitute value, e.g. -1.

@t3 =
    SELECT (int?) x AS x, 2 AS a
    FROM dbo.tmpC;

// OR

// Use conditional operator to supply substitute values
@t3 =
    SELECT x == null ? -1 : x AS x, 2 AS a
    FROM dbo.tmpC;
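Your particular query has another problem, though. In most relational databases, adding a WHERE clause on the table on the right-hand side of a left outer join converts the join to an inner join, and this is the same in U-SQL. You probably need to think about the actual result you are trying to get, and then consider rewriting your query.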
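One possible rewrite, assuming the goal is to keep every guest check and only attach ancillary amount types from categories 2, 4 and 8, is to filter before joining instead of in a WHERE clause after the join. The rowset name @ancillaryAmountTypeFiltered is made up for this sketch. It relies on the three EXTRACT statements from the question and applies the int? cast in the same step.

```usql
// Filter the right-hand table first, so the outer join is not narrowed afterwards.
// The cast makes the key int?, matching gcd.AncillaryAmountTypeID, which becomes
// nullable once @guestCheckAncillaryAmount has been left-joined.
@ancillaryAmountTypeFiltered =
    SELECT (int?) aat.AncillaryAmountTypeID AS AncillaryAmountTypeID,
           aat.CheckTitle
    FROM @ancillaryAmountType AS aat
    WHERE aat.AncillaryAmountCategoryID IN (2, 4, 8);

@read =
    SELECT t.POSCheckGUID,
           t.POSCheckNumber,
           t.CheckAmount,
           aat.AncillaryAmountTypeID,
           aat.CheckTitle,
           gcd.Amount
    FROM @guestCheck AS t
         LEFT JOIN
             @guestCheckAncillaryAmount AS gcd
         ON t.GuestCheckID == gcd.GuestCheckID
         LEFT JOIN
             @ancillaryAmountTypeFiltered AS aat
         ON gcd.AncillaryAmountTypeID == aat.AncillaryAmountTypeID;

OUTPUT @read
TO "/GuestCheckOutput/output.csv"
USING Outputters.Csv();
```

With this shape, guest checks without a matching ancillary amount type still appear, with null in the aat columns. If those rows should be dropped after all, declaring the second join as INNER JOIN states that intent more directly than a WHERE clause on a left join; the cast is still needed in that case.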


HTH

Thanks for the answer, Alexandre. However, AncillaryAmountTypeID is defined as int in both guestCheckAncillaryAmount and ancillaryAmountType. So why does it say one of them is int? when I never declared it that way?