Algorithm GDI+；_Algorithm_Graphics_Gdi+_Dropshadow

Algorithm GDI+；

algorithm graphics

Algorithm GDI+；,algorithm,graphics,gdi+,dropshadow,Algorithm,Graphics,Gdi+,Dropshadow,在GDI中，向图像添加阴影的有效方法是什么现在我从我的形象开始：我使用ImageAttributes和ColorMatrix将图像的alpha遮罩绘制到新图像： colorMatrix = ( ( 0, 0, 0, 0, 0), ( 0, 0, 0, 0, 0), ( 0, 0, 0, 0, 0), (-1, -1, -1, 1, 0), ( 1, 1, 1, 0, 1) ); 然后应用高斯模糊卷积核，并稍微偏移：然后

在GDI中，向图像添加阴影的有效方法是什么

现在我从我的形象开始：

我使用ImageAttributes和ColorMatrix将图像的alpha遮罩绘制到新图像：

colorMatrix = (
    ( 0,  0,  0, 0, 0),
    ( 0,  0,  0, 0, 0),
    ( 0,  0,  0, 0, 0),
    (-1, -1, -1, 1, 0),
    ( 1,  1,  1, 0, 1)
    );

然后应用高斯模糊卷积核，并稍微偏移：

然后我把我的原始图像画回到上面：

问题是它太慢了，生成带有阴影的图像需要大约170ms，而没有阴影的图像需要2ms（慢70x）：

带阴影：
```
171332µs
```
无阴影：
```
2457us
```

当用户（例如me）在项目列表中滚动时，额外的169ms延迟非常明显

您可以忽略下面的代码，它不会对问题或答案添加任何内容：

class function TImageEffects.GenerateDropShadow(image: TGPImage;
        const radius: Single; const OffsetX, OffsetY: Single; const Opacity: Single): TGPBitmap;
var
    width, height: Integer;
    alphaMask: TGPBitmap;
    shadow: TGPBitmap;
    graphics: TGPGraphics;
    imageAttributes: TGPImageAttributes;
    cm: TColorMatrix;
begin
{
    We generate a drop shadow by first getting the alpha mask. This will be a black
    sillouette on a transparent background. We then blur the black "shadow" by the amounts
    given.
    We then draw the original image on top of it's own shadow.
}

{
    http://msdn.microsoft.com/en-us/library/aa511280.aspx
    Windows Vista User Experience -> Guidelines -> Aesthetics -> Icons

    Basic Flat Icon Shadow Ranges

    Flat icons
    Flat icons are generally used for file icons and flat real-world objects,
    such as a document or a piece of paper.

    Flat icon lighting comes from the upper-left at 130 degrees.

    Smaller icons (for example, 16x16 and 32x32) are simplified for readability.
    However, if they contain a reflection within the icon (often simplified),
    they may have a tight drop shadow. The drop shadow ranges in opacity from
    30-50 percent.
    Layer effects can be used for flat icons, but should be compared with other
    flat icons. The shadows for objects will vary somewhat, according to what
    looks best and is most consistent within the size set and with the other
    icons in Windows Vista. On some occasions, it may even be necessary to
    modify the shadows. This will especially be true when objects are laid over
    others.
    A subtle range of colors may be used to achieve desired outcome. Shadows help
    objects sit in space. Color impacts the perceived weight of the shadow, and
    may distort the image if it is too heavy.

    Blend mode: Multiply
    Opacity: 22% to 50% - depends on color of the item.
    Angle: 130 to 120, use global light
    Distance: 3 (256 thru 48x), Distance = 1 (32x, 24x)
    Spread: 0
    Size: 7 (256x thru 48x), Spread = 2 (32x, 24x)
}
    width := image.GetWidth;
    height := image.GetHeight;

    //Get bitmap to hold final composited image and shadow
    Result := TGPBitmap.Create(width, height, PixelFormat32bppARGB);

    //Use ColorMatrix methods to "draw" the alpha image.
    alphaMask := TImageEffects.GetAlphaMask(image);
    try
        //Blur the black and white shadow image
//      shadow := TImageEffects.BoxBlur(alphaMask, radius);
        shadow := TImageEffects.GaussianBlur(alphaMask, radius); //because Gaussian Blur is linearly-separable into two 1d kernels, it's actually faster than the box blur
    finally
        alphaMask.Free;
    end;

    //Draw
    graphics := TGPGraphics.Create(Result);
    try
        //Draw the "shadow", using the passed in opacity value.
        {
            Color transformations are of the form
        c =  (r, g, b, a)
        c' = (r, g, b, a)
        c' = c*M
            = (r, g, b, a, 1) * (0 0 0 0 0)  //r
                                      (0 0 0 0 0)  //g
                                      (0 0 0 0 0)  //b
                                      (1 1 1 1 0)  //a
                                      (0 0 0 0 1)  //1
        }

        imageAttributes := TGPImageAttributes.Create;
    {   cm := (
                ( 1, 0, 0, 0,   0),
                ( 0, 1, 0, 0,   0),
                ( 0, 0, 1, 0,   0),
                ( 0, 0, 0, 0.5, 0),
                ( 0, 0, 0, 0,   1)
            );}
        cm[0, 0] :=  1; cm[0, 1] :=  0; cm[0, 2] :=  0; cm[0, 3] := 0;       cm[0, 4] := 0;
        cm[1, 0] :=  0; cm[1, 1] :=  1; cm[1, 2] :=  0; cm[1, 3] := 0;       cm[1, 4] := 0;
        cm[2, 0] :=  0; cm[2, 1] :=  0; cm[2, 2] :=  1; cm[2, 3] := 0;       cm[2, 4] := 0;
        cm[3, 0] :=  0; cm[3, 1] :=  0; cm[3, 2] :=  0; cm[3, 3] := Opacity; cm[3, 4] := 0;
        cm[4, 0] :=  0; cm[4, 1] :=  0; cm[4, 2] :=  0; cm[4, 3] := 0;       cm[4, 4] := 1;


        imageAttributes.SetColorMatrix(
                cm,
                ColorMatrixFlagsDefault,
                ColorAdjustTypeBitmap);
        try
            graphics.DrawImage(shadow,
                        MakeRectF(OffsetX, OffsetY, width, height), //destination rectangle
                        0, 0, //source (x,y)
                        width, height, //source width, height
                        UnitPixel,
                        ImageAttributes);

            //Draw original image over-top of it's shadow
            graphics.DrawImage(image, 0, 0);
        finally
            imageAttributes.Free;
        end;
    finally
        graphics.Free;
    end;
end;

它使用函数获取灰度alpha遮罩：

class function TImageEffects.GetAlphaMask(image: TGPImage): TGPBitmap;
var
    imageAttributes: TGPImageAttributes;
    cm: TColorMatrix;
    graphics: TGPGraphics;
    Width, Height: UINT;
begin
    {
        Color transformations are of the form
    c =  (r, g, b, a)
    c' = (r, g, b, a)
    c' = c*M
        = (r, g, b, a, 1) * (0 0 0 0 0)
                            (0 0 0 0 0)
                            (0 0 0 0 0)
                            (1 1 1 1 0)
                            (0 0 0 0 1)
    }

    imageAttributes := TGPImageAttributes.Create;

{   cm := (
            ( 0,  0,  0, 0, 0),
            ( 0,  0,  0, 0, 0),
            ( 0,  0,  0, 0, 0),
            (-1, -1, -1, 1, 0),
            ( 1,  1,  1, 0, 1)
        );}
    cm[0, 0] :=  0; cm[0, 1] :=  0; cm[0, 2] :=  0; cm[0, 3] := 0; cm[0, 4] := 0;
    cm[1, 0] :=  0; cm[1, 1] :=  0; cm[1, 2] :=  0; cm[1, 3] := 0; cm[1, 4] := 0;
    cm[2, 0] :=  0; cm[2, 1] :=  0; cm[2, 2] :=  0; cm[2, 3] := 0; cm[2, 4] := 0;
    cm[3, 0] := -1; cm[3, 1] := -1; cm[3, 2] := -1; cm[3, 3] := 1; cm[3, 4] := 0;
    cm[4, 0] :=  1; cm[4, 1] :=  1; cm[4, 2] :=  1; cm[4, 3] := 0; cm[4, 4] := 1;


    imageAttributes.SetColorMatrix(
            cm,
            ColorMatrixFlagsDefault,
            ColorAdjustTypeBitmap);

    width := image.GetWidth;
    height := image.GetHeight;

    Result := TGPBitmap.Create(Integer(width), Integer(height));
    graphics := TGPGraphics.Create(Result);
   try
        graphics.DrawImage(
                image,
                MakeRect(0, 0, width, height), //destination rectangle
             0, 0, //source (x,y)
             width, height,
             UnitPixel,
                ImageAttributes);
   finally
        graphics.Free;
    end;
end;

核心是高斯模糊：

class function TImageEffects.GaussianBlur(const bitmap: TGPBitmap;
  radius: Single): TGPBitmap;
var
    width, height: Integer;
    tempBitmap: TGPBitmap;
    bdSource: TBitmapData;
    bdTemp: TBitmapData;
    bdDest: TBitmapData;
    pSrc: PARGBArray;
    pTemp: PARGBArray;
    pDest: PARGBArray;
    stride: Integer;
    kernel: TKernel;
begin
//  kernel := MakeGaussianKernel2d(radius);
    kernel := MakeGaussianKernel1d(radius);
    try
//      Result := ConvolveBitmap(bitmap, kernel); brute 2d kernel

        width := bitmap.GetWidth;
        height := bitmap.GetHeight;

        // GDI+ still lies to us - the return format is BGR, NOT RGB.
        bitmap.LockBits(MakeRect(0, 0, width, height),
                ImageLockModeRead,
                PixelFormat32bppPARGB, bdSource);

        //intermediate bitmap
        tempBitmap := TGPBitmap.Create(width, height, PixelFormat32bppPARGB);
        tempBitmap.LockBits(MakeRect(0, 0, width, height),
                    ImageLockModeWrite,
                    PixelFormat32bppPARGB, bdTemp);

        //target bitmap
        Result := TGPBitmap.Create(width, height, PixelFormat32bppARGB);
        Result.LockBits(MakeRect(0, 0, width, height),
                    ImageLockModeWrite,
                    PixelFormat32bppPARGB, bdDest);

        pSrc := PARGBArray(bdSource.Scan0);
        pTemp := PARGBArray(bdTemp.Scan0);
        pDest := PARGBArray(bdDest.Scan0);
        stride := bdSource.Stride;

        ConvolveAndTranspose(kernel, pSrc^, pTemp^, width, height, stride, True, EdgeActionClampEdges);
        ConvolveAndTranspose(kernel, pTemp^, pDest^, height, width, stride, True, EdgeActionClampEdges);

        //Unlock source
       bitmap.UnlockBits(bdSource);
        tempBitmap.UnlockBits(bdTemp);
        Result.UnlockBits(bdDest);

        //get rid of temp
        tempBitmap.Free;
    finally
        kernel.Free;
    end;
end;

它需要一个1-D内核：

class function TImageEffects.MakeGaussianKernel1d(radius: Single): TKernel;
var
    r: Integer;
    rows: Integer;
    matrix: TSingleDynArray;
    sigma: Single;
    sigma22: Single;
    sigmaPi2: Single;
    sqrtSigmaPi2: Single;
    radius2: Single;
    total: Single;
    index: Integer;
    row: Integer;
    distance: Single;
    i: Integer;
begin
    r := Ceil(radius);
    rows := r*2+1;

    SetLength(matrix, rows);
    sigma := radius/3.0;
    sigma22 := 2*sigma*sigma;
    sigmaPi2 := 2*pi*sigma;
    sqrtSigmaPi2 := Sqrt(sigmaPi2);
    radius2 := radius*radius;
    total := 0;

    Index := 0;
    for row := -r to r do
    begin
        distance := row*row;
        if (distance > radius2) then
            matrix[index] := 0
        else
        begin
            matrix[index] := Exp((-distance)/sigma22) / sqrtSigmaPi2;
            total := total + matrix[index];
            Inc(index);
        end;
    end;

    //Normalize the values
    for i := 0 to rows-1 do
        matrix[i] := matrix[i] / total;


    Result := TKernel.Create(rows, 1, matrix);
end;

高斯函数的神奇之处在于它可以分为两个一维卷积：

class procedure TImageEffects.convolveAndTranspose(kernel: TKernel;
  const inPixels: array of ARGB; var outPixels: array of ARGB; width,
  height, stride: Integer; alpha: Boolean; edgeAction: TEdgeAction);
var
    index: Integer;
    matrix: TSingleDynArray;
    rows: Integer; //number of rows in the kernel
    cols: Integer; //number of columns in the kernel
    rows2: Integer; //half row count
    cols2: Integer; //half column count

    x, y: Integer; //
    r, g, b, a: Single; //summed red, green, blue, alpha values
    row, col: Integer;
    ix, iy, ioffset: Integer;
    moffset: Integer;
    f: Single;
    rgb: ARGB;
    ir, ig, ib, ia: Integer;

   function ClampPixel(value: Single): Integer;
    begin
        Result := Trunc(value+0.5);
        if Result < 0 then
            Result := 0
        else if Result > 255 then
            Result := 255;
    end;
begin
    matrix := kernel.KernelData;
    cols := kernel.Width;
    cols2 := cols div 2;

    for y := 0 to height-1 do
    begin
        index := y;
        ioffset := y*width;
        for x := 0 to width-1 do
        begin
            r := 0;
            g := 0;
            b := 0;
            a := 0;

            moffset := cols2;
            for col := -cols2 to cols2 do
            begin
                f := matrix[moffset+col];

                if (f <> 0) then
                begin
                    ix := x+col;
                    if ( ix < 0 ) then
                    begin
                        if ( edgeAction = EdgeActionClampEdges ) then
                            ix := 0
                        else if ( edgeAction = EdgeActionWrapEdges ) then
                            ix := (x+width) mod width;
                    end
                    else if ( ix >= width) then
                    begin
                        if ( edgeAction = EdgeActionClampEdges ) then
                            ix := width-1
                        else if ( edgeAction = EdgeActionWrapEdges ) then
                            ix := (x+width) mod width;
                    end;
                    rgb := inPixels[ioffset+ix];
                    a := a + f * ((rgb shr 24) and $FF);
                    r := r + f * ((rgb shr 16) and $FF);
                    g := g + f * ((rgb shr  8) and $FF);
                    b := b + f * ((rgb       ) and $FF);
                end;
            end;
            if alpha then
                ia := ClampPixel(a)
         else
                ia := $FF;
            ir := ClampPixel(r);
            ig := ClampPixel(g);
            ib := ClampPixel(b);
            outPixels[index] := MakeARGB(ia, ir, ig, ib);

            Inc(index, height);
        end;
    end;
end;

分析表明，88.62%的时间花在以下方面：

a := a + f * ((rgb shr 24) and $FF);
r := r + f * ((rgb shr 16) and $FF);
g := g + f * ((rgb shr  8) and $FF);
b := b + f * ((rgb       ) and $FF);

这是每像素alpha混合

这让我觉得有一种更好的方法来应用软阴影，即应用模糊效果，毕竟Windows和OSX实时地将阴影应用于Windows。

我知道逐像素操作相当慢，但从来没有这样做过；70倍似乎比我想象的要多。可能是因为您使用的是托管语言，这是VM开销最大化的一种情况。您是否尝试过用本机代码制作程序的这一部分？此链接有一个本机实现，可用于快速测试：

不幸的是，它们唯一的区别是使用了一种可以生成本机代码的语言，但它们仍然使用双层循环来访问像素。例如，如果你可以假设运行应用程序的机器有这样的硬件，那么最好使用CUDA。但在这种情况下，您将不再使用GDI+。不管怎样，也许这个问题有帮助：

我之所以要代码，是想看看您是使用了“快速位图”方法还是

GetPixel（），SetPixel（）

方法

既然您已经介绍了这一点，我怀疑您是否能够在性能优化方面做得更多。GDI+并不是为这种每像素操作场景而设计的。实事求是，你应该考虑实现一个更简单的阴影生成器，它看起来不那么奇特，但不会像处理器密集型。这在很大程度上取决于您的使用场景（您没有真正描述）：

这些图片都相似吗（所有的票还是你把票当作样本使用）？如果是，则可以生成一次阴影并重用该位图
当用户在做其他事情时，您可以生成并缓存图像的阴影版本（或只是阴影缩略图），作为后台进程

您还可以在Paint.NET中尝试高斯模糊（它对大多数内容使用GDI+），并在那里测量它的速度。我怀疑您是否能够使它比Paint.NET更快，因此它是一个很好的基准。

该算法来自以下博客条目：。当然，它正在实现最新也是最快的版本。：-）

它直接从我的工作代码中复制，因此外部假设可能反映了这一点。该函数返回一个更大的位图以适应大小的增加。当然，在代码中，您需要相应地处理这个问题。它采用32位alpha图片，但可以轻松修改为仅处理24位（

通道

常量和

像素格式

值）

它的执行时间与位图本身相比可以忽略不计，但仍然…

< P>如果你是纯性能的，你也可以只考虑源图像的矩形矩形条带。这样你就不用花时间卷积图像的中心（隐藏）部分，而只需要有机会在屏幕上绘画的部分。

我测试了一些算法，最好的是Gábor实现的高斯模糊。在我的测试中，该算法的延迟约为20ms

以下是在Delphi中实现it算法的一些更改（它使用免费软件Bilsen GDI+lib）：

向我们展示代码在回答问题时会有很大帮助，但不能真正展示全部代码；它包括一个二维高斯模糊，它本身就是三个屏幕。我只是担心这对很多人来说都是一堆东西。知道算法是可靠的是很有用的。Guassian模糊比长方体模糊或三角形模糊快，因为Guassian模糊是线性可分离的，并且可转换为两个一维卷积。@Igor Brejc:采样表明每像素乘法的速度减慢。我希望有一种与我使用的算法完全不同的算法（即，将alpha转换为灰度、模糊、在顶部绘制原始图像）您是否考虑过掩盖票据下方的（相当大的区域），在那里您不必计算阴影？您应该使用指针和不安全的代码，这将大大加快它的速度。我不在托管代码中：它是GDI+（即原生代码）。另外，速度慢70倍的是执行每像素乘法，而不是什么都不做（没有像素效果）。很抱歉，因为您的代码是Delphi的，我假设它是Delphi for。关于每像素操作，我相信问题部分也是由于频繁访问非连续内存位置造成的，强制处理器每次刷新其缓存。也许重新安排会有所帮助。GDI+是一个本机API，无论您如何分割它。NET有封装非托管平面C GDI+API的托管类。我理解，但这是否意味着当你得到像素ma的基准时

a := a + f * ((rgb shr 24) and $FF);
r := r + f * ((rgb shr 16) and $FF);
g := g + f * ((rgb shr  8) and $FF);
b := b + f * ((rgb       ) and $FF);

public static class DropShadow {
  const int CHANNELS = 4;

  public static Bitmap CreateShadow(Bitmap bitmap, int radius, float opacity) {
    // Alpha mask with opacity
    var matrix = new ColorMatrix(new float[][] {
            new float[] {  0F,  0F,  0F, 0F,      0F }, 
            new float[] {  0F,  0F,  0F, 0F,      0F }, 
            new float[] {  0F,  0F,  0F, 0F,      0F }, 
            new float[] { -1F, -1F, -1F, opacity, 0F },
            new float[] {  1F,  1F,  1F, 0F,      1F }
        });

    var imageAttributes = new ImageAttributes();
    imageAttributes.SetColorMatrix(matrix, ColorMatrixFlag.Default, ColorAdjustType.Bitmap);
    var shadow = new Bitmap(bitmap.Width + 4 * radius, bitmap.Height + 4 * radius);
    using (var graphics = Graphics.FromImage(shadow))
      graphics.DrawImage(bitmap, new Rectangle(2 * radius, 2 * radius, bitmap.Width, bitmap.Height), 0, 0, bitmap.Width, bitmap.Height, GraphicsUnit.Pixel, imageAttributes);

    // Gaussian blur
    var clone = shadow.Clone() as Bitmap;
    var shadowData = shadow.LockBits(new Rectangle(0, 0, shadow.Width, shadow.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
    var cloneData = clone.LockBits(new Rectangle(0, 0, clone.Width, clone.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);

    var boxes = DetermineBoxes(radius, 3);
    BoxBlur(shadowData, cloneData, shadow.Width, shadow.Height, (boxes[0] - 1) / 2);
    BoxBlur(shadowData, cloneData, shadow.Width, shadow.Height, (boxes[1] - 1) / 2);
    BoxBlur(shadowData, cloneData, shadow.Width, shadow.Height, (boxes[2] - 1) / 2);

    shadow.UnlockBits(shadowData);
    clone.UnlockBits(cloneData);
    return shadow;
  }

  private static unsafe void BoxBlur(BitmapData data1, BitmapData data2, int width, int height, int radius) {
    byte* p1 = (byte*)(void*)data1.Scan0;
    byte* p2 = (byte*)(void*)data2.Scan0;

    int radius2 = 2 * radius + 1;
    int[] sum = new int[CHANNELS];
    int[] FirstValue = new int[CHANNELS];
    int[] LastValue = new int[CHANNELS];

    // Horizontal
    int stride = data1.Stride;
    for (var row = 0; row < height; row++) {
      int start = row * stride;
      int left = start;
      int right = start + radius * CHANNELS;

      for (int channel = 0; channel < CHANNELS; channel++) {
        FirstValue[channel] = p1[start + channel];
        LastValue[channel] = p1[start + (width - 1) * CHANNELS + channel];
        sum[channel] = (radius + 1) * FirstValue[channel];
      }
      for (var column = 0; column < radius; column++)
        for (int channel = 0; channel < CHANNELS; channel++)
          sum[channel] += p1[start + column * CHANNELS + channel];
      for (var column = 0; column <= radius; column++, right += CHANNELS, start += CHANNELS)
        for (int channel = 0; channel < CHANNELS; channel++) {
          sum[channel] += p1[right + channel] - FirstValue[channel];
          p2[start + channel] = (byte)(sum[channel] / radius2);
        }
      for (var column = radius + 1; column < width - radius; column++, left += CHANNELS, right += CHANNELS, start += CHANNELS)
        for (int channel = 0; channel < CHANNELS; channel++) {
          sum[channel] += p1[right + channel] - p1[left + channel];
          p2[start + channel] = (byte)(sum[channel] / radius2);
        }
      for (var column = width - radius; column < width; column++, left += CHANNELS, start += CHANNELS)
        for (int channel = 0; channel < CHANNELS; channel++) {
          sum[channel] += LastValue[channel] - p1[left + channel];
          p2[start + channel] = (byte)(sum[channel] / radius2);
        }
    }

    // Vertical
    stride = data2.Stride;
    for (int column = 0; column < width; column++) {
      int start = column * CHANNELS;
      int top = start;
      int bottom = start + radius * stride;

      for (int channel = 0; channel < CHANNELS; channel++) {
        FirstValue[channel] = p2[start + channel];
        LastValue[channel] = p2[start + (height - 1) * stride + channel];
        sum[channel] = (radius + 1) * FirstValue[channel];
      }
      for (int row = 0; row < radius; row++)
        for (int channel = 0; channel < CHANNELS; channel++)
          sum[channel] += p2[start + row * stride + channel];
      for (int row = 0; row <= radius; row++, bottom += stride, start += stride)
        for (int channel = 0; channel < CHANNELS; channel++) {
          sum[channel] += p2[bottom + channel] - FirstValue[channel];
          p1[start + channel] = (byte)(sum[channel] / radius2);
        }
      for (int row = radius + 1; row < height - radius; row++, top += stride, bottom += stride, start += stride)
        for (int channel = 0; channel < CHANNELS; channel++) {
          sum[channel] += p2[bottom + channel] - p2[top + channel];
          p1[start + channel] = (byte)(sum[channel] / radius2);
        }
      for (int row = height - radius; row < height; row++, top += stride, start += stride)
        for (int channel = 0; channel < CHANNELS; channel++) {
          sum[channel] += LastValue[channel] - p2[top + channel];
          p1[start + channel] = (byte)(sum[channel] / radius2);
        }
    }
  }

  private static int[] DetermineBoxes(double Sigma, int BoxCount) {
    double IdealWidth = Math.Sqrt((12 * Sigma * Sigma / BoxCount) + 1);
    int Lower = (int)Math.Floor(IdealWidth);
    if (Lower % 2 == 0)
      Lower--;
    int Upper = Lower + 2;

    double MedianWidth = (12 * Sigma * Sigma - BoxCount * Lower * Lower - 4 * BoxCount * Lower - 3 * BoxCount) / (-4 * Lower - 4);
    int Median = (int)Math.Round(MedianWidth);

    int[] BoxSizes = new int[BoxCount];
    for (int i = 0; i < BoxCount; i++)
      BoxSizes[i] = (i < Median) ? Lower : Upper;
    return BoxSizes;
  }

}

BoxBlur(shadowData, cloneData, shadow.Width, shadow.Height, radius - 1);
BoxBlur(shadowData, cloneData, shadow.Width, shadow.Height, radius - 1);
BoxBlur(shadowData, cloneData, shadow.Width, shadow.Height, radius);

function CreateBlurShadow(ABitmap: IGPBitmap; ARadius: Integer; AOpacity: Double; AColor: TColor = clNone): IGPBitmap;

  procedure BoxBlur(const AData1, AData2: TGPBitmapData; AWidth, AHeight, ARadius: Integer);
  const
    CHANNELS = 4;
  var
    LScan1, LScan2: PByte;
    LSum, LFirstValue, LLastValue: array [0..CHANNELS-1] of Integer;
    LRadius2, LStride, LStart, LChannel, LLeft, LRight, LBottom, LTop, LRow, LColumn: Integer;
  begin
    LScan1 := AData1.Scan0;
    LScan2 := AData2.Scan0;
    LRadius2 := (2 * ARadius) + 1;
    LStride := AData1.Stride;
    for LRow := 0 to AHeight-1 do
    begin
      LStart := LRow * LStride;
      LLeft := LStart;
      LRight := LStart + ARadius * CHANNELS;
      for LChannel := 0 to CHANNELS-1 do
      begin
        LFirstValue[LChannel] := LScan1[LStart + LChannel];
        LLastValue[LChannel] := LScan1[LStart + ((AWidth - 1) * CHANNELS) + LChannel];
        LSum[LChannel] := (ARadius + 1) * LFirstValue[LChannel];
      end;
      for LColumn := 0 to ARadius-1 do
        for LChannel := 0 to CHANNELS-1 do
          LSum[LChannel] := LSum[LChannel] + LScan1[LStart + (LColumn * CHANNELS) + LChannel];
      for LColumn := 0 to ARadius do
      begin
        for LChannel := 0 to CHANNELS-1 do
        begin
          LSum[LChannel] := LSum[LChannel] + LScan1[LRight + LChannel] - LFirstValue[LChannel];
          LScan2[LStart + LChannel] := Byte(LSum[LChannel] div LRadius2);
        end;
        Inc(LRight, CHANNELS);
        Inc(LStart, CHANNELS);
      end;
      for LColumn := ARadius + 1 to AWidth-ARadius-1 do
      begin
        for LChannel := 0 to CHANNELS-1 do
        begin
          LSum[LChannel] := LSum[LChannel] + LScan1[LRight + LChannel] - LScan1[LLeft + LChannel];
          LScan2[LStart + LChannel] := Byte(LSum[LChannel] div LRadius2);
        end;
        Inc(LLeft, CHANNELS);
        Inc(LRight, CHANNELS);
        Inc(LStart, CHANNELS);
      end;
      for LColumn := AWidth-ARadius to AWidth-1 do
      begin
        for LChannel := 0 to CHANNELS-1 do
        begin
          LSum[LChannel] := LSum[LChannel] + LLastValue[LChannel] - LScan1[LLeft + LChannel];
          LScan2[LStart + LChannel] := Byte(LSum[LChannel] div LRadius2);
        end;
        Inc(LLeft, CHANNELS);
        Inc(LStart, CHANNELS);
      end;
    end;
    LStride := AData2.Stride;
    for LColumn := 0 to AWidth-1 do
    begin
      LStart := LColumn * CHANNELS;
      LTop := LStart;
      LBottom := LStart + (ARadius * LStride);
      for LChannel := 0 to CHANNELS-1 do
      begin
        LFirstValue[LChannel] := LScan2[LStart + LChannel];
        LLastValue[LChannel] := LScan2[LStart + ((AHeight - 1) * LStride) + LChannel];
        LSum[LChannel] := (ARadius + 1) * LFirstValue[LChannel];
      end;
      for LRow := 0 to ARadius-1 do
        for LChannel := 0 to CHANNELS-1 do
          LSum[LChannel] := LSum[LChannel] + LScan2[LStart + (LRow * LStride) + LChannel];
      for LRow := 0 to ARadius do
      begin
        for LChannel := 0 to CHANNELS-1 do
        begin
          LSum[LChannel] := LSum[LChannel] + LScan2[LBottom + LChannel] - LFirstValue[LChannel];
          LScan1[LStart + LChannel] := Byte(LSum[LChannel] div LRadius2);
        end;
        Inc(LBottom, LStride);
        Inc(LStart, LStride);
      end;
      for LRow := ARadius + 1 to AHeight - ARadius - 1 do
      begin
        for LChannel := 0 to CHANNELS-1 do
        begin
          LSum[LChannel] := LSum[LChannel] + LScan2[LBottom + LChannel] - LScan2[LTop + LChannel];
          LScan1[LStart + LChannel] := Byte(LSum[LChannel] div LRadius2);
        end;
        Inc(LTop, LStride);
        Inc(LBottom, LStride);
        Inc(LStart, LStride);
      end;
      for LRow := AHeight - ARadius to AHeight-1 do
      begin
        for LChannel := 0 to CHANNELS-1 do
        begin
          LSum[LChannel] := LSum[LChannel] + LLastValue[LChannel] - LScan2[LTop + LChannel];
          LScan1[LStart + LChannel] := Byte(LSum[LChannel] div LRadius2);
        end;
        Inc(LTop, LStride);
        Inc(LStart, LStride);
      end;
    end;
  end;

const
  INITIAL_MATRIX: array [0..4, 0..4] of Single =
   ((0.5,   0,   0, 0, 0),
    (0,   0.5,   0, 0, 0),
    (0,     0, 0.5, 0, 0),
    (0,     0,   0, 1, 0),
    (0,     0,   0, 0, 1));
var
  LMatrix: TGPColorMatrix;
  LImageAttributes: IGPImageAttributes;
  LShadow, LClone: IGPBitmap;
  LGraphics: IGPGraphics;
  LShadowData, LCloneData: TGPBitmapData;
  LColor: TGPColor;
begin
  ARadius := Max(ARadius, 0);
  LShadow := TGPBitmap.Create(ABitmap.Width + (4 * Cardinal(ARadius)),
    ABitmap.Height + (4 * Cardinal(ARadius)), PixelFormat32bppARGB);
  LGraphics := TGPGraphics.FromImage(LShadow);
  LGraphics.DrawImage(ABitmap, TGPRect.Create(2 * ARadius, 2 * ARadius,
    ABitmap.Width, ABitmap.Height), 0, 0, ABitmap.Width, ABitmap.Height,
    TGPUnit.UnitPixel);
  LClone := LShadow.Clone;
  LShadowData := LShadow.LockBits(TGPRect.Create(0, 0, LShadow.Width, LShadow.Height),
    [ImageLockModeRead, ImageLockModeWrite], PixelFormat32bppARGB);
  LCloneData := LClone.LockBits(TGPRect.Create(0, 0, LClone.Width, LClone.Height),
    [ImageLockModeRead, ImageLockModeWrite], PixelFormat32bppARGB);
  try
    BoxBlur(LShadowData, LCloneData, LShadow.Width, LShadow.Height, ARadius - 1);
    BoxBlur(LShadowData, LCloneData, LShadow.Width, LShadow.Height, ARadius - 1);
    BoxBlur(LShadowData, LCloneData, LShadow.Width, LShadow.Height, ARadius);
  finally
    LShadow.UnlockBits(LShadowData);
    LClone.UnlockBits(LCloneData);
  end;
  if (AColor = clNone) and (AOpacity = 1.0) then
    Result := LShadow
  else
  begin
    LColor := TGPColor.CreateFromColorRef(ColorToRGB(AColor));
    Move(INITIAL_MATRIX[0, 0], LMatrix.M[0, 0], SizeOf(INITIAL_MATRIX));
    LMatrix.M[4, 0] := Min((Integer(LColor.R) - 127) / 127, 1.0);
    LMatrix.M[4, 1] := Min((Integer(LColor.G) - 127) / 127, 1.0);
    LMatrix.M[4, 2] := Min((Integer(LColor.B) - 127) / 127, 1.0);
    LMatrix.M[4, 3] := AOpacity-1;
    LImageAttributes := TGPImageAttributes.Create;
    LImageAttributes.SetColorMatrix(LMatrix, TGPColorMatrixFlags.ColorMatrixFlagsDefault,
      TGPColorAdjustType.ColorAdjustTypeBitmap);
    Result := TGPBitmap.Create(LShadow.Width, LShadow.Height, PixelFormat32bppARGB);
    LGraphics := TGPGraphics.FromImage(Result);
    LGraphics.DrawImage(LShadow, TGPRect.Create(0, 0, LShadow.Width, LShadow.Height),
      0, 0, Result.Width, Result.Height, TGPUnit.UnitPixel, LImageAttributes);
  end;
end;