C#: Adding Cache-Control and Expires headers to Azure Storage blobs

Tags: c#, azure, azure-storage-blobs, cache-control

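I'm using Azure Storage to serve static file blobs, and I'd like to add Cache-Control and Expires headers to the files/blobs when they are served, to reduce bandwidth costs.

Storage tools such as Cerebrata's offer options for setting metadata properties on containers and blobs, but they complain when I try to add Cache-Control.

Does anyone know whether it is possible to set these headers on the files?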

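I had to run a batch job on about 600k blobs and found two things that really helped:

  • Run the operation from a worker role in the same data center. Traffic between Azure services is very fast as long as they are in the same affinity group, and there are no data transfer costs.
  • Run the operation in parallel. The Task Parallel Library (TPL) in .NET v4 makes this really easy. Here is the code to set the Cache-Control header for every blob in a container in parallel: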

    // Written against the legacy Microsoft.WindowsAzure.StorageClient library;
    // cloudBlobContainer is an existing CloudBlobContainer.
    // get the info for every blob in the container
    var blobInfos = cloudBlobContainer.ListBlobs(
        new BlobRequestOptions() { UseFlatBlobListing = true });
    Parallel.ForEach(blobInfos, (blobInfo) =>
    {
        // get the blob properties
        CloudBlob blob = cloudBlobContainer.GetBlobReference(blobInfo.Uri.ToString());
        blob.FetchAttributes();
    
        // set cache-control header if necessary
        if (blob.Properties.CacheControl != YOUR_CACHE_CONTROL_HEADER)
        {
            blob.Properties.CacheControl = YOUR_CACHE_CONTROL_HEADER;
            blob.SetProperties();
        }
    });
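
The parallel pattern above can be sketched without any Azure dependency. This is a minimal illustration, not the answer's own code: `ParallelBatch`, `Run`, `update` and `maxParallel` are hypothetical names, and `update` stands in for the fetch-and-set-properties pair. It caps the degree of parallelism and counts failures per item so that one bad blob does not abort the batch.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelBatch
{
    // Runs `update` once per name, at most `maxParallel` at a time,
    // and returns how many calls succeeded and how many threw.
    public static (int Succeeded, int Failed) Run(
        IEnumerable<string> names, Action<string> update, int maxParallel)
    {
        int succeeded = 0, failed = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(names, options, name =>
        {
            try
            {
                update(name);
                Interlocked.Increment(ref succeeded);
            }
            catch (Exception)
            {
                // Count the failure and keep going with the remaining items.
                Interlocked.Increment(ref failed);
            }
        });

        return (succeeded, failed);
    }
}
```

In a real job, `update` would wrap the per-blob calls shown above, and a sensible value for `maxParallel` depends on the machine and the storage account's limits.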
    
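  • The latest release, v2011.04.23.00, supports setting Cache-Control on individual blob objects. Right-click the blob object, choose "View/Edit Blob Properties", then set the value of the Cache-Control attribute (for example, public, max-age=2592000).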

    If you check the blob object's HTTP headers with curl, you will see the Cache-Control header returned with the value you set.
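
The same check can be done from C#. The following is a rough sketch, assuming the blob is readable over plain HTTPS (publicly or through a SAS URL); `HeaderCheck`, `CacheControlOf` and `FetchCacheControlAsync` are hypothetical names used only for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class HeaderCheck
{
    // Returns the Cache-Control header of a response as text,
    // or an empty string when the header is absent.
    public static string CacheControlOf(HttpResponseMessage response)
    {
        return response.Headers.CacheControl?.ToString() ?? string.Empty;
    }

    // Sends a HEAD request (headers only, no body) and reads Cache-Control.
    public static async Task<string> FetchCacheControlAsync(HttpClient client, Uri blobUri)
    {
        using var request = new HttpRequestMessage(HttpMethod.Head, blobUri);
        using var response = await client.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return CacheControlOf(response);
    }
}
```

Calling `FetchCacheControlAsync` with a placeholder address such as https://myaccount.blob.core.windows.net/container1/logo.png (a made-up URL) should return the value that was set on the blob.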

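    The latest version now supports Cache-Control.

    It may be too late to answer this, but I recently wanted to do the same thing in a different way: I have a list of images and needed to apply the header with a PowerShell script (with the help of the Azure storage assembly, of course). I hope someone finds this useful in the future.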

    The full explanation is at [link missing]


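    Here is an updated version of Joel Fillmore's answer, using .NET 5 and V12 of Azure.Storage.Blobs. (Aside: wouldn't it be nice if default header properties could be set on the parent container?)

    Instead of creating a website and using a WorkerRole, Azure now has the ability to run "WebJobs". You can run any executable on demand on a website in the same data center as your storage account, to set cache headers or any other header field.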

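  1. Create a throw-away, temporary website in the same data center as your storage account. Don't worry about affinity groups; create an empty ASP.NET site or any other simple site. The content is unimportant. I needed at least a B1 service plan, because otherwise the WebJob aborted after 5 minutes.
  2. Create a console program using the code below, which works with the updated Azure Storage APIs. Compile it for release, then zip the executable and all required DLLs into a .zip file, or publish it from Visual Studio and skip step 3 below.
  3. Create a WebJob and upload the .zip file from step 2.
  4. Run the WebJob. Everything written to the console is available in the log file that is created, which you can reach from the WebJob control page.
  5. Delete the temporary website, or change it to the Free tier (under "Scale Up").

    The code below runs a separate task for each container, and I was updating up to 100K headers per minute (depending on the time of day?). There are no egress charges.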

    using Azure;
    using Azure.Storage.Blobs;
    using Azure.Storage.Blobs.Models;
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    
    namespace AzureHeaders
    {
        class Program
        {
            private static string connectionString = "DefaultEndpointsProtocol=https;AccountName=REPLACE_WITH_YOUR_CONNECTION_STRING";
            private static string newCacheControl = "public, max-age=7776001"; // about 3 months (90 days is 7776000 seconds)
            private static string[] containersToProcess = { "container1", "container2" };
    
            static async Task Main(string[] args)
            {
                BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
    
                var tasks = new List<Task>();
                foreach (var container in containersToProcess)
                {
                    BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(container);
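                    // 1000 is the page size hint for the listing; the service returns at most 5000 blobs per page
                    tasks.Add(Task.Run(() => UpdateHeaders(containerClient, 1000)));
                }
                // Main is async, so await the tasks instead of blocking on them
                await Task.WhenAll(tasks);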
            }
    
            private static async Task UpdateHeaders(BlobContainerClient blobContainerClient, int? segmentSize)
            {
                int processed = 0;
                int failed = 0;
                try
                {
                    // Call the listing operation and return pages of the specified size.
                    var resultSegment = blobContainerClient.GetBlobsAsync()
                        .AsPages(default, segmentSize);
    
                    // Enumerate the blobs returned for each page.
                    await foreach (Azure.Page<BlobItem> blobPage in resultSegment)
                    {
                        var tasks = new List<Task>();
    
                        foreach (BlobItem blobItem in blobPage.Values)
                        {
                            BlobClient blobClient = blobContainerClient.GetBlobClient(blobItem.Name);
                            tasks.Add(UpdateOneBlob(blobClient));
                            processed++;
                        }
                        // await instead of Task.WaitAll, so a RequestFailedException reaches the catch below unwrapped
                        await Task.WhenAll(tasks);
                        Console.WriteLine($"Container {blobContainerClient.Name} processed: {processed}");
                    }
                }
                catch (RequestFailedException e)
                {
                    Console.WriteLine(e.Message);
                    failed++;
                }
                Console.WriteLine($"Container {blobContainerClient.Name} processed: {processed}, failed: {failed}");
            }
    
            private static async Task UpdateOneBlob(BlobClient blobClient)
            {
                Response<BlobProperties> propertiesResponse = await blobClient.GetPropertiesAsync();
                BlobHttpHeaders httpHeaders = new BlobHttpHeaders
                {
                    // SetHttpHeadersAsync replaces all HTTP headers at once,
                    // so copy any existing headers you wish to preserve
                    ContentType = propertiesResponse.Value.ContentType,
                    ContentHash = propertiesResponse.Value.ContentHash,
                    ContentEncoding = propertiesResponse.Value.ContentEncoding,
                    ContentLanguage = propertiesResponse.Value.ContentLanguage,
                    ContentDisposition = propertiesResponse.Value.ContentDisposition,
                    // update CacheControl
                    CacheControl = newCacheControl
                };
                await blobClient.SetHttpHeadersAsync(httpHeaders);
            }
        }
    }
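
The max-age values quoted in these answers are plain second counts. As a small worked example, the helper below derives the header value from a TimeSpan instead of a hand-typed number; `CacheControlValue` and `Public` are hypothetical names, not part of any Azure library.

```csharp
using System;
using System.Globalization;

public static class CacheControlValue
{
    // Builds "public, max-age=<seconds>" for the given cache lifetime.
    public static string Public(TimeSpan lifetime)
    {
        if (lifetime < TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(lifetime));

        // max-age is expressed in whole seconds.
        long seconds = (long)lifetime.TotalSeconds;
        return "public, max-age=" + seconds.ToString(CultureInfo.InvariantCulture);
    }
}
```

For instance, `CacheControlValue.Public(TimeSpan.FromDays(30))` gives `public, max-age=2592000`, and 90 days gives `public, max-age=7776000`.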
    
    # PowerShell: create a CloudBlobClient from the storage client assembly.
    # $StorageName, $StorageKey, $BlobUri and $blobs (the blobs to update) must be defined before this runs.
    Add-Type -Path "C:\Program Files\Microsoft SDKs\Windows Azure\.NET SDK\v2.3\ref\Microsoft.WindowsAzure.StorageClient.dll"
    $storageCredentials = New-Object Microsoft.WindowsAzure.StorageCredentialsAccountAndKey -ArgumentList $StorageName,$StorageKey
    $blobClient = New-Object Microsoft.WindowsAzure.StorageClient.CloudBlobClient($BlobUri,$storageCredentials)
    # set properties and metadata
    $cacheControlValue = "public, max-age=60480"
    foreach ($blob in $blobs) 
    { 
      #set Metadata 
      $blobRef = $blobClient.GetBlobReference($blob.Name) 
      $blobRef.Metadata.Add("abcd","abcd") 
      $blobRef.SetMetadata() 
    
      #set Properties 
      $blobRef.Properties.CacheControl = $cacheControlValue 
      $blobRef.SetProperties() 
    }
    
        // Legacy storage SDK (CloudBlobContainer API): page through the container
        // and set Cache-Control on every blob where it differs.
        public async Task BackfillCacheControlAsync()
        {
            const string cacheControl = "public, max-age=31536000"; // 365 days
            var container = await GetCloudBlobContainerAsync();
            BlobContinuationToken continuationToken = null;

            do
            {
                // Flat listing of the whole container, one page per call.
                var blobInfos = await container.ListBlobsSegmentedAsync(string.Empty, true, BlobListingDetails.None, null, continuationToken, null, null);
                continuationToken = blobInfos.ContinuationToken;
                foreach (var blobInfo in blobInfos.Results)
                {
                    var blockBlob = (CloudBlockBlob)blobInfo;
                    // Fetch the current properties from the service before comparing.
                    var blob = await container.GetBlobReferenceFromServerAsync(blockBlob.Name);
                    if (blob.Properties.CacheControl != cacheControl)
                    {
                        blob.Properties.CacheControl = cacheControl;
                        await blob.SetPropertiesAsync();
                    }
                }
            }
            while (continuationToken != null);
        }
    
        private Task<CloudBlobContainer> GetCloudBlobContainerAsync()
        {
            var storageAccount = CloudStorageAccount.Parse(_appSettings.AzureStorageConnectionString);
            var blobClient = storageAccount.CreateCloudBlobClient();
            var container = blobClient.GetContainerReference("uploads");
            // Nothing is awaited here, so return a completed task instead of marking the method async.
            return Task.FromResult(container);
        }
    
    #!/usr/bin/env bash
    # Update the cache-control headers (content-cache properties)
    # of blobs in Azure Blob Storage
    # 
    # Quite slow, since there is no `az storage blob update-batch`
    #
    # Created by Jon Tingvold, March 2021
    #
    #
    # If you want progress, you need to install pv:
    # >>> brew install pv  # Mac
    # >>> sudo apt install pv  # Ubuntu
    #
    
    set -e  # exit when any command fails
    
    AZURE_BLOB_CONNECTION_STRING='DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=XXXXXXXXXXXX;AccountKey=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=='
    CONTAINER_NAME=main
    
    BLOB_PREFIX='admin/'
    CONTENT_CACHE='max-age=3600'
    NUM_RESULTS=10000000  # Defaults to 5000
    
    echo "Ask Azure for files in Blob Storage ..."
    BLOB_NAMES=$(az storage blob list --connection-string "$AZURE_BLOB_CONNECTION_STRING" --container-name "$CONTAINER_NAME" --query '[].name' --output tsv --num-results "$NUM_RESULTS" --prefix "$BLOB_PREFIX")
    # Count non-empty lines (one blob name per line)
    NUMBER_OF_BLOBS=$(printf '%s\n' "$BLOB_NAMES" | grep -c . || true)

    echo "Set content-cache on $NUMBER_OF_BLOBS blobs ..."

    # Read one name per line so blob names containing spaces stay intact
    printf '%s\n' "$BLOB_NAMES" | while IFS= read -r BLOB_NAME
    do
      [ -n "$BLOB_NAME" ] || continue
      az storage blob update --connection-string "$AZURE_BLOB_CONNECTION_STRING" --container-name "$CONTAINER_NAME" --name "$BLOB_NAME" --content-cache "$CONTENT_CACHE" < /dev/null > /dev/null
      echo "$BLOB_NAME"

    # If you don't have pv installed, delete everything after `done`
    done | pv -pte --line-mode --size "$NUMBER_OF_BLOBS" > /dev/null