AWS CloudFormation template for creating a Spark UI history server fails


The default CloudFormation template for creating the history server includes the creation of a security group and an IAM role. I removed both and instead added a parameter for selecting an existing security group. Now, when I run the template, it creates the HistoryServerInstance successfully but fails at the wait condition. Can you help me figure out what is wrong? The error screenshot and the script are attached.

Thanks
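For reference, the cfn-init log that the template below ships to CloudWatch Logs can be pulled with the AWS CLI to see which config step failed; this is a sketch (the log group name comes from the template, and the log stream name — the instance ID — is a placeholder):

```shell
# Fetch the cfn-init log events for the history server instance.
# Replace i-0123456789abcdef0 with the actual instance ID from the stack.
aws logs get-log-events \
  --log-group-name /aws-glue/sparkui_cfn/cfn-init.log \
  --log-stream-name i-0123456789abcdef0
```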

My CloudFormation template in YAML:


Parameters:
  InstanceType:
    Type: String
    Default: t3.medium
    AllowedValues:
      - t3.micro
      - t3.small
      - t3.medium
    Description: Instance Type for EC2 instance which hosts Spark history server.
  LatestAmiId:
    Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>
    Description: Latest AMI ID of Amazon Linux 2 for Spark history server instance. You can use the default value.
    Default: /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2
  InstanceSecurityGroup:
    Description: "Select Security Group"
    Type: AWS::EC2::SecurityGroup::Id
  VpcId:
    Type: AWS::EC2::VPC::Id
    Description: "VPC ID for Spark history server instance."
    Default: '' 
  SubnetId:
    Type: AWS::EC2::Subnet::Id
    Description: Subnet ID for Spark history server instance.
    Default: ''
  IpAddressRange:
    Type: String
    Description: "IP address range that can be used to view the Spark UI."
    MinLength: 9
    MaxLength: 18
  HistoryServerPort:
    Type: Number
    Description: History Server Port for the Spark UI.
    Default: 18080
    MinValue: 1150
    MaxValue: 65535
  EventLogDir:
    Type: String
    Description: "*Event Log Directory* where Spark event logs are stored from the Glue job or dev endpoints. You must use s3a:// for the event logs path scheme"
    Default: s3a://hcg-stagingaas6377-sandbox/logs/
  SparkPackageLocation:
    Type: String
    Description: You can use the default value.
    Default: 'https://archive.apache.org/dist/spark/spark-2.4.3/spark-2.4.3-bin-without-hadoop.tgz'
  KeystorePath:
    Type: String
    Description: SSL/TLS keystore path for HTTPS. If you want to use custom keystore file, you can specify the S3 path s3://path_to_your_keystore_file here. If you leave this parameter empty, self-signed certificate based keystore is used.
  KeystorePassword:
    Type: String
    NoEcho: true
    Description: SSL/TLS keystore password for HTTPS. A valid password can contain 6 to 30 characters.
    MinLength: 6
    MaxLength: 30

Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      -
        Label:
          default: Spark UI Configuration
        Parameters:
          - IpAddressRange
          - HistoryServerPort
          - EventLogDir
          - SparkPackageLocation
          - KeystorePath
          - KeystorePassword
      -
        Label:
          default: EC2 Instance Configuration
        Parameters:
          - InstanceType
          - LatestAmiId
          - VpcId
          - SubnetId

Mappings:
  MemoryBasedOnInstanceType:
    t3.micro:
      SparkDaemonMemory: '512m'
    t3.small:
      SparkDaemonMemory: '1g'
    t3.medium:
      SparkDaemonMemory: '3g'

Resources:
  HistoryServerInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref LatestAmiId
      InstanceType: !Ref InstanceType
      SubnetId: !Ref SubnetId
      SecurityGroupIds: 
      - !Ref InstanceSecurityGroup
      UserData:
        'Fn::Base64': !Sub |
          #!/bin/bash -xe
          yum update -y aws-cfn-bootstrap
          echo "CA_OVERRIDE=/etc/pki/tls/certs/ca-bundle.crt" >> /etc/environment
          export CA_OVERRIDE=/etc/pki/tls/certs/ca-bundle.crt
          rpm -Uvh https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
          /opt/aws/bin/cfn-init -v -s ${AWS::StackName} -r HistoryServerInstance --region ${AWS::Region}
          /opt/aws/bin/cfn-signal -e $? -s ${AWS::StackName} -r HistoryServerInstance --region ${AWS::Region}
    Metadata:
      AWS::CloudFormation::Init:
        configSets:
          default:
            - cloudwatch_agent_configure
            - cloudwatch_agent_restart
            - spark_download
            - spark_init
            - spark_configure
            - spark_hs_start
            - spark_hs_test
        cloudwatch_agent_configure:
          files:
            /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json:
              content: !Sub |
                  {
                    "logs": {
                      "logs_collected": {
                        "files": {
                          "collect_list": [
                            {
                              "file_path": "/var/log/cfn-init.log",
                              "log_group_name": "/aws-glue/sparkui_cfn/cfn-init.log"
                            },
                            {
                              "file_path": "/opt/spark/logs/spark-*",
                              "log_group_name": "/aws-glue/sparkui_cfn/spark_history_server.log"
                            }
                          ]
                        }
                      }
                    }
                  }
        cloudwatch_agent_restart:
          commands:
            01_stop_service:
              command: /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a stop
            02_start_service:
              command: /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json -s
        spark_download:
          packages:
            yum:
              java-1.8.0-openjdk: []
              maven: []
              python3: []
              python3-pip: []
          sources:
            /opt: !Ref SparkPackageLocation
          commands:
            create-symlink:
              command: ln -s /opt/spark-* /opt/spark
            export:
              command: !Sub |
                echo "export JAVA_HOME=/usr/lib/jvm/jre" | sudo tee -a /etc/profile.d/jdk.sh
                echo "export SPARK_HOME=/opt/spark" | sudo tee -a /etc/profile.d/spark.sh
                export JAVA_HOME=/usr/lib/jvm/jre
                export SPARK_HOME=/opt/spark
            download-pom-xml:
              command: curl -o /tmp/pom.xml https://aws-glue-sparkui-prod-us-east-1.s3.amazonaws.com/public/mvn/pom.xml
            download-setup-py:
              command: curl -o /tmp/setup.py https://aws-glue-sparkui-prod-us-east-1.s3.amazonaws.com/public/misc/setup.py
            download-systemd-file:
              command: curl -o /usr/lib/systemd/system/spark-history-server.service https://aws-glue-sparkui-prod-us-east-1.s3.amazonaws.com/public/misc/spark-history-server.service
        spark_init:
          commands:
            download-mvn-dependencies:
              command: cd /tmp; mvn dependency:copy-dependencies -DoutputDirectory=/opt/spark/jars/
            install-boto:
              command: pip3 install boto --user; pip3 install boto3 --user
          files:
            /opt/spark/conf/spark-defaults.conf:
              content: !Sub |
                spark.eventLog.enabled                      true
                spark.history.fs.logDirectory               ${EventLogDir}
                spark.history.ui.port                       0
                spark.ssl.historyServer.enabled             true
                spark.ssl.historyServer.port                ${HistoryServerPort}
                spark.ssl.historyServer.keyStorePassword    ${KeystorePassword}
              group: ec2-user
              mode: '000644'
              owner: ec2-user
            /opt/spark/conf/spark-env.sh:
              content: !Sub
                - |
                  export SPARK_DAEMON_MEMORY=${SparkDaemonMemoryConfig}
                  export SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS -Dspark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem"
                - SparkDaemonMemoryConfig: !FindInMap [ MemoryBasedOnInstanceType, !Ref InstanceType, SparkDaemonMemory ]
              group: ec2-user
              mode: '000644'
              owner: ec2-user
        spark_configure:
          commands:
            create-symlink:
              command: ln -s /usr/lib/systemd/system/spark-history-server.service /etc/systemd/system/multi-user.target.wants/
            enable-spark-hs:
              command: systemctl enable spark-history-server
            configure-keystore:
              command: !Sub |
                python3 /tmp/setup.py --keystore "${KeystorePath}" --keystorepw "${KeystorePassword}" > /tmp/setup_py.log 2>&1
        spark_hs_start:
          commands:
            start_spark_hs_server:
              command: systemctl start spark-history-server
        spark_hs_test:
          commands:
            check-spark-hs-server:
              command: !Sub |
                curl --retry 60 --retry-delay 10 --retry-max-time 600 --retry-connrefused https://localhost:${HistoryServerPort} --insecure;
                /opt/aws/bin/cfn-signal -e $? "${WaitHandle}"
  WaitHandle:
    Type: AWS::CloudFormation::WaitConditionHandle
  WaitCondition:
    Type: AWS::CloudFormation::WaitCondition
    DependsOn: HistoryServerInstance
    Properties:
      Handle: !Ref WaitHandle
      Timeout: 1200

Outputs:
  SparkUiPublicUrl:
    Description: The Public URL of Spark UI
    Value: !Join
      - ''
      - - 'https://'
        - !GetAtt 'HistoryServerInstance.PublicDnsName'
        - ':'
        - !Ref HistoryServerPort
  SparkUiPrivateUrl:
    Description: The Private URL of Spark UI
    Value: !Join
      - ''
      - - 'https://'
        - !GetAtt 'HistoryServerInstance.PrivateDnsName'
        - ':'
        - !Ref HistoryServerPort
  CloudWatchLogsCfnInit:
    Description: CloudWatch Logs Console URL for cfn-init.log in History Server Instance
    Value: !Join
      - ''
      - - 'https://console.aws.amazon.com/cloudwatch/home?region='
        - !Ref AWS::Region
        - '#logEventViewer:group=/aws-glue/sparkui_cfn/cfn-init.log;stream='
        - !Ref HistoryServerInstance
  CloudWatchLogsSparkHistoryServer:
    Description: CloudWatch Logs Console URL for spark history server logs in History Server Instance
    Value: !Join
      - ''
      - - 'https://console.aws.amazon.com/cloudwatch/home?region='
        - !Ref AWS::Region
        - '#logEventViewer:group=/aws-glue/sparkui_cfn/spark_history_server.log;stream='
        - !Ref HistoryServerInstance
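
As an aside, a common alternative to the WaitHandle/WaitCondition pair above is a `CreationPolicy` on the instance itself. A minimal sketch (assuming the rest of the template stays the same, and that the final `cfn-signal` in `spark_hs_test` is pointed at the resource instead of the handle):

```yaml
# Sketch only: replaces WaitHandle/WaitCondition with a CreationPolicy.
# The last command in spark_hs_test would then become:
#   /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource HistoryServerInstance --region ${AWS::Region}
HistoryServerInstance:
  Type: AWS::EC2::Instance
  CreationPolicy:
    ResourceSignal:
      Count: 1
      Timeout: PT20M   # 20 minutes, matching the original 1200-second WaitCondition timeout
  Properties:
    # ... same properties as in the template above ...
```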
