AWS Preparation

This article describes the AWS operations involved in creating a BYOC warehouse: preparing a VPC and subnet, and (optionally) learning about Resource Orchestration and Resource Stack.

Prepare a VPC and subnet

Before creating a BYOC warehouse, make sure that a VPC and subnets meeting the requirements exist. If they do not, create them in advance as follows:

Notice:

  1. If a VPC and subnet that meet the region, availability zone, and subnet requirements already exist and you want to deploy the BYOC warehouse in that VPC, skip the steps below for creating a new VPC and subnet.
  2. The regions and availability zones currently supported are:
     Cloud Platform  Region Name                 Region ID       Availability Zone ID
     AWS             US East (N. Virginia)       us-east-1       use1-az2
     AWS             US West (Oregon)            us-west-2       usw2-az1
     AWS             Europe (Ireland)            eu-west-1       euw1-az1
     AWS             Asia Pacific (Singapore)    ap-southeast-1  apse1-az1
     AWS             Asia Pacific (Hong Kong)    ap-east-1       ape1-az1
     AWS             Middle East (Bahrain)       me-south-1      all
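
The support table can be checked programmatically before a VPC is chosen. The following is a minimal sketch, not part of VeloDB's tooling: the names `SUPPORTED_ZONES` and `is_supported` are hypothetical, and the values are copied from the table. Note that the table lists availability zone IDs (such as use1-az2), not zone names (such as us-east-1a); AWS maps zone names to zone IDs independently for each account.

```python
# Supported regions and availability zone IDs, copied from the table above.
# "all" means every availability zone in that region is accepted.
SUPPORTED_ZONES = {
    "us-east-1": "use1-az2",
    "us-west-2": "usw2-az1",
    "eu-west-1": "euw1-az1",
    "ap-southeast-1": "apse1-az1",
    "ap-east-1": "ape1-az1",
    "me-south-1": "all",
}


def is_supported(region_id: str, zone_id: str) -> bool:
    """Return True if the region ID / zone ID pair appears in the support table."""
    allowed = SUPPORTED_ZONES.get(region_id)
    if allowed is None:
        return False  # region is not listed at all
    return allowed == "all" or allowed == zone_id
```

For example, `is_supported("us-east-1", "use1-az2")` is true, while the same region with a different zone ID is rejected.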

Additional subnet requirements

Deploying and managing VeloDB services requires Internet access to the AWS EC2 and ELB services (and possibly other services in the future), so two types of subnets are currently supported:

1. Private Subnet with external network access (Recommended)
The route table associated with the subnet contains a 0.0.0.0/0 route to a public NAT gateway. This is the recommended subnet type: all created machines access the external network through the public IP address of the shared NAT gateway, which is more secure. Note that if you choose a private subnet, we assume that your company intranet and the VPC network can reach each other; otherwise you will not be able to access the WebUI.

2. Public Subnet (Not recommended)
The route table associated with the subnet contains a 0.0.0.0/0 route to the Internet gateway (IGW). If you choose a public subnet, a public IP address is assigned by default to every machine created afterwards. This is not recommended.

When CloudFormation creates the resource stack, make sure that the IGW or NAT gateway is in a normal state and that the route table is correctly configured. If a subnet meets neither of the two conditions above, the CloudFormation build reports an error and blocks subsequent execution.

The following is a classic network architecture diagram provided by Amazon Web Services. Subnet 4 is a private subnet with external network access, and Subnet 1 and Subnet 2 are public subnets. All three can access the external network and meet the requirements. Subnet 3 cannot access the external network and does not meet the requirements.
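
The two subnet conditions can be expressed as a check over the routes of the route table associated with a subnet. The sketch below is illustrative only and is not the check that the CloudFormation template actually runs. The function name `classify_subnet` and its return labels are hypothetical; the route keys (`DestinationCidrBlock`, `GatewayId`, `NatGatewayId`, `State`) follow the shape of route entries in the EC2 DescribeRouteTables response.

```python
from typing import Iterable, Mapping


def classify_subnet(routes: Iterable[Mapping[str, str]]) -> str:
    """Classify a subnet from the routes of its associated route table.

    Returns "private-with-nat", "public", or "unsupported".
    """
    for route in routes:
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue  # only the default route decides external access
        if route.get("State") != "active":
            continue  # e.g. "blackhole" when the route target was deleted
        if route.get("NatGatewayId", "").startswith("nat-"):
            return "private-with-nat"
        if route.get("GatewayId", "").startswith("igw-"):
            return "public"
    return "unsupported"


# Hypothetical route tables for illustration.
local = {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"}
via_nat = {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0abc", "State": "active"}
via_igw = {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc", "State": "active"}

print(classify_subnet([local, via_nat]))  # private subnet with external access
print(classify_subnet([local, via_igw]))  # public subnet
print(classify_subnet([local]))           # no default route, so not supported
```

A subnet whose route table has only the local route, like Subnet 3 in the description above, falls into the unsupported case.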

Create VPC

Open the Amazon Web Services VPC console and switch to the region where you want to deploy the BYOC warehouse.

Click Create VPC to enter the VPC creation page.

Select VPC only, enter a name and an IPv4 CIDR, and click Create VPC to complete the creation.

Create subnet

Click Subnets > Create subnet on the left to enter the subnet creation page.

We recommend creating two subnets, one public and one private, in the same availability zone (their availability zone IDs must match). The VeloDB service will eventually be deployed on the private subnet.

Note: The subnets must be in one of the supported regions and availability zones listed in the table under "Prepare a VPC and subnet".
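
When picking CIDR blocks for the two subnets, Python's standard `ipaddress` module can carve them out of the VPC block. This is a minimal sketch under stated assumptions: the function name `plan_subnets`, the example block 10.0.0.0/16, and the /24 subnet size are illustrative choices, not VeloDB requirements. The /16 to /28 bounds are the block sizes AWS accepts for IPv4 VPCs and subnets.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, subnet_prefix: int = 24):
    """Return (public, private) subnet CIDRs carved from the VPC CIDR."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # AWS accepts IPv4 VPC and subnet blocks between /16 and /28.
    if not 16 <= vpc.prefixlen <= 28:
        raise ValueError("VPC CIDR must be between /16 and /28")
    if not vpc.prefixlen < subnet_prefix <= 28:
        raise ValueError("subnet prefix must be longer than the VPC prefix and at most /28")
    # Take the first two non-overlapping blocks of the requested size.
    subnets = vpc.subnets(new_prefix=subnet_prefix)
    public, private = next(subnets), next(subnets)
    return str(public), str(private)


print(plan_subnets("10.0.0.0/16"))
```

Any two non-overlapping blocks inside the VPC would work equally well; taking the first two is only a convenient default.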

Create an IGW and NAT, configure route table

Create an IGW and attach it to the VPC.

Add a route to the IGW in the route table of the public subnet.

Create a NAT gateway on the public subnet.

Create a new route table for the private subnet and add a route to the NAT gateway.

Associate the new route table with the private subnet.
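
The same five steps can also be scripted instead of clicked through in the console. The sketch below assumes a boto3 EC2 client is passed in as `ec2`; the function name `build_network` and all of its parameters are hypothetical, and the code has no error handling or cleanup, so treat it as an outline rather than a finished script.

```python
def build_network(ec2, vpc_id, public_subnet_id, private_subnet_id, public_route_table_id):
    """Run the five steps above with a boto3 EC2 client passed in as `ec2`."""
    # 1. Create an IGW and attach it to the VPC.
    igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
    ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

    # 2. Add a default route to the IGW in the public subnet's route table.
    ec2.create_route(RouteTableId=public_route_table_id,
                     DestinationCidrBlock="0.0.0.0/0", GatewayId=igw_id)

    # 3. Create a public NAT gateway (it needs an Elastic IP) on the public subnet.
    allocation_id = ec2.allocate_address(Domain="vpc")["AllocationId"]
    nat = ec2.create_nat_gateway(SubnetId=public_subnet_id, AllocationId=allocation_id)
    nat_id = nat["NatGateway"]["NatGatewayId"]
    # Routes to a NAT gateway should be added once it is available.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # 4. Create a route table for the private subnet with a default route to the NAT.
    private_rt_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
    ec2.create_route(RouteTableId=private_rt_id,
                     DestinationCidrBlock="0.0.0.0/0", NatGatewayId=nat_id)

    # 5. Associate the new route table with the private subnet.
    ec2.associate_route_table(RouteTableId=private_rt_id, SubnetId=private_subnet_id)
    return {"igw": igw_id, "nat": nat_id, "private_route_table": private_rt_id}
```

With boto3 installed and credentials configured, the client would typically come from `boto3.client("ec2", region_name=...)`, using the region where the BYOC warehouse will be deployed.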

The final network topology should look like the following

Learn about Resource Orchestration and Resource Stack (Optional)

Note: You don't need to do anything in this section. Read on only if you want to learn more about how it works.

When a resource stack template is executed through the resource orchestration service (CloudFormation) under your cloud account, it performs operations on cloud resources such as VPC, EC2, and S3, and therefore requires a series of IAM permissions.

Create this resource stack with administrator privileges, or ask your administrator to create it for you; otherwise template execution may fail due to insufficient permissions.

Resource Orchestration Template Description

The resource orchestration template provided by VeloDB runs under your cloud account. The template code is visible and auditable, and it does not operate on your data or on other environments in the VPC. You can get the template from the following link:

https://velodb-cloud-online.s3.us-west-1.amazonaws.com/public/aws-byoc.yaml

When you execute the above resource template through AWS CloudFormation, it will automatically create and deploy the Agent. Then the Agent will establish a private connection with VeloDB Cloud and complete the warehouse initialization process.

After the resource orchestration script is executed, you can enter the corresponding warehouse from the VeloDB Cloud platform and start creating a computing cluster for data analysis just like using a normal warehouse.

How to view resource stack information

You can view all created resources on the Resources tab of the CloudFormation interface, and look up specific resources by resource name:

  • EC2
    • Name: VeloDBAgent (EC2)
    • Purpose: Used to deploy Agent, Prometheus, FluentBit and other programs
  • VPC Endpoint
    • Name: VeloDBEndpoint (VPC Endpoint)
    • Purpose: Establishes private network connection with VeloDB Manage service to pull control instructions and enable one-way push of monitoring and logs
  • S3 Bucket
    • Name: VeloDBBucket (S3 Bucket)
    • Purpose: Used to store data warehouse data
  • SecurityGroup
    • Name: VeloDBSecurityGroupForEndpoint, VeloDBSecurityGroup (VPC SecurityGroup)
    • Purpose: Bound to the endpoint and to all EC2 instances created by VeloDB; restricts inbound and outbound traffic to specific ports and sources through security group rules
  • IAM User / IAM Role
    • Names:
      • VeloDBUser (IAM User), VeloDBAkSk (IAM User AkSk), VeloDBUserPolicy (IAM User Policy)
      • VeloDBControlPlaneRole (IAM Role), VeloDBControlPlaneRolePolicy (IAM Role Policy), VeloDBDataAccessRole (IAM Role), VeloDBDataAccessRolePolicy (IAM Role Policy)
    • Purposes:
      • The created IAM user has the minimum permission policy required by the Agent, and all subsequent control operations are carried out under the identity of this IAM user
      • The created IAM role is bound to the EC2 instance. Temporary credentials can be obtained through this role, which is more secure than using a permanent AK/SK
  • Lambda Function
    • Names:
      • CustomFunction* (Lambda Function logic)
      • CustomResourceRole (temporary role for executing Lambda Function)
    • Purpose: The Lambda Function implements logic that is available in the Python SDK but not in CloudFormation templates. For this template, it mainly includes:
      1. Get lowercase S3 bucket name, as Amazon S3 does not allow uppercase letters in bucket names
      2. Get the information of the user-selected VPC and subnet, such as VPC cidr-block and subnet type
      3. Create an S3 gateway endpoint automatically if the VPC does not have one, so that S3 bucket traffic is routed within the VPC instead of through the public Internet.
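
To illustrate the first item in the list above, the sketch below lowercases a candidate bucket name and checks it against basic S3 naming rules. It is not the template's actual Lambda code: the function name `to_bucket_name` and the sample input are hypothetical, and only the core rules are checked (length, allowed characters, first and last character), not every restriction that S3 applies.

```python
import re

# Basic S3 bucket naming rules: 3-63 characters; lowercase letters, digits,
# dots, and hyphens only; must begin and end with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def to_bucket_name(raw_name: str) -> str:
    """Lowercase a candidate name and check it against basic S3 naming rules."""
    name = raw_name.lower()
    if not _BUCKET_NAME.fullmatch(name):
        raise ValueError(f"not a valid S3 bucket name: {name!r}")
    return name


print(to_bucket_name("VeloDB-Bucket-0629F1D324E3859BA"))
```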

Permissions of the IAM user created by the resource stack template

After the resource stack template is executed for the first time, an IAM user is created for subsequent management of the data warehouse components in your VPC. The permissions of this IAM user are described below.

Note: The created IAM user belongs to your cloud account, is used only in your VPC, and will not be leaked.

  • Permission summary:

    {
            "Version": "2012-10-17",
            "Statement": [
                    {
                            "Condition": {
                                    "StringEquals": {
                                            "aws:ResourceTag/resource-created-by": [
                                                    "velodb"
                                            ]
                                    }
                            },
                            "Action": [
                                    "ec2:TerminateInstances",
                                    "ec2:StopInstances",
                                    "ec2:StartInstances",
                                    "ec2:RebootInstances",
                                    "ec2:ModifyInstanceAttribute",
                                    "ec2:DescribeSecurityGroups",
                                    "ec2:DescribeSecurityGroupRules",
                                    "ec2:AuthorizeSecurityGroupIngress",
                                    "ec2:AuthorizeSecurityGroupEgress",
                                    "ec2:DeleteSecurityGroup",
                                    "ec2:GetEbsEncryptionByDefault",
                                    "ec2:GetEbsDefaultKmsKeyId"
                            ],
                            "Resource": [
                                    "arn:aws:ec2:us-west-2:*:*"
                            ],
                            "Effect": "Allow"
                    },
                    {
                            "Action": [
                                    "ec2:DescribeVpcs",
                                    "ec2:DescribeSubnets",
                                    "ec2:DescribeAccountAttributes",
                                    "ec2:DescribeAddresses",
                                    "ec2:DescribeInternetGateways",
                                    "ec2:DescribeInstances",
                                    "ec2:DescribeAvailabilityZones",
                                    "ec2:DescribeInstanceTypes",
                                    "ec2:DescribeVolumes",
                                    "ec2:ModifyVolume",
                                    "ec2:DescribeImages",
                                    "ec2:RunInstances",
                                    "ec2:CreateSecurityGroup",
                                    "ec2:DescribeTags",
                                    "ec2:CreateTags",
                                    "ec2:DeleteTags",
                                    "compute-optimizer:GetEnrollmentStatus",
                                    "elasticloadbalancing:*"
                            ],
                            "Resource": "*",
                            "Effect": "Allow"
                    },
                    {
                            "Condition": {
                                    "StringEquals": {
                                            "aws:ResourceTag/resource-created-by": [
                                                    "velodb"
                                            ]
                                    }
                            },
                            "Action": [
                                    "s3:*"
                            ],
                            "Resource": [
                                    "arn:aws:s3:::velodb-bucket-0629f1d324e3859ba/*",
                                    "arn:aws:s3:::velodb-bucket-0629f1d324e3859ba"
                            ],
                            "Effect": "Allow"
                    },
                    {
                            "Action": [
                                    "sts:GetCallerIdentity",
                                    "sts:AssumeRole",
                                    "iam:CreateInstanceProfile"
                            ],
                            "Resource": "*",
                            "Effect": "Allow"
                    },
                    {
                            "Condition": {
                                    "StringEquals": {
                                            "iam:PassedToService": [
                                                    "ec2.amazonaws.com"
                                            ]
                                    }
                            },
                            "Action": [
                                    "iam:PassRole",
                                    "iam:AddRoleToInstanceProfile"
                            ],
                            "Resource": "arn:aws:iam::*:role/velodb-*",
                            "Effect": "Allow"
                    },
                    {
                            "Condition": {
                                    "StringEquals": {
                                            "iam:AWSServiceName": [
                                                    "elasticloadbalancing.amazonaws.com"
                                            ]
                                    }
                            },
                            "Action": [
                                    "iam:CreateServiceLinkedRole"
                            ],
                            "Resource": "*",
                            "Effect": "Allow"
                    }
            ]
    }
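
When auditing a policy like the one above, it can help to group the allowed actions by service and to see which of them are limited to resources tagged resource-created-by = velodb. The following sketch does this for an abbreviated copy of the policy (two statements, a few actions each). The function name `summarize` and the labels `tag_scoped` and `unscoped` are hypothetical, and the code only reads the JSON structure; it does not evaluate IAM semantics such as resource ARN matching.

```python
import json
from collections import defaultdict

# Abbreviated copy of the policy above, for illustration only.
POLICY_JSON = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Condition": {"StringEquals": {"aws:ResourceTag/resource-created-by": ["velodb"]}},
      "Action": ["ec2:TerminateInstances", "ec2:StopInstances"],
      "Resource": ["arn:aws:ec2:us-west-2:*:*"],
      "Effect": "Allow"
    },
    {
      "Action": ["ec2:DescribeVpcs", "ec2:RunInstances", "elasticloadbalancing:*"],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
"""


def summarize(policy: dict) -> dict:
    """Map each service prefix to its allowed actions, split by tag scoping."""
    summary = defaultdict(lambda: {"tag_scoped": [], "unscoped": []})
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        conditions = statement.get("Condition", {}).get("StringEquals", {})
        scoped = "aws:ResourceTag/resource-created-by" in conditions
        actions = statement["Action"]
        if isinstance(actions, str):  # IAM allows a single string or a list
            actions = [actions]
        for action in actions:
            service, _, name = action.partition(":")
            summary[service]["tag_scoped" if scoped else "unscoped"].append(name)
    return dict(summary)


for service, groups in summarize(json.loads(POLICY_JSON)).items():
    print(service, groups)
```

Running the same function on the full policy document gives one entry per service (ec2, compute-optimizer, elasticloadbalancing, s3, sts, iam).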

The specific permissions are divided as follows:

  • EC2 & VPC permissions:

    • Manage EC2 and security groups

      {
              "Condition": {
                      "StringEquals": {
                              "aws:ResourceTag/resource-created-by": [
                                      "velodb"
                              ]
                      }
              },
              "Action": [
                      "ec2:TerminateInstances",
                      "ec2:StopInstances",
                      "ec2:StartInstances",
                      "ec2:RebootInstances",
                      "ec2:ModifyInstanceAttribute",
                      "ec2:DescribeSecurityGroups",
                      "ec2:DescribeSecurityGroupRules",
                      "ec2:AuthorizeSecurityGroupIngress",
                      "ec2:AuthorizeSecurityGroupEgress",
                      "ec2:DeleteSecurityGroup",
                      "ec2:GetEbsEncryptionByDefault",
                      "ec2:GetEbsDefaultKmsKeyId"
              ],
              "Resource": [
                      "arn:aws:ec2:us-west-2:*:*"
              ],
              "Effect": "Allow"
      }
    • Get VPC-related resource information, launch instances, and manage volumes, security groups, and tags

      {
          "Action": [
              "ec2:DescribeVpcs",
              "ec2:DescribeSubnets",
              "ec2:DescribeAccountAttributes",
              "ec2:DescribeAddresses",
              "ec2:DescribeInternetGateways",
              "ec2:DescribeInstances",
              "ec2:DescribeAvailabilityZones",
              "ec2:DescribeInstanceTypes",
              "ec2:DescribeVolumes",
              "ec2:ModifyVolume",
              "ec2:DescribeImages",
              "ec2:RunInstances",
              "ec2:CreateSecurityGroup",
              "ec2:DescribeTags",
              "ec2:CreateTags",
              "ec2:DeleteTags",
              "compute-optimizer:GetEnrollmentStatus"
          ],
          "Resource": "*",
          "Effect": "Allow"
      }
  • ELB permissions:

    • Manage Elastic Load Balancer (ELB) resources

      elasticloadbalancing:*
  • S3 permissions:

    • Manage S3 buckets and perform read/write operations on buckets and their contents (for specific buckets)

      {
          "Condition": {
                  "StringEquals": {
                          "aws:ResourceTag/resource-created-by": [
                                  "velodb"
                          ]
                  }
          },
          "Action": [
                  "s3:*"
          ],
          "Resource": [
                  "arn:aws:s3:::velodb-bucket-*/*",
                  "arn:aws:s3:::velodb-bucket-*"
          ],
          "Effect": "Allow"
      }
  • IAM & STS permissions:

    • IAM & STS service related

      {
              "Action": [
                      "sts:GetCallerIdentity",
                      "sts:AssumeRole",
                      "iam:CreateInstanceProfile"
              ],
              "Resource": "*",
              "Effect": "Allow"
      },
      {
              "Condition": {
                      "StringEquals": {
                              "iam:PassedToService": [
                                      "ec2.amazonaws.com"
                              ]
                      }
              },
              "Action": [
                      "iam:PassRole",
                      "iam:AddRoleToInstanceProfile"
              ],
              "Resource": "arn:aws:iam::*:role/velodb-*",
              "Effect": "Allow"
      },
      {
              "Condition": {
                      "StringEquals": {
                              "iam:AWSServiceName": [
                                      "elasticloadbalancing.amazonaws.com"
                              ]
                      }
              },
              "Action": [
                      "iam:CreateServiceLinkedRole"
              ],
              "Resource": "*",
              "Effect": "Allow"
      }