DataOpsSuite Deployment


In Account A, deploy an AWS Postgres RDS instance and an EKS cluster that hosts DataOpsServer.

DataOpsServer listens on port 6055, which must be exposed to end users. The AWS Postgres RDS port 5432 must be reachable from DataOpsServer and from the EMR instances in Account B.

In Account B, deploy EMR as the DataOpsEngine.

The engine service listens on port 8998, which must be reachable from end users and from DataOpsServer.
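
The port requirements above translate into security group ingress rules. The following is a minimal sketch under stated assumptions, not a definitive setup: the security group IDs and CIDR ranges are placeholders, and it presumes the two accounts already have network connectivity (for example, through VPC peering). By default the script only prints each aws command; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: every ID and CIDR below is a placeholder assumption.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# allow_tcp <security-group-id> <port> <source-cidr>
allow_tcp() {
  run aws ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port "$2" --cidr "$3"
}

SERVER_SG="${SERVER_SG:-sg-server-placeholder}"   # DataOpsServer nodes (Account A)
RDS_SG="${RDS_SG:-sg-rds-placeholder}"            # Postgres RDS (Account A)
ENGINE_SG="${ENGINE_SG:-sg-engine-placeholder}"   # EMR cluster (Account B)
USER_CIDR="${USER_CIDR:-203.0.113.0/24}"          # end-user network
SERVER_CIDR="${SERVER_CIDR:-10.0.0.0/16}"         # Account A VPC range
EMR_CIDR="${EMR_CIDR:-10.1.0.0/16}"               # Account B VPC range

open_ports() {
  allow_tcp "$SERVER_SG" 6055 "$USER_CIDR"     # end users -> DataOpsServer
  allow_tcp "$RDS_SG"    5432 "$SERVER_CIDR"   # DataOpsServer -> RDS
  allow_tcp "$RDS_SG"    5432 "$EMR_CIDR"      # EMR instances -> RDS
  allow_tcp "$ENGINE_SG" 8998 "$USER_CIDR"     # end users -> engine
  allow_tcp "$ENGINE_SG" 8998 "$SERVER_CIDR"   # DataOpsServer -> engine
}

open_ports
```

The rule for the engine security group has to be applied with Account B credentials, since that group lives in Account B; the other rules belong to Account A.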

Account A Requirements:

  1. AWS Postgres RDS

    • Provision an RDS instance with Postgres.
    • Instance size: 4 vCPUs, 16 GB RAM.
    • Storage: 100 GB with autoscaling enabled.
  2. EKS Cluster

    • Create an EKS cluster.
    • Node group with 1 node:
      • Instance type: m6i.2xlarge.
      • Add label: app: dataopsserver.
    • Create a datagaps namespace.
    • Set up a Persistent Volume Claim (PVC):
      • Name: datagapspvc.
      • Backed by EFS file system.
    • Create an IAM role in Account A:
      • Must be able to assume roles in Account B.
    • Create a Service Account:
      • Name: datagaps in the datagaps namespace.
    • Create Role and Role Bindings:
      • Apply the role&rolebinding.yaml file to create the Role and RoleBinding in the datagaps namespace of the EKS cluster.
    • Create a Secret for RDS password:
      • Name: db-pwd, containing the Postgres RDS password.
      • Command to create the secret:
        kubectl -n datagaps create secret generic db-pwd --from-literal=DB_PASSWORD=<db-password>
    • Annotate the datagaps service account with the role in Account A:
      kubectl annotate serviceaccount datagaps -n datagaps eks.amazonaws.com/role-arn=arn:aws:iam::<account-number>:role/<role-name>
    • Deploy the dataopsserver.yaml file:
      • Update the environment variables before deployment.
      • Expose the dataopsserver service, which runs on port 6055.
  3. S3 Bucket

    • Create an S3 bucket.
    • Add a bucket policy to allow access from Account B.
    • Copy the required EMR files into the bucket.
    • Update the datagaps.properties file:
      • Located in jars/files/datagaps.properties.
      • Replace the first three lines with the corresponding values from the server file at /opt/datagaps/DataOpsServer/conf/datagaps.properties.
    • Update the CloudFormation template (emr-imdsv2_enforced.yaml):
      • Adjust for required subnets, key pairs, instance types, roles, and bucket name.
      • Use the updated file provided.
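
Two of the Account A steps above, the RDS instance and the bucket policy, can be sketched with the AWS CLI as follows. This is an illustration under assumptions rather than the required configuration: the instance identifier, the db.m6i.xlarge class (one class that offers 4 vCPUs and 16 GiB), the 500 GB autoscaling ceiling, the master username, the account ID, the bucket name, and the read-only action list are all placeholders to adjust. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: names, sizes, and IDs below are placeholder assumptions.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

ACCOUNT_B_ID="${ACCOUNT_B_ID:-222222222222}"   # hypothetical Account B ID
BUCKET="${BUCKET:-dataops-emr-files}"          # hypothetical bucket name

# RDS: setting --max-allocated-storage above --allocated-storage
# turns on storage autoscaling up to that ceiling (in GB).
create_rds() {
  run aws rds create-db-instance \
    --db-instance-identifier dataops-postgres \
    --engine postgres \
    --db-instance-class db.m6i.xlarge \
    --allocated-storage 100 \
    --max-allocated-storage 500 \
    --master-username postgres \
    --master-user-password "${DB_PASSWORD:-CHANGE_ME}"
}

# S3: write a bucket policy that lets Account B list and read the bucket.
write_bucket_policy() {
  cat > bucket-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccountBRead",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::${ACCOUNT_B_ID}:root" },
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::${BUCKET}",
        "arn:aws:s3:::${BUCKET}/*"
      ]
    }
  ]
}
EOF
}

create_rds
write_bucket_policy
run aws s3api put-bucket-policy --bucket "$BUCKET" \
  --policy file://bucket-policy.json
```

Granting the Account B root principal delegates the decision to IAM policies in Account B; the principal can be narrowed to the specific EMR role ARN once that role exists. If EMR also needs to write to the bucket (logs, for instance), the action list would have to be extended accordingly.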

Account B Requirements:

  1. Role Creation
    • Create a role to manage the CloudFormation stack and EMR.
    • Grant access to the S3 bucket in Account A.
    • Ensure the role can be assumed by a role in Account A (refer to AccountBpolicy.json).
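
The cross-account trust in the last step can be sketched as below. This is a hypothetical example and not the content of AccountBpolicy.json: the account ID and both role names are placeholders. The script writes a trust policy that names the Account A role as principal and then prints (or, with DRY_RUN=0, runs) the command that creates the Account B role with it.

```shell
#!/bin/sh
# Sketch only: account ID and role names are placeholder assumptions.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

ACCOUNT_A_ID="${ACCOUNT_A_ID:-111111111111}"          # hypothetical Account A ID
ACCOUNT_A_ROLE="${ACCOUNT_A_ROLE:-dataops-server}"    # role annotated on the service account
ACCOUNT_B_ROLE="${ACCOUNT_B_ROLE:-dataops-emr-admin}" # role to create in Account B

# Trust policy: only the named Account A role may assume the Account B role.
write_trust_policy() {
  cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${ACCOUNT_A_ID}:role/${ACCOUNT_A_ROLE}"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

write_trust_policy
run aws iam create-role --role-name "$ACCOUNT_B_ROLE" \
  --assume-role-policy-document file://trust-policy.json
```

The trust policy covers only one side of the relationship. The role in Account A also needs an IAM policy that allows sts:AssumeRole on the Account B role's ARN, and the Account B role still needs its own permissions for CloudFormation, EMR, and the S3 bucket in Account A.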