DataOpsSuite Deployment

In Account A, deploy an AWS Postgres RDS instance and an EKS cluster that hosts DataOpsServer.

In Account B, deploy EMR as the DataOpsEngine.


Account A requirements

1. AWS Postgres RDS instance with 4 cores and 16 GB of memory, and 100 GB of storage with autoscaling enabled

2. EKS Cluster

  1. Create a node pool (EKS node group) with 1 node of type m6i.2xlarge and the label app: dataopsserver
  2. Create the datagaps namespace
  3. Create a PVC (named datagapspvc) backed by an EFS file system
  4. Create a role in Account A that is allowed to assume the role in Account B
  5. Create a service account named datagaps in the datagaps namespace
  6. Create the role and role binding in the datagaps namespace of the EKS cluster. See the attached file role&rolebinding.yaml.
  7. Create a secret named db-pwd that holds the Postgres RDS password:
     kubectl -n datagaps create secret generic db-pwd --from-literal=DB_PASSWORD=<db password>
  8. Annotate the datagaps service account with the role created in Account A:
     kubectl annotate serviceaccount datagaps -n datagaps eks.amazonaws.com/role-arn=arn:aws:iam::<accountnumber>:role/<rolename>
  9. Deploy dataopsserver.yaml after updating its environment variables
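
The role in Account A needs two policy documents: a permissions policy that lets it assume the role in Account B, and a trust policy that lets the datagaps service account use the role once it is annotated. The following is a minimal Python sketch of both documents, not the project's actual policies. It assumes the cluster uses IAM Roles for Service Accounts (IRSA) with an OIDC provider already registered. The account IDs, role name, and OIDC provider path are hypothetical placeholders.

```python
import json

# Hypothetical placeholder values; substitute the real ones for your accounts.
ACCOUNT_A_ID = "111111111111"
ACCOUNT_B_ROLE_ARN = "arn:aws:iam::222222222222:role/account-b-emr-role"
OIDC_PROVIDER = "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"


def assume_role_permissions(target_role_arn):
    """Permissions policy letting the Account A role assume the Account B role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": target_role_arn,
            }
        ],
    }


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Trust policy letting one Kubernetes service account use the role (assumes IRSA)."""
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_provider}:sub": subject,
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


permissions = assume_role_permissions(ACCOUNT_B_ROLE_ARN)
trust = irsa_trust_policy(ACCOUNT_A_ID, OIDC_PROVIDER, "datagaps", "datagaps")

print(json.dumps(permissions, indent=2))
print(json.dumps(trust, indent=2))
```

The printed JSON can be saved to files and attached to the role through the console or the AWS CLI. Scoping the trust policy to the datagaps service account in the datagaps namespace keeps other workloads in the cluster from using the role.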

3. S3 Bucket

  1. Add a bucket policy that allows access from Account B
  2. Copy the required EMR files into the bucket from https://datagaps.s3.amazonaws.com/dataops-suite/DataOpsEMR_v2024.4.0.0.zip
  3. Update the first three lines of jars/files/datagaps.properties with the first three lines of the server's datagaps.properties (/opt/datagaps/DataOpsServer/conf/datagaps.properties)
  4. Update the CloudFormation YAML with the required subnets, key pair, instance type, roles, and bucket name. Use the attached file emr-imdsv2_enforced.yaml as the template.
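
Two of the bucket steps above can be sketched in Python: the cross-account bucket policy and the three-line update of datagaps.properties. This is an illustration under stated assumptions rather than the project's actual configuration. The bucket name, the Account B role ARN, the chosen S3 actions, and the sample property keys are all hypothetical, and the real set of actions may differ.

```python
import json

# Hypothetical placeholder values; substitute the real ones.
BUCKET_NAME = "dataops-emr-files"
ACCOUNT_B_ROLE_ARN = "arn:aws:iam::222222222222:role/account-b-emr-role"


def bucket_policy(bucket_name, principal_arn):
    """Bucket policy granting the Account B role list and object access (assumed actions)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAccountBList",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                "Sid": "AllowAccountBObjects",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


def replace_first_lines(server_props, emr_props, count=3):
    """Return emr_props with its first `count` lines taken from server_props."""
    server_lines = server_props.splitlines()
    emr_lines = emr_props.splitlines()
    return "\n".join(server_lines[:count] + emr_lines[count:]) + "\n"


policy = bucket_policy(BUCKET_NAME, ACCOUNT_B_ROLE_ARN)
print(json.dumps(policy, indent=2))

# Sample file contents with made-up keys, only to show the merge behaviour.
server_text = "db.url=server\ndb.user=server\ndb.name=server\nserver.only=1\n"
emr_text = "db.url=old\ndb.user=old\ndb.name=old\nengine.only=1\n"
merged = replace_first_lines(server_text, emr_text)
print(merged)
```

In practice, `server_text` would be read from /opt/datagaps/DataOpsServer/conf/datagaps.properties and the merged result written back to jars/files/datagaps.properties before the files are uploaded to the bucket.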


Account B requirements


1. Create a role that can manage the CloudFormation stack and EMR, has access to the S3 bucket in Account A, and can be assumed by the role in Account A (see AccountBpolicy.json).
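
For the role in Account A to assume this role, the Account B role needs a trust policy that names the Account A role as a principal. The following Python sketch shows only that trust relationship. The role ARN is a hypothetical placeholder, and the role's permissions (CloudFormation, EMR, S3) are those in AccountBpolicy.json, which is not reproduced here.

```python
import json

# Hypothetical placeholder; use the ARN of the role created in Account A.
ACCOUNT_A_ROLE_ARN = "arn:aws:iam::111111111111:role/account-a-dataops-role"


def cross_account_trust_policy(trusted_role_arn):
    """Trust policy allowing the given role (in another account) to assume this role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": trusted_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


trust_policy = cross_account_trust_policy(ACCOUNT_A_ROLE_ARN)
print(json.dumps(trust_policy, indent=2))
```

Cross-account access works only when both sides agree: this trust policy on the Account B role, and a matching sts:AssumeRole permission on the Account A role.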