DataOpsSuite Deployment
In Account A, deploy an AWS Postgres RDS instance and an EKS cluster that hosts DataOpsServer.
DataOpsServer runs on port 6055, which must be exposed to end users. The Postgres RDS instance (port 5432) must be reachable from DataOpsServer and from the EMR instances in Account B.
In Account B, deploy EMR as the DataOpsEngine.
The engine service runs on port 8998 and must be reachable from end users and from DataOpsServer.
Account A Requirements:
AWS Postgres RDS
- Provision an RDS instance with Postgres.
- Instance type: 4 core, 16 GB RAM.
- Storage: 100 GB with autoscaling enabled.
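The RDS requirements above could be met with the AWS CLI roughly as follows. This is a minimal dry-run sketch, not the project's official procedure: the identifier `dataops-postgres`, the instance class `db.m6i.xlarge` (one class that offers 4 vCPUs and 16 GiB), the user name, and the 500 GB autoscaling ceiling are all assumptions, and networking options (subnet group, security groups) are omitted. Storage autoscaling is enabled by setting `--max-allocated-storage` above the allocated size.

```shell
#!/bin/sh
# Dry-run sketch: prints the AWS CLI command unless DRY_RUN=0 is set.
set -eu

# Hypothetical values (not from the original document); replace with your own.
DB_ID="dataops-postgres"
DB_CLASS="db.m6i.xlarge"   # assumed class with 4 vCPUs and 16 GiB RAM
DB_USER="postgres"
MAX_STORAGE_GB=500         # assumed autoscaling ceiling

# run: echo the command in dry-run mode, execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

RDS_PREVIEW=$(run aws rds create-db-instance \
  --db-instance-identifier "$DB_ID" \
  --engine postgres \
  --db-instance-class "$DB_CLASS" \
  --allocated-storage 100 \
  --max-allocated-storage "$MAX_STORAGE_GB" \
  --master-username "$DB_USER" \
  --master-user-password "<db-password>")

# Review the command before running it for real with DRY_RUN=0.
echo "$RDS_PREVIEW"
```

Keeping the dry run as the default makes it harder to create a billable instance by accident while the values are still placeholders.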
EKS Cluster
- Create an EKS cluster.
- Node pool with 1 node:
  - Instance type: m6i.2xlarge.
  - Add label: `app: dataopsserver`.
- Create a datagaps namespace.
- Set up a Persistent Volume Claim (PVC):
  - Name: `datagapspvc`.
  - Backed by EFS file system.
- Create a role in Account A:
  - Must be able to assume roles in Account B.
- Create a Service Account:
  - Name: `datagaps` in the `datagaps` namespace.
- Create Role and Role Bindings:
  - Use the `role&rolebinding.yaml` file for role and role bindings in the `datagaps` namespace within the EKS cluster.
- Create a Secret for RDS password:
  - Name: `db-pwd`, containing the Postgres RDS password.
  - Command to create the secret: `kubectl -n datagaps create secret generic db-pwd --from-literal=DB_PASSWORD=<db-password>`
- Annotate the `datagaps` service account with the role in Account A:
  `kubectl annotate serviceaccount datagaps -n datagaps eks.amazonaws.com/role-arn=arn:aws:iam::<account-number>:role/<role-name>`
- Deploy the `dataopsserver.yaml` file:
  - Update Environment Variables before deployment.
- Expose the dataopsserver service, which is running on port 6055.
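The EFS-backed PVC step can be sketched as a generated manifest. This assumes the AWS EFS CSI driver is already installed in the cluster and uses dynamic provisioning through an access point. The file system ID `fs-0123456789abcdef0`, the StorageClass name `efs-sc`, and the 5Gi request are placeholders (Kubernetes requires a storage request even though EFS does not enforce it). Only the PVC name `datagapspvc` and the `datagaps` namespace come from the steps above.

```shell
#!/bin/sh
set -eu

# Hypothetical EFS file system ID; replace with the ID of your file system.
EFS_ID="fs-0123456789abcdef0"

# Write a StorageClass for the EFS CSI driver plus the PVC that uses it.
cat > datagaps-pvc.yaml <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: ${EFS_ID}
  directoryPerms: "700"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datagapspvc
  namespace: datagaps
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
EOF

echo "Wrote datagaps-pvc.yaml"
# Apply once the datagaps namespace exists:
#   kubectl apply -f datagaps-pvc.yaml
```

`ReadWriteMany` is used here because EFS supports concurrent mounts from multiple pods. If the supplied `dataopsserver.yaml` expects a different access mode or a statically provisioned volume, follow that file instead.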
S3 Bucket
- Create an S3 bucket.
- Add a bucket policy to allow access from Account B.
- Copy the required EMR files into the bucket:
  - Files from the link: DataOps EMR v2024.4.0.0.
- Update the datagaps.properties file:
  - Located in `jars/files/datagaps.properties`.
  - Replace the first three lines with the corresponding values from the server file at `/opt/datagaps/DataOpsServer/conf/datagaps.properties`.
- Update the CloudFormation template (`emr-imdsv2_enforced.yaml`):
  - Adjust for required subnets, key pairs, instance types, roles, and bucket name.
  - Use the updated file provided.
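A bucket policy granting Account B access might look like the sketch below. The bucket name `dataops-emr-artifacts` and the account ID `111122223333` are placeholders, and the read-only action list is an assumption; widen it if EMR also needs to write logs or output to this bucket. Granting to the account root delegates the decision to IAM policies in Account B, so the role there still needs matching S3 permissions.

```shell
#!/bin/sh
set -eu

# Hypothetical values; replace with your bucket name and Account B's ID.
BUCKET="dataops-emr-artifacts"
ACCOUNT_B_ID="111122223333"

# Write a bucket policy that lets principals in Account B list and read objects.
cat > bucket-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccountBRead",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::${ACCOUNT_B_ID}:root" },
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::${BUCKET}",
        "arn:aws:s3:::${BUCKET}/*"
      ]
    }
  ]
}
EOF

echo "Wrote bucket-policy.json for bucket ${BUCKET}"
# Attach it from Account A:
#   aws s3api put-bucket-policy --bucket "$BUCKET" --policy file://bucket-policy.json
```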
Account B Requirements:
- Role Creation
  - Create a role to manage the CloudFormation stack and EMR.
  - Grant access to the S3 bucket in Account A.
  - Ensure the role can be assumed by a role in Account A (refer to `AccountBpolicy.json`).
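The cross-account trust relationship can be sketched as follows. This is not a copy of `AccountBpolicy.json`, whose contents are not shown here. It is a generic trust policy, and the account ID `444455556666` and role names `dataops-server-role` and `dataops-emr-role` are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical values; replace with Account A's ID and the role created there.
ACCOUNT_A_ID="444455556666"
ACCOUNT_A_ROLE="dataops-server-role"

# Trust policy for the Account B role: only the named Account A role may assume it.
cat > account-b-trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::${ACCOUNT_A_ID}:role/${ACCOUNT_A_ROLE}" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

echo "Wrote account-b-trust-policy.json"
# Create the role in Account B (role name is a placeholder):
#   aws iam create-role --role-name dataops-emr-role \
#     --assume-role-policy-document file://account-b-trust-policy.json
```

The trust policy covers only one side of the relationship. The role in Account A also needs an IAM policy allowing `sts:AssumeRole` on the Account B role's ARN.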