Prerequisites

1. The DataOpsEngine YARN cluster setup requires a minimum of 3 nodes (one master node and two worker nodes)


2. Install Ansible on the master node

            sudo yum install epel-release -y

            sudo yum install ansible -y

3. One common user with root privileges must be available on all nodes


4. All nodes must be able to communicate with each other and with themselves over passwordless SSH (as the common user above, using hostnames)
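
The passwordless SSH requirement can be met in several ways. The following is a minimal sketch using the standard OpenSSH tools `ssh-keygen` and `ssh-copy-id`. It only prints the commands so they can be reviewed before they are run. The function names, the user name and the hostnames used when calling them are placeholders for illustration and do not come from the deployment package.

```shell
#!/usr/bin/env bash
# Sketch: print the commands that set up passwordless SSH for one user.
# The user and hostnames passed in are placeholders, not values from this guide.

ssh_setup_commands() {
    local user="$1"
    shift
    # Create a key pair for the common user (run once on the node you start from).
    echo "ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa"
    # Copy the public key to every node, including the local one.
    local node
    for node in "$@"; do
        echo "ssh-copy-id ${user}@${node}"
    done
}

# Print one check per node; BatchMode makes ssh fail instead of asking for a password.
ssh_check_commands() {
    local user="$1"
    shift
    local node
    for node in "$@"; do
        echo "ssh -o BatchMode=yes ${user}@${node} hostname"
    done
}
```

For example, `ssh_setup_commands <common user> <master hostname> <worker hostnames...>` prints one `ssh-keygen` line and one `ssh-copy-id` line per node. Because every node has to reach every other node, the same steps would need to be repeated from each node.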


Hardware Requirements:


1. Memory - 128GB


2. CPU Cores - 16 Cores


3. Hard disk size - 500GB (in the installation path)
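
A node can be compared against these minimums before the deployment starts. The sketch below uses `free`, `nproc` and GNU `df`. The function names are illustrative, and reading the disk figure as available space in the installation path is an assumption, since the requirement could also mean total disk size.

```shell
#!/usr/bin/env bash
# Sketch: compare the local node against the minimums listed above.

MIN_MEM_GB=128
MIN_CORES=16
MIN_DISK_GB=500

# Print OK or LOW for one measurement.
report() {
    local label="$1"
    local actual="$2"
    local required="$3"
    if [ "$actual" -ge "$required" ]; then
        echo "OK   $label: $actual (minimum $required)"
    else
        echo "LOW  $label: $actual (minimum $required)"
    fi
}

# Measure the local node; install_path is the directory you plan to install into.
check_node() {
    local install_path="$1"
    local mem_gb cores disk_gb
    # free -g rounds down to whole GiB, so a 128 GB machine can report slightly less.
    mem_gb="$(free -g | awk '/^Mem:/ {print $2}')"
    cores="$(nproc)"
    # GNU df: available space in 1 GiB blocks on the filesystem holding install_path
    # (assumption: the requirement refers to free space, not total size).
    disk_gb="$(df -BG --output=avail "$install_path" | tail -n 1 | tr -dc '0-9')"
    report "memory (GiB)" "$mem_gb" "$MIN_MEM_GB"
    report "CPU cores" "$cores" "$MIN_CORES"
    report "disk in $install_path (GiB)" "$disk_gb" "$MIN_DISK_GB"
}
```

Running `check_node` with the intended installation directory prints one line per requirement. A `LOW` memory line on a machine sold as 128 GB is usually the rounding noted in the comment rather than a real shortfall.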


Skill Requirements:


1. To deploy the YARN cluster through Ansible, the deployment user must be familiar with Linux administration


2. After the deployment, managing and monitoring the Hadoop cluster requires a grasp of Hadoop fundamentals



Cluster Environment Setup


1. Download the DataOpsEngine cluster deployment file provided by Datagaps to the master node


2. Extract the downloaded file

           

3. Create a file named workers and add all worker node hostnames to it


4. Open the host.yaml file and add the master node hostname in the namenode section, the worker nodes in the worker node section, and both the master and worker nodes in the app section


5. In the same host.yaml file, add the root-privileged common user in the ansible_user section
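
The workers file and the inventory can be prepared and checked from the shell. The sketch below assumes the workers file holds one hostname per line, which is the layout Hadoop itself uses for its workers file. The function name and the hostnames in the example comment are placeholders. The two Ansible commands in the comments are standard ones and do not depend on the exact layout of host.yaml.

```shell
#!/usr/bin/env bash
# Sketch: create the workers file with one hostname per line (assumed layout).

write_workers_file() {
    local target="$1"
    shift
    # Each remaining argument is written on its own line.
    printf '%s\n' "$@" > "$target"
}

# Example with placeholder hostnames (not taken from this guide):
#   write_workers_file workers worker01 worker02
#
# After editing host.yaml, two standard Ansible commands can confirm that it
# parses and that every node is reachable as the configured user:
#   ansible-inventory -i host.yaml --list
#   ansible all -i host.yaml -m ping
```

If `ansible all -i host.yaml -m ping` fails for a node, the usual causes are a hostname that does not resolve or a missing SSH key for the common user on that node.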




Steps:

1. Run the command below as the common user to install Hadoop

          The playbook below prompts for the common user password, master node hostname, Hadoop installation location, deployment user (the common user), and the deployment user's home directory

  

                    ansible-playbook hadoop.yaml --ask-become-pass -i host.yaml

       


2. Run the commands below to start the Hadoop and YARN clusters

          The playbook below prompts for the common user password


                    ansible-playbook starthadoop.yaml --ask-become-pass -i host.yaml

                    start-yarn.sh


3. Run the command below to install Spark and Livy

          The playbook below prompts for the common user password, master node hostname, Hadoop installation location, and deployment user (the common user)


                    ansible-playbook spark.yaml --ask-become-pass -i host.yaml
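
Once the steps above have finished, the state of the cluster can be checked from the master node with the standard Hadoop command-line tools. The sketch below counts the NodeManagers that `yarn node -list` reports as RUNNING. The function name is illustrative, and the parsing assumes the usual column layout in which the second field of each node row is the node state.

```shell
#!/usr/bin/env bash
# Sketch: count NodeManagers that `yarn node -list` reports as RUNNING.
# Assumes the second whitespace-separated field of each node row is the state.

count_running_nodes() {
    awk '$2 == "RUNNING" { n++ } END { print n + 0 }'
}

# Typical use on the master node, once Hadoop's bin directory is on PATH:
#   yarn node -list | count_running_nodes
#
# Other standard checks:
#   jps                      # lists the Java daemons running on this node
#   hdfs dfsadmin -report    # shows live DataNodes and their capacity
```

With the minimum layout of one master node and two worker nodes, a count of 2 would be the expected result. A lower number suggests that a NodeManager did not start on one of the workers.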