When Livy Server Runs on Windows OS
The following are the steps to create a data source connection for Kerberized HDFS:
- Obtain the Kerberos principal name, hdfs.keytab, and krb5.conf for the environment you are connecting to. Place them in a folder of your choice, and make sure that the folder path does not contain any spaces.
- Rename krb5.conf to krb5.ini, then place it in C:\windows\
- Log in to DataOps Suite.
- On the side menu, click Data Sources.
- Click Data Source +.
- Select a File type (for example, CSV, XML, AVRO, and so on) and select Files in HDFS Location.
- Specify the following Data Source Information:
  - Name. Enter a name for the connection.
  - DataSystem. Select the DataSystem under which this connection should be created.
  - Format. The format of the selected file. This is auto-filled based on your selection.
  - Extensions. The extension of the selected file. This is auto-filled based on your selection.
  - File Location. Specify the location of the file on the HDFS machine. Example location: hdfs://"hostname":"port"/"path"/
  - Is Kerberos Enabled. Select this checkbox to enable Kerberos for the HDFS connection.
  - Kerberos Conf Location. Specify the Kerberos conf file location as c:/windows/krb5.conf
  - Principal. Specify the principal name. Example principal name: hdfs/quickstart.cloudera@CLOUDERA
  - KeyTab file location. Specify the keytab file location on the Livy server host (example location: c:/hdfs/hdfs.keytab).
- Click Save to save the connection.
- If the HDFS data source hostname or the Kerberos realm hostname is not resolvable, add an entry at the end of the file "c:\windows\system32\drivers\etc\hosts" in the form "ipaddress hostname".
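
Before entering the values in the form, it can help to sanity-check them locally. The following Python sketch is only an illustration and is not part of DataOps Suite: the helper names, the hostname namenode.example.com, the port 8020, and the path /data/input are all assumptions chosen for the example. It checks a folder path for spaces, converts a Windows path to the forward-slash form used in the examples above, and builds a File Location value.

```python
from pathlib import PureWindowsPath


def has_spaces(path: str) -> bool:
    """Return True if the path contains a space, which the steps above say to avoid."""
    return " " in path


def to_forward_slashes(windows_path: str) -> str:
    """Convert a backslash-separated Windows path to forward-slash form."""
    return PureWindowsPath(windows_path).as_posix()


def hdfs_location(hostname: str, port: int, path: str) -> str:
    """Build a File Location value of the form hdfs://hostname:port/path/."""
    return f"hdfs://{hostname}:{port}/{path.strip('/')}/"


if __name__ == "__main__":
    keytab = r"C:\hdfs\hdfs.keytab"  # hypothetical keytab location
    print(has_spaces(keytab))
    print(to_forward_slashes(keytab))
    # Hostname, port, and path below are placeholders, not values from any real cluster.
    print(hdfs_location("namenode.example.com", 8020, "/data/input"))
```

Replace the placeholder values with the ones supplied by your Hadoop administrator.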
For Linux OS
The following are the steps to create an HDFS connection with Kerberos on the Linux operating system:
- Obtain the Kerberos principal name, hdfs.keytab, and krb5.conf for the environment you are connecting to. Place them in a folder of your choice.
- Log in to DataOps Suite.
- On the side menu, click Data Sources.
- Click Data Source +.
- Select a File type (for example, CSV, XML, AVRO, and so on) and select Files in HDFS Location.
- Specify the following Data Source Information:
  - Name. Enter a name for the connection.
  - DataSystem. Select the DataSystem under which this connection should be created.
  - Format. The format of the selected file. This is auto-filled based on your selection.
  - Extensions. The extension of the selected file. This is auto-filled based on your selection.
  - File Location. Specify the location of the file on the HDFS machine. Example location: hdfs://"hostname":"port"/"path"/
  - Is Kerberos Enabled. Select this checkbox to enable Kerberos for the HDFS connection.
  - Kerberos Conf Location. Specify the path to the Kerberos conf file.
  - Principal. Specify the principal name. Example principal name: hdfs/quickstart.cloudera@CLOUDERA
  - KeyTab file location. Specify the path to the keytab file.
- Click Save to save the connection.
- If the HDFS data source hostname or the Kerberos realm hostname is not resolvable, add an entry at the end of the file "/etc/hosts" in the form "ipaddress hostname".
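
To decide whether a hosts entry is needed, you can test name resolution from the Livy server host. The Python sketch below is a hedged illustration rather than a DataOps Suite feature: the function names and the IP address 10.0.0.5 are hypothetical. It splits a principal of the standard primary/instance@REALM form, checks whether a hostname resolves, and formats the "ipaddress hostname" line to append when it does not.

```python
import socket
from typing import Callable, NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(principal: str) -> Principal:
    """Split a Kerberos principal of the form primary/instance@REALM."""
    name, sep, realm = principal.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError(f"not a principal of the form name@REALM: {principal!r}")
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


def is_resolvable(hostname: str,
                  resolver: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if the hostname resolves to an address on this machine."""
    try:
        resolver(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def hosts_entry(ip_address: str, hostname: str) -> str:
    """Format the line to append to the hosts file."""
    return f"{ip_address} {hostname}"


if __name__ == "__main__":
    p = parse_principal("hdfs/quickstart.cloudera@CLOUDERA")
    if p.instance and not is_resolvable(p.instance):
        # 10.0.0.5 is a placeholder; use the real address of the host.
        print("Append to the hosts file:", hosts_entry("10.0.0.5", p.instance))
```

The resolver parameter exists only so the check can be exercised without network access; in normal use the default, socket.gethostbyname, is sufficient.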