Connecting from Databricks

Required Elements

1. Storage account name

2. Access key

3. Container (file system) name

4. Path to the files' location


In the Databricks cluster's advanced Spark configuration, add the line below, replacing the storage account name and access key, then restart the cluster:


fs.azure.account.key.<storage-account-name>.dfs.core.windows.net <access-key>
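
As an alternative (for example, when you cannot restart the cluster), the same key can be set at session scope from a notebook. This is a minimal sketch; the storage account, container, and key values are hypothetical placeholders:

# Session-scoped alternative to the cluster-level Spark config
# (hypothetical storage account name and key):
spark.conf.set(
    "fs.azure.account.key.mystorageacct.dfs.core.windows.net",
    "<access-key>"
)

# Quick connectivity check against a hypothetical container:
display(dbutils.fs.ls("abfss://mycontainer@mystorageacct.dfs.core.windows.net/"))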


In the DataOps application, select the file type, then select the storage location "Files in Local Server".

In the file location section, enter the path below, replacing the placeholders with your environment's values:



abfss://<container-file-system>@<storage-account-name>.dfs.core.windows.net/<directory-path>
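
For example, with a hypothetical container named mycontainer, storage account mystorageacct, and directory input/customers:

abfss://mycontainer@mystorageacct.dfs.core.windows.net/input/customers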


After saving the connection, select the file connection in the dataflow, enter the file name, and run the component.


Required Settings for the Standalone DataOps Engine to Connect to ADLS Gen2 Files


Download the files below and place them in the installation directory DataOpsEngine/spark/jars/ (a download sketch follows the list):

https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-azure/3.2.1/hadoop-azure-3.2.1.jar

https://repo1.maven.org/maven2/org/wildfly/openssl/wildfly-openssl-java/2.1.0.Final/wildfly-openssl-java-2.1.0.Final.jar
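
If you want to script the download, here is a minimal Python sketch; it assumes you run it from the directory containing the DataOpsEngine installation:

# Download the two required jars into the engine's jar directory.
import urllib.request

JARS = [
    "https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-azure/3.2.1/hadoop-azure-3.2.1.jar",
    "https://repo1.maven.org/maven2/org/wildfly/openssl/wildfly-openssl-java/2.1.0.Final/wildfly-openssl-java-2.1.0.Final.jar",
]

for url in JARS:
    dest = "DataOpsEngine/spark/jars/" + url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, dest)  # fetch each jar into spark/jars/
    print("Downloaded", dest)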


Add the lines below to spark-defaults.conf, located in DataOpsEngine/spark/conf/:

fs.defaultFS abfss://<container-file-system>@<storage-account-name>.dfs.core.windows.net

fs.azure.account.key.<storage-account-name>.dfs.core.windows.net <access-key>
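
For example, with the same hypothetical container and storage account as above:

fs.defaultFS abfss://mycontainer@mystorageacct.dfs.core.windows.net

fs.azure.account.key.mystorageacct.dfs.core.windows.net <access-key>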



Create the core-site.xml file with the content below, replacing the environment details, and place it in DataOpsEngine/spark/conf/:



<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>abfss://<container-file-system>@<storage-account-name>.dfs.core.windows.net</value>
    </property>

    <property>
        <name>fs.azure.account.auth.type.<storage-account-name>.dfs.core.windows.net</name>
        <value>SharedKey</value>
    </property>

    <property>
        <name>fs.azure.account.key.<storage-account-name>.dfs.core.windows.net</name>
        <value><access-key></value>
    </property>
</configuration>



After this, stop and start the DataOps Engine.
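
To verify that the standalone engine can reach ADLS Gen2, here is a minimal PySpark sketch (the container, account, and file names are hypothetical; replace them with real values):

from pyspark.sql import SparkSession

# Assumes the jars and config files above are already in place.
spark = SparkSession.builder.appName("adls-gen2-smoke-test").getOrCreate()

# Hypothetical path; point it at a real file in your container.
df = spark.read.option("header", "true").csv(
    "abfss://mycontainer@mystorageacct.dfs.core.windows.net/input/sample.csv"
)
df.show(5)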


In the DataOps application, select the file type, then select the storage location "Files in Local Server".

In the file location section, enter the path below, replacing the placeholders with your environment's values:



abfss://<container-file-system>@<storage-account-name>.dfs.core.windows.net/<directory-path>


After saving the connection, select the file connection in the dataflow, enter the file name, and run the component.