Flatten / Parsing XML using Azure ADF (Azure data factory)

 

                         Flatting/Parsing Hierarchical XML using ADF (Azure data factory)


Sharing a method of flatting Hierarchical XML using an inbuild method in ADF  ( since its UI based I felt its easier than databricks option)

 To flatten “Hierarchical XML” you can use “Flatten” transform in ADF Dataflow.

 My source XML structure is mentioned below.



To Flatten the structure , we can create a ADF dataflow similar to screen shot mentioned below and use ‘Flatten’ transformation.


Steps involved :

Step 1: Create a dataflow in ADF

Step 2: Enable “Data flow debug” option.

·       This will create a cluster in the backend, for the dataflow to run.

·       Cost of this option is priced as 1(hour) x 8 (general purpose cores) x $0.274 = $2.19

·       This is required only during the development phase


Step 3: Select the Source, Import Schema in Projection tab


Data Preview:


Expanding the PARTY sub-set


Import Schema:


Step 4: Click on the plus symbol (+) and choose “Flatten” option



·       Select the Unroll By option (understanding: lowest array in the XML) and add the required field mapping

 





·   Refresh the data preview


 


Step 5: Add the sink (target)


Select the Data set as required.




Step 6: Create a pipeline to execute the data flow




·       DB screen shot:






Comments

Post a Comment

Popular posts from this blog

SCD 2 Using Merge statement in SQL server

Integer division to float division in SQL server