You have an Azure subscription that contains a Microsoft Purview account named MP1, an Azure data factory named DF1, and a storage account named storage1. MP1 is configured to scan storage1. DF1 is connected to MP1 and contains a dataset named DS1. DS1 references a file in storage1. In DF1, you plan to create a pipeline that will process data from DS1. You need to review the schema and lineage information in MP1 for the data referenced by DS1. Which two features can you use to locate the information? Each correct answer presents a complete solution. NOTE: Each correct answer is worth one point.
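For context, schema and lineage that MP1 has catalogued can also be inspected programmatically through the Purview Atlas v2 lineage REST API, in addition to the portal experiences the question asks about. A minimal sketch follows; the account endpoint, asset GUID, and bearer token are placeholders, not values from the question.

# A minimal sketch, assuming an Azure AD bearer token is already available and
# that the asset GUID was located beforehand (e.g., via the Purview search API).
# "mp1" and the GUID below are hypothetical placeholders.
import requests

PURVIEW_ENDPOINT = "https://mp1.purview.azure.com"  # hypothetical account endpoint
ASSET_GUID = "<asset-guid>"                          # GUID of the file referenced by DS1
TOKEN = "<bearer-token>"                             # token scoped to https://purview.azure.net

# The Atlas v2 lineage API returns upstream/downstream relationships for an asset.
resp = requests.get(
    f"{PURVIEW_ENDPOINT}/catalog/api/atlas/v2/lineage/{ASSET_GUID}",
    params={"depth": 3, "direction": "BOTH"},
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()
lineage = resp.json()
print(list(lineage.get("guidEntityMap", {}).keys()))  # entities in the lineage graph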
You have an Azure Data Lake Storage account that contains a staging zone. You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics. Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes a mapping data flow, and then inserts the data into the data warehouse. Does this meet the goal?
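For reference, the schedule trigger portion of the proposed solution would be defined as JSON in Data Factory; the sketch below expresses that definition as a Python dict. The trigger and pipeline names are illustrative, not from the question, and the question itself turns on whether a mapping data flow can carry out every required step, including running the R script.

# A sketch of a daily Data Factory schedule trigger definition, expressed as a
# Python dict. Names are hypothetical placeholders.
import json

schedule_trigger = {
    "name": "DailyStagingTrigger",  # hypothetical trigger name
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",               # once per day for the daily ingest
                "interval": 1,
                "startTime": "2024-01-01T02:00:00Z",
                "timeZone": "UTC",
            }
        },
        "pipelines": [
            {
                # The referenced pipeline would contain an Execute Data Flow
                # activity running the mapping data flow from the solution.
                "pipelineReference": {
                    "referenceName": "IngestStagingPipeline",  # hypothetical
                    "type": "PipelineReference",
                }
            }
        ],
    },
}

print(json.dumps(schedule_trigger, indent=2))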
You have an Azure Synapse Analytics dedicated SQL pool that contains a large fact table. The table contains 50 columns and 5 billion rows and is a heap. Most queries against the table aggregate values from approximately 100 million rows and return only two columns. You discover that the queries against the fact table are very slow. Which type of index should you add to provide the fastest query times?
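As background on one of the candidate index types, a heap in a dedicated SQL pool can be converted to a clustered columnstore index with a single T-SQL statement; columnstore storage suits scans that aggregate many rows while reading only a few columns. The sketch below runs that statement via pyodbc; the server, database, credentials, and table name are placeholders.

# A minimal sketch, assuming pyodbc and an ODBC driver are installed; the
# connection details and table name are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myworkspace.sql.azuresynapse.net;"  # hypothetical dedicated SQL pool endpoint
    "DATABASE=SQLPool1;"
    "UID=sqladmin;PWD=<password>",
    autocommit=True,
)

# Converting a heap to a clustered columnstore index; segment elimination and
# column-level compression benefit narrow aggregations over large row counts.
conn.cursor().execute(
    "CREATE CLUSTERED COLUMNSTORE INDEX cci_FactTable ON dbo.FactTable;"
)
conn.close()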
You plan to use an Apache Spark pool in Azure Synapse Analytics to load data to an Azure Data Lake Storage Gen2 account. You need to recommend which file format to use to store the data in the Data Lake Storage account. The solution must meet the following requirements:
* Column names and data types must be defined within the files loaded to the Data Lake Storage account.
* Data must be accessible by using queries from an Azure Synapse Analytics serverless SQL pool.
* Partition elimination must be supported without having to specify a specific partition.
What should you recommend?
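As an illustration of one candidate format, a Synapse Spark pool can write partitioned Parquet files (which embed column names and data types in the files themselves) to ADLS Gen2. This is a sketch for a Spark pool notebook; the storage account, container, and partition columns are placeholders.

# A minimal sketch for a Synapse Spark pool notebook; the storage paths and
# column names are hypothetical. Parquet stores the schema inside the files,
# and partitioned folders allow downstream partition elimination.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.read.csv(
    "abfss://staging@mystorageacct.dfs.core.windows.net/raw/",  # hypothetical source
    header=True,
    inferSchema=True,
)

# Writing one folder per year/month value lets serverless SQL pool queries skip
# partitions via filter pushdown without naming a specific partition.
(df.write
   .partitionBy("year", "month")   # assumes the data has year/month columns
   .mode("overwrite")
   .parquet("abfss://curated@mystorageacct.dfs.core.windows.net/fact/"))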
You plan to create an Azure Data Factory pipeline that will include a mapping data flow. You have JSON data containing objects that have nested arrays. You need to transform the JSON-formatted data into a tabular dataset. The dataset must have one row for each item in the arrays. Which transformation method should you use in the mapping data flow?
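Mapping data flows are authored in the Data Factory UI rather than in code, so as an analogous illustration of the operation the question describes, the PySpark sketch below unrolls a nested array so that each array element becomes its own row. The JSON structure and field names are illustrative only.

# An analogous PySpark sketch (not a mapping data flow itself): explode()
# produces one output row per element of a nested array, which is the tabular
# shape the question asks for. The JSON and field names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode

spark = SparkSession.builder.getOrCreate()

data = '[{"orderId": 1, "items": [{"sku": "A"}, {"sku": "B"}]}]'
df = spark.read.json(spark.sparkContext.parallelize([data]))

# One output row per element of the "items" array.
flat = (df.select("orderId", explode("items").alias("item"))
          .select("orderId", "item.sku"))
flat.show()
# +-------+---+
# |orderId|sku|
# +-------+---+
# |      1|  A|
# |      1|  B|
# +-------+---+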