MS Fabric/ADF Architects
Company: Scout Exchange
Location: New York, NY 10025
Description:
Role Name: MS Fabric/ADF Architects
Location: NYC, NY
JOB DESCRIPTION:
Deep expertise in modern data architecture, with specific experience in Microsoft's data platform and Delta Lake architecture.
6+ years of experience in data architecture and engineering.
Required: 2+ years of hands-on experience with Azure Databricks, Azure Data Factory (ADF), and Spark.
Required: recent experience with the Microsoft Fabric platform.
Key Responsibilities:
Data Architecture:
Design end-to-end data architecture leveraging Microsoft Fabric's capabilities.
Design data flows within the Microsoft Fabric environment.
Implement OneLake storage strategies (see the sketch after this list).
Configure Synapse Analytics workspaces.
Establish Power BI integration patterns.
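As a concrete illustration of the Fabric architecture responsibilities above, the following is a minimal PySpark sketch of landing raw files into a OneLake-backed lakehouse table. It assumes a Microsoft Fabric notebook attached to a default lakehouse (where spark is predefined); the source path, table name, and columns are hypothetical.

    # Minimal sketch: land raw files into a OneLake-backed Delta table.
    # Assumes a Fabric notebook with a default lakehouse attached, where
    # spark is predefined; paths and table names are hypothetical.
    from pyspark.sql import functions as F

    # Read raw CSV files staged under the lakehouse Files area.
    raw_df = (
        spark.read
        .option("header", "true")
        .csv("Files/raw/orders")
    )

    # Stamp each row with its load time for downstream auditing.
    bronze_df = raw_df.withColumn("_ingested_at", F.current_timestamp())

    # Persist to the lakehouse Tables area as Delta, where it becomes
    # queryable from Synapse Analytics and Power BI through OneLake.
    bronze_df.write.format("delta").mode("append").save("Tables/orders_bronze")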
Integration Design:
Architect data integration and analytics patterns using Azure Data Factory and Microsoft Fabric.
Implement medallion architecture (Bronze/Silver/Gold layers); see the sketch after this list.
Configure real-time data ingestion patterns.
Establish data quality frameworks.
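The medallion flow called for above can be sketched as follows; this is an illustration only, and the table names, keys, and cleansing rules are hypothetical.

    # Minimal medallion sketch: promote Bronze (raw) rows to Silver (cleansed).
    # Table names, keys, and cleansing rules are hypothetical.
    from pyspark.sql import functions as F

    bronze = spark.read.table("bronze.orders")

    silver = (
        bronze
        .dropDuplicates(["order_id"])                       # de-duplicate on the business key
        .filter(F.col("order_total").isNotNull())           # drop incomplete records
        .withColumn("order_date", F.to_date("order_date"))  # normalize types
    )

    silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")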
Lakehouse Architecture:
Implement modern data lakehouse architecture using Delta Lake, ensuring data reliability and performance.
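To make the Delta Lake reliability point concrete, here is a minimal upsert sketch using the delta-spark Python API: MERGE runs as a single ACID transaction, so readers never observe a half-applied batch. The table name and join key are hypothetical.

    # Minimal Delta Lake upsert sketch; table name and key are hypothetical.
    from delta.tables import DeltaTable

    # updates_df: an incoming batch DataFrame with the target's schema (assumed).
    target = DeltaTable.forName(spark, "silver.customers")

    (target.alias("t")
     .merge(updates_df.alias("s"), "t.customer_id = s.customer_id")
     .whenMatchedUpdateAll()      # refresh existing customers
     .whenNotMatchedInsertAll()   # add new customers
     .execute())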
Data Governance:
Establish data governance frameworks incorporating Microsoft Purview for data quality, lineage, and compliance.
Microsoft Fabric Expertise:
Data Integration: Combining and cleansing data from various sources.
Data Pipeline Management: Creating, orchestrating, and troubleshooting data pipelines.
Analytics Reporting: Building and delivering detailed reports and dashboards to derive meaningful insights from large datasets.
Data Visualization Techniques: Representing data graphically in impactful and informative ways.
Optimization and Security: Optimizing queries, improving performance, and securing data.
Azure Databricks Experience:
Apache Spark Proficiency: Utilizing Spark for large-scale data processing and analytics.
Data Engineering: Building and managing data pipelines, including ETL (Extract, Transform, Load) processes.
Delta Lake: Implementing Delta Lake for data versioning, ACID transactions, and schema enforcement.
Cluster Management: Configuring and managing Databricks clusters for optimized performance (e.g., autoscaling and automatic termination); see the sketch after this list.
Integration with Azure Services: Integrating Databricks with other Azure services like Azure Data Lake, Azure SQL Database, and Azure Synapse Analytics.
Data Governance: Implementing data governance practices using Unity Catalog and Microsoft Purview.
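As one illustration of the cluster-management point above, a cluster with autoscaling and auto-termination could be provisioned with the Databricks SDK for Python roughly as follows; the cluster name, runtime, and node type are hypothetical, and exact fields should be verified against the SDK version in use.

    # Hedged sketch: an autoscaling, auto-terminating Databricks cluster via
    # the Databricks SDK for Python. Names and node types are hypothetical.
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.compute import AutoScale

    w = WorkspaceClient()  # reads credentials from the environment or a config profile

    w.clusters.create(
        cluster_name="etl-demo",
        spark_version="15.4.x-scala2.12",                    # a supported LTS runtime
        node_type_id="Standard_DS3_v2",
        autoscale=AutoScale(min_workers=2, max_workers=8),   # scale workers with load
        autotermination_minutes=30,                          # shut down idle clusters
    )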
Security Framework:
Design and implement security patterns aligned with federal and state requirements for sensitive data handling.
Implement row-level security (see the sketch after this list).
Configure Microsoft Purview policies.
Establish data masking for sensitive information.
Design audit logging mechanisms.
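For the row-level security and data-masking items above, one common Databricks pattern is a dynamic view built on the is_member() SQL function; the group, schema, table, and column names here are hypothetical.

    # Hedged sketch: row-level security plus column masking via a dynamic view.
    # Group, schema, and column names are hypothetical.
    spark.sql("""
        CREATE OR REPLACE VIEW secure.claims_v AS
        SELECT
            claim_id,
            region,
            CASE WHEN is_member('pii_readers')
                 THEN ssn ELSE '***MASKED***' END AS ssn  -- column masking
        FROM silver.claims
        WHERE is_member('all_regions')
           OR region = 'NY'                               -- row-level filter
    """)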
Pipeline Development:
Design scalable data pipelines using Azure Databricks for ETL/ELT processes and real-time data integration.
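A real-time ingestion pipeline of the kind described above might look roughly like this sketch, which uses Databricks Auto Loader (the cloudFiles source) to stream new files into a Delta table; the storage path, checkpoint location, and target table are hypothetical.

    # Hedged sketch: incremental ingestion with Databricks Auto Loader into
    # Delta. Paths, container names, and the target table are hypothetical.
    stream = (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("abfss://landing@myaccount.dfs.core.windows.net/events/")
    )

    (stream.writeStream
     .format("delta")
     .option("checkpointLocation", "/chk/events")  # exactly-once bookkeeping
     .trigger(availableNow=True)                   # drain new files, then stop
     .toTable("bronze.events"))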
Performance Optimization:
Implement performance tuning strategies for large-scale data processing and analytics workloads.
Optimize Spark configurations (see the sketch after this list).
Implement partitioning strategies.
Design caching mechanisms.
Establish monitoring frameworks.
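To ground the tuning items above, here are a few representative levers in PySpark; every value shown is illustrative only and must be sized to the actual workload, and events_df stands in for a hypothetical large DataFrame.

    # Hedged sketch of common Spark tuning levers; all values are illustrative.
    spark.conf.set("spark.sql.adaptive.enabled", "true")   # let AQE coalesce shuffles
    spark.conf.set("spark.sql.shuffle.partitions", "400")  # right-size shuffle parallelism

    # Partition large tables on a low-cardinality filter column so queries prune files.
    (events_df.write.format("delta")        # events_df: a hypothetical large DataFrame
     .partitionBy("event_date")
     .mode("overwrite")
     .saveAsTable("gold.events"))

    # Cache a hot dimension table that many downstream queries reuse.
    dim = spark.read.table("gold.dim_customer").cache()
    dim.count()  # materialize the cache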