About the role
As the Senior Data Architect, you will lead the architecture, implementation, and technological direction of our enterprise data platforms. Reporting directly to the Senior Director of Data & AI, you will ensure our cloud technology infrastructure supports our critical business data mission. You will operate as a lead advisor and engineer, collaborating with cross-functional leaders across the entire organization to deliver scalable, secure, and cost-optimized data environments.
Key responsibilities
1. Databricks architecture & platform ownership
* Design and implement enterprise-grade Databricks lakehouse architecture across bronze, silver, and gold layers.
* Own security and governance frameworks within Databricks utilizing Unity Catalog, access controls, and data lineage.
* Establish scalable cluster strategies, job orchestration frameworks, and workspace organization.
* Lead the architecture of Delta Lake design patterns, including partitioning, optimization, and data lifecycle management.
* Define and enforce data engineering standards, naming conventions, and architectural patterns across all production pipelines.
* Evaluate and implement new Databricks capabilities to ensure continuous alignment with enterprise data strategy.
2. Data pipeline development & orchestration
* Design, build, and optimize robust, end-to-end ETL/ELT pipelines using Azure Data Factory (ADF) and Azure Databricks.
* Develop ingestion frameworks for batch and streaming data from APIs, databases, SaaS platforms, and internal systems.
* Create scalable and architecturally sound data transformation frameworks using Delta Lake, Spark, and SQL, aligned with enterprise lakehouse standards.
* Implement CI/CD parameterization, triggers, and pipeline automation best practices.
3. Azure data platform engineering & cost governance
* Architect, manage, and optimize enterprise data environments across ADLS, Azure SQL, and Databricks, including cluster design, cost governance, and workload isolation strategies.
* Conduct advanced performance tuning, cluster scaling, proactive monitoring, and cloud cost optimization.
* Implement comprehensive DataOps practices, including automated testing, version control, monitoring, and documentation.
4. Data quality, governance & analytics enablement
* Build rigorous data validation, auditing, and error-handling frameworks to ensure data accuracy and consistency.
* Troubleshoot complex data issues and deliver sustainable, long-term technical solutions.
* Partner directly with BI analysts, data scientists, and operational teams to deliver curated, high-performance datasets.
* Build reusable data models optimized for business dashboards, predictive analytics, and AI use cases.
* Prepare training datasets and feature tables for machine learning pipelines (preferred).
Required qualifications & core skills
* Experience: 6+ years of hands-on data architecture and enterprise engineering experience, operating at a Databricks architect level, designing and implementing enterprise-scale data platforms.
* Azure Databricks & analytics: deep expertise in Azure Databricks architecture (notebooks, Spark, PySpark, Delta Lake, workflow orchestration).
* Azure Data Factory: extensive experience building and managing ADF pipelines and mapping data flows, and managing integration runtimes (IR).
* SQL mastery: advanced proficiency in complex SQL logic, performance optimization, advanced analytics queries, and stored procedures.
* Data architecture: deep expertise in lakehouse architecture (medallion: bronze/silver/gold) and Delta Lake optimization techniques.
* Governance & security: strong understanding and practical execution of Databricks Unity Catalog, data governance, access controls, and data lineage models.
* Platform operations: proven skill in cluster design, workload isolation, performance tuning, and cloud cost optimization across ADLS and Azure SQL.
* DataOps & CI/CD: robust experience implementing DataOps practices (testing, monitoring, version control, documentation) and CI/CD automation using Azure DevOps, GitHub Actions, or Databricks Repos.
* Data ingestion: proven proficiency building scalable cloud ETL/ELT solutions that manage batch and streaming ingestion from APIs, databases, SaaS platforms, and internal systems.
Highly preferred qualifications (major plus)
* Healthcare data domain: direct experience with healthcare data environments, standards, and systems (EHR/EMR, HL7, FHIR, claims, or revenue cycle management - RCM).
* AI/ML integration: experience supporting AI/ML workflows, feature engineering, or model enablement.
* Environment strategy: experience building and maintaining multi-workspace or multi-environment Databricks strategies (dev/test/prod).
* Data warehousing: familiarity with Azure Synapse Analytics or equivalent cloud warehousing technologies.
* Real-time processing: familiarity with real-time distributed processing (Structured Streaming) within Databricks.