Overview
As a Senior Advanced Data Engineer here at Honeywell, you will play a crucial role in designing, developing, and maintaining advanced data solutions that drive business insights and support decision-making processes.
You will leverage your expertise in data engineering to build scalable data pipelines, optimize data storage, and ensure data quality and integrity.
Your ability to work with cross-functional teams and translate business requirements into technical solutions will be key to your success in this role.
In this role, you will impact the business by enabling data-driven decision-making, optimizing data processes, and improving overall data management.
Your work will contribute to increased operational efficiency, cost savings, and enhanced customer satisfaction.

Responsibilities
AI-Ready Data Platform: Design and implement end-to-end ingestion pipelines from heterogeneous sources, including Snowflake, SQL Server, Excel, REST APIs, and unstructured data, into Azure Databricks.
Data Modeling & Semantic Layer: Architect and enforce medallion architecture (bronze → silver → gold), ensuring data arrives clean, validated, and fit for purpose at each layer.
Delta Lake & Data Ops: Build Delta Live Tables (DLT) pipelines with declarative data quality expectations, schema evolution, and automated lineage tracking; implement incremental loading patterns using CDC, watermarking, and Delta Lake MERGE/upsert for efficient ingestion.
Data Processing: Enable structured and unstructured data processing (documents, Excel files, JSON, Parquet) to build the foundation for AI and ML consumption.

Orchestration & Data Ops
Build and manage Databricks Workflows with multi-task dependencies, SLA monitoring, retry logic, and alerting.
Implement CI/CD pipelines for Databricks using Azure DevOps and GitHub Actions, including Python wheel packaging for reusable utility libraries deployed across the platform.
Apply software engineering best practices: version control, unit testing, modular code design, and automated deployment to dev/QA/prod environments.
Cluster right-sizing, DBU management, Delta table optimization (VACUUM, compaction), and cost monitoring across Azure Databricks and GCP.

Data Governance & Quality
Implement and manage Unity Catalog for centralized data governance: three-level namespace (catalog → schema → table), fine-grained RBAC, data masking, and audit logging.
Build data quality frameworks: rule-based validation, deduplication, reconciliation, and anomaly detection to ensure data arrives fit for AI/ML consumption.
Establish data lineage tracking across ingestion, transformation, and serving layers.
Govern data delivery to GCP: ensuring secure, validated, schema-consistent outputs consumed by downstream data science and analytics teams.

AI & Proactive Analytics Foundation
Design pipelines that are AI-ready from day one: supporting structured ML feature pipelines, embedding generation, and future vector DB integrations.
Build the data infrastructure that enables the shift from descriptive dashboards to proactive, predictive analytics.
Collaborate with data scientists and analytics engineers to ensure the gold layer supports model training, feature stores, and real-time inference pipelines.

Qualifications
Databricks: 4+ years hands-on experience with PySpark, Delta Lake, Workflows, Unity Catalog.
Demonstrated expertise in data strategy, e.g., medallion architecture, domain data modeling, and functional data architecture.
Data quality frameworks (e.g., rule-based validation, anomaly detection).
Data pipelines: incremental loading, CDC, CI/CD, observability.
Advanced Python/PySpark and advanced SQL.
Strongly preferred: DLT, UC, GCP, Azure, Kafka.
Databricks certified professional is highly valued.
7+ years of overall data engineering experience.
4+ years of hands-on Azure Databricks experience in production environments.
Proven experience building platforms, not just maintaining them: greenfield builds, migrations, framework development.
Experience with financial, engineering, enterprise, or industrial-scale datasets preferred.
Demonstrated ability to own technical decisions end-to-end: from architecture to production deployment.