Data Engineer

Job Description

ETL Developer with Azure

Rocky Hill, CT – Full-time / Permanent

Open to H-1B transfers

We are looking for an experienced Senior ETL Developer to join our team in Rocky Hill, CT. This is an onsite position for a developer with 9+ years of industry experience in ETL development, SSIS, cloud technologies, and Data Lakes. The ideal candidate will also have experience using PySpark for large-scale data processing. A background in the healthcare industry is preferred but not required.

As a Senior ETL Developer, you will design, implement, and maintain data pipelines and integrate various data sources into cloud-based Data Lakes, using PySpark for big data processing.

Key Responsibilities:

* Develop and implement ETL solutions using SSIS for integrating data across multiple platforms.
* Design, implement, and manage cloud-based Data Lakes (preferably using Azure Data Lake or AWS S3).
* Leverage PySpark for distributed data processing, ensuring scalable and efficient data transformations.
* Collaborate with data architects and business analysts to design optimal data workflows for various data sources.
* Manage the cloud-based data infrastructure to ensure efficient data ingestion and integration into data lakes.
* Optimize ETL workflows and data pipelines for performance and scalability.
* Work with large datasets from diverse sources (relational databases, cloud services, APIs, etc.).
* Implement data governance, data quality, and security best practices across data pipelines.
* Troubleshoot and resolve issues related to data integration, performance, and cloud data architecture.
* Provide mentorship and guidance to junior team members on best practices for cloud-based ETL development and data engineering.
* Contribute to CI/CD pipelines that automate the deployment of ETL jobs and cloud resources.
* Ensure compliance with relevant data privacy regulations, including HIPAA, for healthcare-related data.

Qualifications:

* 9+ years of hands-on experience in ETL development.
* Strong expertise in SQL Server Integration Services (SSIS) for building and managing ETL workflows.
* Proven experience working with cloud technologies (Azure) and cloud-based Data Lakes (e.g., Azure Data Lake).
* Proficiency in PySpark for big data processing and distributed data transformation.
* Experience with data lake architecture.
* Strong skills in data modeling, T-SQL, and managing data pipelines.
* Deep understanding of data governance, data quality, and data privacy best practices.
* Familiarity with CI/CD processes for automating ETL deployment and cloud infrastructure management.
* Experience with data orchestration tools such as Azure Data Factory.
* Healthcare industry experience is a plus (knowledge of EHR/EMR systems, HIPAA compliance, and healthcare data formats).
* Excellent analytical, troubleshooting, and communication skills.
* Strong collaboration skills and ability to work with cross-functional teams.