Junior Data Consultant
Company: Virtusa Corporation
Location: San Francisco, CA 94112
Description:
We are seeking a Data Engineer with 2+ years of experience in designing, developing, and optimizing data pipelines and cloud-based analytics solutions. The ideal candidate will have a strong background in building scalable ETL/ELT processes, integrating structured and unstructured data, and supporting data-driven decision-making across finance, product, and operations teams.
Responsibilities:
Design, implement, and maintain data pipelines on AWS, GCP, and Azure for real-time and batch data ingestion into cloud-based data warehouses (Redshift, BigQuery, Snowflake).
Develop and optimize ETL/ELT processes using tools like Apache Spark, dbt, Airflow, and Azure Data Factory to ensure efficient data workflows (see the DAG sketch after this list).
Automate reporting and data processing tasks to improve operational efficiency.
Implement best practices for data governance, security, and compliance using cloud-native tools and IAM policies.
Collaborate with cross-functional teams, including product, analytics, and data science, to deliver actionable insights and build impactful data dashboards.
Write efficient, optimized SQL queries and assist in database schema design and table partitioning to improve query performance.
Leverage data visualization tools such as Tableau, Power BI, and Looker to create meaningful reports and dashboards.
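For candidates curious what this pipeline work looks like in practice, below is a minimal sketch of a daily batch ETL expressed as an Airflow DAG (assuming Airflow 2.4+). It is purely illustrative: the DAG name, task logic, and data shapes are hypothetical assumptions, not part of any actual Virtusa codebase.

```python
# Hypothetical Airflow DAG sketching the extract -> transform -> load pattern
# described above. Task logic, names, and data shapes are illustrative only.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Pull raw records from a source system (placeholder logic).
    return [{"id": 1, "amount": 42.0}]


def transform(**context):
    # Clean and reshape the records produced by the extract task.
    rows = context["ti"].xcom_pull(task_ids="extract")
    return [{**r, "amount_cents": int(r["amount"] * 100)} for r in rows]


def load(**context):
    # Write transformed rows to the warehouse (stubbed out here).
    rows = context["ti"].xcom_pull(task_ids="transform")
    print(f"Loading {len(rows)} rows into the warehouse")


with DAG(
    dag_id="daily_sales_etl",       # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",              # `schedule` assumes Airflow 2.4+
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```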
Skills & Qualifications:
Strong proficiency in cloud platforms (AWS, GCP, Azure) and data services (Redshift, BigQuery, Snowflake, Dataflow).
Expertise in Python and SQL (T-SQL, PL/SQL, Spark SQL), plus experience with Spark, Kafka, and other big data technologies (see the PySpark sketch after this list).
Solid understanding of data warehousing concepts and building data models for scalability and performance.
Hands-on experience with DevOps tools (Terraform, Docker, Kubernetes) and version control (Git, GitHub Actions).
Strong problem-solving skills with a passion for using data to uncover insights and improve business efficiency.
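For the Spark and SQL skills above, here is a short PySpark sketch of a partition-aware aggregation job, the same partitioning idea called out under Responsibilities. The storage paths, column names, and app name are illustrative assumptions.

```python
# Hypothetical PySpark job: read raw events, aggregate per user per day,
# and write output partitioned by date. Paths and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_event_rollup").getOrCreate()

# Read raw JSON events from object storage (path is a placeholder).
events = spark.read.json("s3://example-bucket/raw/events/")

# Aggregate events per user per day.
daily = (
    events
    .withColumn("event_date", F.to_date("event_timestamp"))
    .groupBy("event_date", "user_id")
    .agg(F.count("*").alias("event_count"))
)

# Writing partitioned by event_date lets query engines prune partitions,
# the same principle as table partitioning for warehouse query performance.
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_events/"
)
```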