Data Scientist + ML Mix with AWS SageMaker
Company: Futran Tech Solutions Pvt. Ltd.
Location: Houston, TX 77084
Description:
Location: Houston, TX or Denver, CO (hybrid, 3 days a week)
As a Data Scientist:
Experience with the AWS suite of services and resources
Agile framework
CI/CD pipeline orchestration
Building data pipelines (ETL) and using tools like Apache Spark
Frontend/Interactive Visualizations - Streamlit, Plotly
Programming Languages - Python, SQL; object-oriented programming (OOP)
ML Frameworks/Open Source Libraries - scikit-learn, TensorFlow, PyTorch, Keras and others
Experience with the data science lifecycle - data acquisition, data processing, model building, optimization, deployment, and maintenance
MLOps Frameworks - working with model inference and endpoints
Containers and Orchestration - Docker, Kubernetes (hands-on experience with Kubernetes)
As a Machine Learning Engineer, build and maintain large-scale ML infrastructure and ML pipelines. Contribute to building an advanced analytics and machine learning platform, along with tools that enable both prediction and optimization of models. Extend the existing ML platform and frameworks to scale model training and deployment.
Considerable experience developing ML infrastructure and MLOps in the cloud using AWS SageMaker.
Extensive experience deploying, serving (inference), tuning, and measuring machine learning models is required.
Experience building data pipelines that supply the data needed to build and evaluate ML models, using Apache Spark or other distributed data processing frameworks.
Experience with data movement technologies (ETL/ELT), messaging/streaming technologies (AWS SQS, Kinesis/Kafka), relational and NoSQL databases (DynamoDB, graph databases), container services (EKS), APIs, and in-memory technologies.
Strong knowledge of developing highly scalable distributed systems using open-source technologies.
Experience with CI/CD tools (e.g., Jenkins or equivalent), version control (Git), and orchestration/DAG tools (AWS Step Functions, Airflow, Luigi, Kubeflow, or equivalent).
Solid experience with Agile methodologies (Kanban and Scrum).
Must-have skills
Must have strong technical design and analysis skills.
Must have the ability to deal with ambiguity and work in a fast-paced environment.
Must have experience supporting critical applications.
Must be familiar with applied data science methods, feature engineering, and machine learning algorithms.
Must have data-wrangling experience with structured, semi-structured, and unstructured data.
Must have experience building ML infrastructure, with an eye towards software engineering.
Must have excellent written and verbal communication skills.
Must have excellent collaboration skills to work with multiple teams in the organization.
Must be able to understand and adapt to changing business priorities and technology advancements in the big data and data science ecosystem.