R&D Infrastructure Research Engineer
Company: Tata Consultancy Services
Location: Plano, TX 75025
Description:
Strong knowledge of infrastructure research, specifically GPU computing performance (5-7 years of infrastructure research experience).
Expert in Kubernetes, containers, distributed systems, Python, systems tools, and scripting, with 10+ years of hands-on experience in these areas.
Excellent AI infrastructure optimization, debugging, and tuning skills.
Analyze benchmarking data to draw insights and recommend optimizations.
Design, implement, and manage robust infrastructure solutions that meet benchmark requirements.
Hands-on experience in Continuous Integration and Continuous Delivery (CI/CD).
Strong skills in evaluating data and fine-tuning servers, compute (CPU/GPU), network, and databases to optimize performance.
Build, install, and maintain container environments, including K8s, Rancher, Kubeflow, etc.
Expertise in architecting, building, and managing large R&D data sets and implementing High Performance Computing (HPC).
Ability to troubleshoot AI infrastructure including servers/GPUs, network, and storage.
Must conduct comprehensive benchmarking to evaluate system performance, reliability, and scalability.
Deep understanding of overall technology architecture and deep multidisciplinary experience, including servers, storage, network, databases, containers, and compute (CPU/GPU).
Base Salary Range: $130,000 - $150,000 per annum