Big Data, Hadoop Developer

Company: Sarian, Inc.

Location: Charlotte, NC 28269

Description:

Hadoop Developer
Minimum 7+ years of experience.
Extensive knowledge of ETL and Teradata (a PySpark sketch follows this list).
Good exposure to the Hadoop ecosystem.
Minimum 1 year of hands-on experience in PySpark.
Experience with job scheduling tools (e.g., Autosys) and version control tools such as Git.
Unix shell scripting.
Basic knowledge of mainframes; able to navigate through the jobs and code.
Quick learner and self-starter who requires minimal supervision to excel in a dynamic environment.
Strong verbal and written communication skills.
Prior experience working with globally distributed teams.
Experience with Agile-driven development.
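
As a rough illustration of the ETL, Teradata, and PySpark items above, the sketch below pulls a Teradata table into a Spark DataFrame over JDBC and lands it as Parquet. The host, database, table, and credentials are placeholders, and it assumes the Teradata JDBC driver jar has been supplied to Spark.

```python
# Illustrative sketch only: extract a Teradata table into Spark over JDBC.
# Host, database, table, and credentials are placeholders, not from the posting.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("teradata-extract")
         # Assumes the Teradata JDBC driver jar was provided to Spark,
         # e.g. via spark-submit --jars terajdbc4.jar
         .getOrCreate())

orders = (spark.read.format("jdbc")
          .option("url", "jdbc:teradata://td-host/DATABASE=sales")
          .option("driver", "com.teradata.jdbc.TeraDriver")
          .option("dbtable", "orders")
          .option("user", "etl_user")
          .option("password", "***")
          .load())

# Land the extract as Parquet for downstream Hadoop processing.
orders.write.mode("overwrite").parquet("/data/landing/orders")
```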

Hadoop Architect Responsibilities
Minimum of 7 years of experience. Responsible for the design and support of the API(s) between IBM Spectrum Conductor (as a distributed compute and storage platform) and the Bank of America application that exposes a data scientist user experience and model governance.
Capture cluster tenants' functional and nonfunctional compute and storage requirements, and translate them into distributed cluster capacity, configuration, and user provisioning settings.
Develop, test, and analyze code and scripts written in PySpark, Python, Java, and shell to provide specified behavior on a distributed IBM Spectrum Conductor cluster (a minimal sketch follows this list).
Provide "how-to" technical support for tenant developers building runtimes and persisting data on a distributed IBM Spectrum Conductor cluster.
Be an active member of the Agile scrum team and contribute to the features that emerge from it.
Perform peer code and test-case reviews, and help foster a healthy technical community by supporting peers.
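
A minimal sketch of the develop-and-test loop named above, assuming a local-mode SparkSession so it runs without a cluster; the function, columns, and sample data are illustrative, not from the posting.

```python
# Minimal, testable PySpark transformation; all names are illustrative.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F
from pyspark.sql.window import Window


def dedupe_latest(df: DataFrame) -> DataFrame:
    """Keep only the most recent record per account_id."""
    w = Window.partitionBy("account_id").orderBy(F.col("updated_at").desc())
    return (df.withColumn("rn", F.row_number().over(w))
              .filter(F.col("rn") == 1)
              .drop("rn"))


if __name__ == "__main__":
    # Local mode: no cluster needed to exercise the logic.
    spark = SparkSession.builder.master("local[2]").appName("unit-check").getOrCreate()
    data = [("a1", "2024-01-01"), ("a1", "2024-02-01"), ("a2", "2024-01-15")]
    df = spark.createDataFrame(data, ["account_id", "updated_at"])
    result = dedupe_latest(df)
    assert result.count() == 2  # one latest row per account
    result.show()
```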

Qualifications:
Experience with Agile/Scrum methodology is essential.
Experience with either Apache Spark-On-YARN (Hadoop) or Apache Spark-On-EGO (IBM Spectrum Conductor) is essential.
Experience with the Apache Spark libraries PySpark, Spark SQL, Spark Streaming, and MLlib is essential (see the sketch after this list).
Experience with either Hadoop/YARN or IBM Spectrum Conductor/EGO cluster resource manager is essential.
Experience with the RedHat Linux (RHEL) command line and shell scripting is essential.
Experience with the CSV, JSON, ORC, Avro, Parquet, and Protocol Buffers file formats is essential.
Experience with Python, Java, and R is highly desirable.
Experience with NumPy and pandas is highly desirable.
Experience designing and configuring distributed architectures is desirable.
Knowledge of CI/CD SDLC practices.
Knowledge of scikit-learn, PyTorch, Keras, and H2O.ai.
Strong communication skills; able to communicate effectively with business and other stakeholders.
Demonstrates ownership and takes initiative.
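
As a rough sketch of the Spark library and file-format qualifications above, the example below reads three of the listed formats with PySpark and aggregates one of them through Spark SQL. Paths and column names are assumptions for illustration.

```python
# Minimal sketch: reading listed file formats and querying via Spark SQL.
# File paths and column names are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("formats-demo").getOrCreate()

# CSV with a header row; schema inference keeps the example short.
events_csv = (spark.read.option("header", "true")
              .option("inferSchema", "true")
              .csv("/data/events.csv"))

# Columnar formats from the qualifications list.
events_parquet = spark.read.parquet("/data/events.parquet")
events_orc = spark.read.orc("/data/events.orc")

# Expose one DataFrame to Spark SQL and aggregate.
events_parquet.createOrReplaceTempView("events")
daily_counts = spark.sql(
    "SELECT event_date, COUNT(*) AS n FROM events "
    "GROUP BY event_date ORDER BY event_date"
)
daily_counts.show()
```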

Hadoop Platform Engineer / Developer

The successful candidate must have general knowledge of the Hadoop ecosystem, including Cloudera, Hive, Impala, Spark, Oozie, Kafka, and HBase. The candidate should possess a proven ability to understand technical issues in a multitenant platform-as-a-service environment and work with peer teams to facilitate problem resolution. A focus on tenant relations (customer service) and communication is a key attribute when coordinating with tenants and other interdependent service teams.

This is a fast-paced, continuously evolving platform. In addition to supporting tenant needs, you must be passionate about automating processes and simplifying the overall operating environment. Candidates must be team players, able to communicate effectively both orally and in writing. You must be able to understand new technical offerings and learn new technologies in the context of the multitenant platform.

Essential Functions:
Provide innovative tactical solutions when necessary to meet operational requirements
Provide BAU application support to business partners. Manage technical issues with support from internal and external resources
Work directly with business and technology partners to translate business requirements into effective, efficient technology solutions
Learn new products/tools/technologies to guide business partners in implementing new Big Data features and functionality
Experience with Agile development practices and the product owner role
Manage platform capacity, utilization, and forecasting holistically, from lower lanes through production, for all tenants

Key Skills
Customer focused mindset
Strong interpersonal/influence skills
Ability to dissect complex issues and leverage/coordinate platform resources to resolve technical issues
Familiarity with developing on Hadoop platforms
Understanding of BI tools
ETL experience with emphasis on performance and scalability
Ability to handle multiple and simultaneous activities and priorities
Ability to lead and organize the work of other analysts
ITSM/Remedy - Change Management knowledge
Jira/Confluence - Agile
Exposure to the following technologies:
- Spark/Impala - ability to trace logic (application/query) through logs
- YARN
- Hive
- Cloudera Manager
- Kafka
- Scripting/programming/development experience - Shell/Python/Java
- Linux scripting/admin
Key Career Experiences
Previous platform-as-a-service exposure
Application development/support
Application Hosting Administrator
Infrastructure Design or Support Lead
