Job Description:
The ideal candidate will be responsible for the administration, configuration, and management of AWS Elastic MapReduce (EMR) clusters and associated applications. The candidate will also manage and optimize a range of AWS services, including Glue, Redshift, Lambda, Step Functions, S3, IAM, EC2, and Kafka. You will be responsible for ensuring the smooth operation, security, and efficiency of cloud-based systems while leveraging cloud services to build scalable, automated, and secure solutions.
#### **Key Responsibilities**:
- **AWS EMR Cluster Management**:
  - Deploy, configure, and manage AWS EMR clusters, ensuring optimal performance and scalability.
  - Install, configure, and maintain applications running on EMR clusters, such as Hadoop, Spark, and Hive.
  - Monitor cluster performance and resource utilization, implementing tuning strategies to optimize the performance and cost-efficiency of EMR clusters.
  - Implement and manage backup and recovery processes for EMR data and configurations.
  - Develop and maintain scripts for automating the deployment and management of EMR clusters and applications.
  - Diagnose and resolve issues related to EMR clusters, applications, and data processing jobs.
  - Work closely with data engineers, data scientists, and other stakeholders to support their big data processing needs.
- **AWS Service Management**:
  - Administer and manage a variety of AWS services, including AWS Glue, Amazon Redshift, AWS Lambda, Amazon S3, Amazon MSK (Kafka), KMS, and Secrets Manager, to support data processing, storage, security, and compute needs.
  - Implement AWS Step Functions to orchestrate workflows and integrate services such as Lambda and Glue for end-to-end automation.
  - Configure, manage, and optimize S3 buckets for efficient data storage, including setting up lifecycle policies and access controls.
- **Performance Optimization & Automation**:
  - Optimize the performance and cost-effectiveness of AWS Lambda, Glue, and Redshift deployments through continuous monitoring and best practices.
  - Automate infrastructure provisioning and management using Terraform.
  - Leverage AWS CloudWatch to monitor performance metrics, set alarms, and implement automated responses to scaling events or failures.
- **Collaboration & Support**:
  - Work with development and data engineering teams to design and implement scalable data pipelines and analytics solutions using Redshift, Glue, and S3.
  - Provide operational support for troubleshooting issues related to AWS services, performance bottlenecks, and data inconsistencies.
  - Ensure high availability and reliability of critical systems through proactive monitoring, patching, and maintenance.
- **Documentation & Best Practices**:
  - Document architectural designs, configurations, and security policies to maintain a comprehensive knowledge base of the infrastructure.
  - Stay up to date with the latest AWS developments and recommend improvements to systems, services, and workflows.
  - Educate team members on AWS best practices for security, scalability, and cost optimization.
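To illustrate the kind of scripted EMR deployment this role involves, here is a minimal sketch using boto3 (the AWS SDK for Python). The release label, instance types, and IAM role names are illustrative defaults, not prescribed values; substitute your account's standards:

```python
def build_emr_cluster_config(name, release="emr-6.15.0", core_nodes=2):
    """Assemble run_job_flow() parameters for a Spark/Hive cluster.

    The release label, instance types, and role names below are
    illustrative defaults -- adjust them to your environment.
    """
    return {
        "Name": name,
        "ReleaseLabel": release,
        "Applications": [{"Name": "Spark"}, {"Name": "Hive"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
            ],
            # Terminate automatically once submitted steps finish.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # EC2 instance profile
        "ServiceRole": "EMR_DefaultRole",      # EMR service role
        "VisibleToAllUsers": True,
    }


def launch_cluster(config):
    """Submit the cluster request and return the new cluster ID."""
    import boto3  # imported here so the builder above works offline
    return boto3.client("emr").run_job_flow(**config)["JobFlowId"]
```

Keeping the parameter-building step separate from the API call makes the configuration easy to unit-test and to version alongside Terraform-managed infrastructure.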
#### **Qualifications**:
- **Education**: Bachelor’s degree in Computer Science, Information Technology, or a related field.
- **Experience**:
- Proven experience with AWS services, particularly EMR, EC2, S3, and IAM.
- Strong background in administering big data technologies like Hadoop, Spark, Hive, and Presto.
- Experience with Linux/Unix systems administration.
- Experience with AWS services including Glue, Redshift, Lambda, Step Functions, S3, Kafka, KMS, Secrets Manager, and IAM.
- Strong understanding of AWS security best practices, including managing IAM roles, KMS, Secrets Manager, and encryption standards.
- Experience with serverless architectures using AWS Lambda and Step Functions.
- Familiarity with data processing, ETL pipelines, and real-time data streaming using AWS Glue and MSK (Kafka).
- Experience with infrastructure-as-code tools such as Terraform is preferred.
- **Skills**:
- Strong knowledge of AWS networking concepts, security protocols, and best practices.
- Proficiency in scripting languages such as Python and Shell for automation tasks.
- Hands-on experience with monitoring tools like AWS CloudWatch and logging/alerting services.
- Good communication and collaboration skills to work with cross-functional teams.
- **Certifications**: AWS Certified Solutions Architect, AWS Certified Big Data – Specialty, or other relevant certifications are a plus.
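The scripting and CloudWatch skills above can be sketched together in a short example: building an alarm on an EMR cluster's HDFS utilization. `AWS/ElasticMapReduce` is CloudWatch's standard namespace for EMR cluster metrics, but the SNS topic, cluster ID, and threshold shown are placeholders:

```python
def build_hdfs_alarm(cluster_id, sns_topic_arn, threshold_pct=80.0):
    """Assemble put_metric_alarm() parameters for high HDFS utilization.

    The SNS topic ARN and threshold are placeholders -- substitute your
    own notification target and tolerance.
    """
    return {
        "AlarmName": f"emr-{cluster_id}-hdfs-high",
        "Namespace": "AWS/ElasticMapReduce",  # EMR's CloudWatch namespace
        "MetricName": "HDFSUtilization",
        "Dimensions": [{"Name": "JobFlowId", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 300,            # evaluate five-minute averages
        "EvaluationPeriods": 2,   # two consecutive breaches before firing
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # e.g. notify an on-call SNS topic
    }


def create_alarm(params):
    """Register the alarm with CloudWatch."""
    import boto3  # imported lazily so the builder stays testable offline
    boto3.client("cloudwatch").put_metric_alarm(**params)
```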
#### **Preferred Qualifications**:
- Understanding of data warehousing concepts and ETL processes.
- Knowledge of data security practices and compliance standards like GDPR and HIPAA.