Site Reliability Engineer

Posted 2 years ago

Job description

Visa Status: Must be a US Citizen


  • Manage infrastructure services at our client, including but not limited to deployment, operation, and troubleshooting
  • Maintain services to meet service-level agreements (SLAs) or service-level objectives (SLOs) by measuring and monitoring availability, performance, and overall system health
  • Provide user support, incident response, root cause analysis (RCA), and postmortems for our client
  • Support on-call rotations for operational duties

Basic Qualifications:

  • Bachelor’s degree in Computer Engineering, Computer Science or related major
  • 3+ years of experience in Kubernetes and Docker
  • 3+ years of experience working with Redis and/or MongoDB, and Kafka and/or RocketMQ
  • 3+ years of experience scripting in Shell and Python and working with infrastructure automation tools (e.g., Terraform, Ansible, Chef)
  • 2+ years of experience with CI/CD in the software development lifecycle
  • 2+ years of experience with one or more of the following types of systems at their newest versions:
      • Flink
      • MySQL
      • Elasticsearch
      • HDFS
      • Mesos and/or YARN
      • Spark and/or Hive
  • Experience with Unix/Linux operating systems
  • Experience in debugging and automating routine tasks
  • Strong skills in problem solving and communication
  • Excellent team player

Preferred Qualifications:

  • Experience supporting or managing systems at scale (tens of thousands to hundreds of thousands of instances) is a big plus
  • Experience in information security

Job Features

Job Category: Site Reliability Engineer
