Site Reliability Engineer Job at EVONA, San Francisco, CA

  • EVONA
  • San Francisco, CA

Job Description

Site Reliability Engineer (SRE)

Location: San Francisco Bay Area

Role Overview:

We are seeking a highly skilled Site Reliability Engineer (SRE) to join a dynamic team at a rapidly growing technology company. As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of mission-critical systems, while implementing automation and optimizing cloud infrastructure. This role offers the opportunity to work with cutting-edge AI/ML technologies, leveraging them to solve complex challenges in cloud infrastructure management and performance optimization.

Key Responsibilities:

  • System Reliability & Performance: Design, implement, and maintain scalable systems, ensuring high availability, performance, and disaster recovery across production environments.
  • Automation & Tool Development: Develop automation tools to streamline operations, improve system reliability, and reduce manual interventions.
  • Cloud Infrastructure Management: Create and manage cloud instances (e.g., dev, staging, production) using AWS, GCP, or Azure, optimizing infrastructure performance and cost.
  • Integration of AI/ML Models: Collaborate with engineering teams to integrate machine learning models into production environments, ensuring that these models scale efficiently and perform optimally.
  • Incident Management: Respond to and resolve incidents, minimizing downtime and ensuring quick recovery. Lead post-incident reviews and implement preventive measures.
  • Continuous Improvement: Identify areas of improvement and drive initiatives to enhance system reliability, performance, and security.
  • Security & Compliance: Ensure that infrastructure and applications adhere to security best practices and compliance standards.

Qualifications:

  • Educational Background: Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Experience: Proven experience as a Site Reliability Engineer or in a similar role within a SaaS environment, including managing and optimizing cloud infrastructure (preferably AWS, GCP, or Azure), plus familiarity with integrating AI and machine learning technologies.
  • Technical Skills:
      • Proficiency in programming and scripting languages such as Python, Go, or Bash.
      • Experience with containerization and orchestration tools like Docker and Kubernetes.
      • Solid understanding of networking, security, and performance optimization practices.
      • Knowledge of CI/CD pipelines and DevOps practices to ensure smooth development and deployment cycles.
  • Problem-Solving: Strong analytical and problem-solving skills with attention to detail.
  • Collaboration & Communication: Excellent interpersonal skills, with the ability to work collaboratively in cross-functional teams and communicate technical concepts clearly.

Benefits:

  • Competitive Salary: Attractive compensation package, including equity options.
  • Health & Wellness: Comprehensive health, dental, and vision insurance, along with other benefits.
  • Work Environment: A collaborative and innovative work environment within a growing company.
  • Growth Opportunities: Opportunities for career growth, professional development, and a chance to shape the future of the company’s technology and infrastructure.
