The Company: Our client is one of the world's leading social media companies. Its platform offers innovative ways to express creativity, explore interests, and, most importantly, connect with people around the globe. With over a billion users, the company pursues the best of the best engineering talent and builds dynamic teams who, like its users, are intelligent, compassionate, and creative!
Responsibilities:
- Support the global SRE on-call rotation and take responsibility for Tier-1 online incident response and DevOps support.
- Participate in and enhance the complete service lifecycle, from inception and design, through development, capacity planning, launch reviews, deployment, operation, and refinement.
- Design and implement software platforms and monitoring frameworks to govern service-oriented architecture (SOA) efficiently, automatically, and intelligently.
- Be responsible for the service levels of a mission-critical, revenue-generating e-commerce platform, as well as all supporting infrastructure and services. This role focuses on service reliability, highly scalable design, and release management in a cloud-native environment.
- Establish sustainable mechanisms for scaling systems, such as automation, to drive enhancements in reliability, efficiency, and velocity.
- Provide sustainable user support, manage incident responses, and conduct blameless postmortems as part of our ongoing efforts to improve our systems.
Qualifications:
- Bachelor's degree in Computer Science or a related technical field with 5+ YOE
- 3 YOE programming in Python, Go, or C++
- 3 YOE building CI/CD pipelines
- Strong familiarity with Unix/Linux commands, networking fundamentals, and distributed systems
Additional Information:
- Same position available in San Jose and Seattle!
- Work Model: 4-5 days onsite per week
- Pay Structure: Base + Bonus + RSUs, plus a complete, competitive benefits package
- No C2C at this time