Job Description
Site Reliability Developer – Oracle Cloud Infrastructure
The Oracle Cloud Infrastructure (OCI) team can provide you the opportunity to build and operate a suite of massive scale, integrated cloud services in a broadly distributed, multi-tenant cloud environment. OCI is committed to providing the best in cloud products that meet the needs of our customers who are tackling some of the world’s biggest challenges.
We offer unique opportunities for smart, hands-on engineers with the expertise and passion to solve difficult problems in distributed, highly available services and virtualised infrastructure. At every level, our engineers have a significant technical and business impact designing and building innovative new systems to power our customers’ business-critical applications.
We are looking for a strong Site Reliability Developer (SRD) who will help ensure the availability of our Cloud services 24x7x365. Site Reliability Developers can have near-unlimited scope across availability, durability, maintainability, operability, scalability, automation, and more.
The SRD will have a pulse on Oracle’s Commercial Experience services at all times and be directly accountable for the troubleshooting and resolution of service issues while continuously working with engineering partners to improve telemetry and automation. Your goal is to improve availability by reducing time to mitigate, ensuring we are measuring the right things, and automating tasks that impact development velocity, availability, or productivity. You will leverage excellence in communication, technical/business analysis, problem-solving, and attention to detail to methodically resolve issues. Technically, you will understand the full stack of the services you support and be able to dig deep into the service to determine how best to mitigate customer impact.
Further, you will drive improvements through the development of tools and engage partner teams to drive down incident counts, reduce the severity of events, and minimize Time to Mitigate. We will look to the SRD to continually review and enhance systems, methods, and applications to enable the delivery of a positive customer experience to OCI.
Responsibilities
The goal of an SRD is to maximize service availability. During peacetime, the SRD works to maximize the time between service impacting events by hardening the service. During crisis time, the SRD works to minimize the impact on our service.
You identify opportunities to harden our service in areas such as monitoring coverage enhancements and identifying triggers in our monitoring signal that require action (actionable events).
You continue to enhance our standard operating procedure (SOP) coverage: creating documented responses to alerts, automating such responses, and finally automating the entire process by linking responses directly to actionable events.
During the most critical service impacting events, you join as Incident Commander and direct SME resources to drive service mitigation. You control all aspects of the event, from resolution strategy to communications and timekeeping.
Communicate with professionalism and precision to internal and external customers during high-priority situations – both written and spoken.
You never leave our services at risk. After a service impacting event, you participate as a driving force in Post Mortems and Critical Repair Items.
Participate in technical analysis of the operational and internal tools for multiple Oracle applications.
Identify and deploy durable solutions to address complex challenges related to the OCI Commercial Experience Services: Customer Sign-up, Accounts Management, Metering, and integrations with Billing/Contract management systems.
Assist in the training and development of more junior team members.
Desired Skills
B.Tech / M.Tech / M.C.A / M.Sc or any equivalent Degree in Computer Science and Engineering.
4+ years of experience in a Development/DevOps role, with 2-4 years of experience in SRE / DevOps / Cloud Operations supporting large-scale systems in production.
Proficient in one or more programming languages (Python, Java), with RDBMS skills and scripting skills such as Bash/Perl.
Proficiency in the automation of the day to day tasks to reduce MTTD / MTTR and extend MTBF.
Proficient in infrastructure as code (like Terraform) and DevOps tooling. Able to develop self-service infrastructure provisioning, delivery pipelines, and log and monitoring services.
Experience installing, configuring, securing and maintaining Linux, Linux services, and Linux networking.
Foundational knowledge of the following: XML, JSON, Bitbucket, CI/CD tools like Jenkins/TeamCity
Familiarity with any of the following technologies will be a bonus: Ansible, Docker, Kubernetes, Grafana.
Strong Technical background with an ability to troubleshoot issues impacting large-scale service architectures and application stacks.
Troubleshoot any operational issues related to the infrastructure.
Systematic problem-solving approach, strong communication skills, a sense of ownership and drive.
Quick learning and thinking in order to solve problems.
Experience working in an operational environment with mission-critical tier-one services and associated pager duty.
Experience with any of the public cloud platforms (like AWS/GCP/Azure/OCI).
About Us
As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s problems. True innovation starts with diverse perspectives and various abilities and backgrounds.
When everyone’s voice is heard, we’re inspired to go beyond what’s been done before. It’s why we’re committed to expanding our inclusive workforce that promotes diverse insights and perspectives.
We’ve partnered with industry-leaders in almost every sector—and continue to thrive after 40+ years of change by operating with integrity.
Oracle careers open the door to global opportunities where work-life balance flourishes. We offer a highly competitive suite of employee benefits designed on the principles of parity and consistency. We put our people first with flexible medical, life insurance and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We’re committed to including people with disabilities at all stages of the employment process. If you would like accessibility assistance or accommodation for a disability at any point, let us know at +1.888.404.2494, Option 1.
Disclaimer:
Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.
Oracle is an Equal Employment Opportunity Employer*. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
* Which includes being a United States Affirmative Action Employer