Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.
A BS or MS in Computer Science, or equivalent. Knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Understanding of load balancing technologies, and development experience with programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 5 years of experience running large-scale, customer-facing web services.
Oracle Cloud Infrastructure – Roving Edge Device (Rover) – SRE Senior Software Engineer
At Oracle Cloud Infrastructure (OCI), we build the future of the cloud for Enterprises as a diverse team of fellow creators and inventors. We act with the speed and attitude of a start-up, with the scale and customer-focus of the leading enterprise software company in the world.
Values are OCI’s foundation and how we deliver excellence. We strive for equity, inclusion, and respect for all. We are committed to the greater good in our products and our actions. We are constantly learning and taking opportunities to grow our careers and ourselves. You will be part of a team of really smart, motivated, and diverse people and given the autonomy and support to do your best work.
Oracle Roving Edge Infrastructure accelerates deployment of cloud workloads outside the data center. Ruggedized Oracle Roving Edge Devices (Oracle REDs) deliver cloud computing and storage services at the edge of networks and in disconnected locations.
Site Reliability Engineer
The Site Reliability Engineer role on Roving Edge Device (Rover) provides exposure to the breadth of cloud technologies, from hardware and the device stack to the IaaS cloud stack (compute, network, storage) and related technologies. Rover is developed using the latest technologies.
The Site Reliability Engineer on Rover works across the layers of the cloud, including SaaS and IaaS. The SRE is responsible for the reliability of Rover and provides operational support through fleet metrics collection, monitoring, configuration management, analysis, mitigation, and remediation for all components of Rover, following process and operating procedures and adhering to SLAs.
Strong knowledge of SRE concepts and methodologies, and experience applying them to provide remediation and solutions, is a must.
This position requires excellent oral and written communication skills, a strong customer service focus, and the ability to work in a team as well as independently, to learn on the fly, to follow procedures, and to suggest improvements to existing procedures as appropriate.
Experience with any cloud stack, such as OpenStack: compute, network, storage
Experience developing automation or scripts in shell, Python, or Go
Linux experience with essential administration skills is a MUST, including Linux services and management: SSH, NTP, DNS, SELinux, kernel configuration
Experience in the compute layer: monitoring and troubleshooting, working with different OS flavors (Windows and Linux distributions), and familiarity with compute image formats (qcow2, img)
Experience in the storage layer: block and object storage monitoring and troubleshooting; understanding of general-purpose filesystems, LVM, and disk utilities and usage; performance analysis; data loss conditions
Experience in network monitoring, identifying connectivity problems, and performance analysis
Proficiency in network tools and utilities such as SSH and SSH tunnels, VNC, tcpdump, Wireshark, iptables, iperf, proxies, ss, ip, OpenSSL (SSL/TLS certificate management), nc, and nmap (port scanning), and in reviewing kernel tuning parameters
Qualifications
Bachelor's degree in Information Technology, Computer Science, or equivalent work experience.
7 or more years of experience as an SRE for cloud or enterprise environments.
Preferred Qualifications
MS in Computer Science
Experience in a cloud environment is a MUST; experience with any OCI service technology stack is highly desired.
Experience working on multi-tenant cloud-scale services
Experience in diagnosing, troubleshooting and resolving performance issues in complex environments
Experience with any database, such as Oracle Database (administration), MySQL, or NoSQL
Experience with Terraform, Chef, Puppet, Ansible, TensorFlow, Oracle Analytics, or other related applications is desired.
Certifications
Any relevant certification in networking (CCNA, JNCP) is a plus
Linux system administration certification or any other related certification is a plus
Any cloud certification, such as OCI or AWS, is a plus