Columbus, OH 43214
Updated 30+ days ago

As a Site Reliability Engineer (SRE) Manager, you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure, and reducing work through automation. You'll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment, you'll take the lead on relevant projects, backed by an organization that provides the support and mentorship you need to learn and grow. As an SRE, you'll be focused on running better production applications and systems.

In this role, you will provide production support to the Huntington Digital team across cloud and other technologies. You will work with engineers to build the platform, pipelines, and monitoring systems, ensuring the application landscape is designed to take full advantage of application solutions.

What you'll be doing

• Implement SRE frameworks to support global, multi-technology environments and ensure the highest level of SLA through operational excellence

• Provide failure analysis / root cause analysis when required

• Provide support to develop & improve the quality of technical engineering documentation

• Provide support to drive the maturity of the software development lifecycle

• Provide quality control of engineering deliverables

• Provide technical consultation to product management

• Perform deployment, administration, management, configuration, testing, and integration tasks related to the platforms

• Help to develop new engineering strategies and implementations for the firm

• Champion a DevOps model so that services are automated and elastic across all platforms

• Help coach and mentor less experienced team members

• Write operational documentation and maintain a knowledge base of known issues and their solutions

• Participate in 24x7 SRE on-call rotations and escalation workflows as needed, including occasional weekends

Basic Qualifications:

• Bachelor's Degree

• Minimum of 7 years of IT experience, with expertise in mission-critical digital application environments

Preferred Qualifications:

• In-depth OS experience (RHEL, Ubuntu, Windows Server) with strong debugging, troubleshooting, and problem-solving skills

• Expertise in Python or Java, with a focus on Site Reliability Engineering and support of app services

• Hands-on experience with cloud-based technologies and tools, especially in deployment, monitoring, and operations, such as Datadog, Prometheus, Splunk, Elasticsearch, Grafana, and Dynatrace

• Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git, Terraform, and Jenkins

• Deep knowledge of Internet protocols and web services technologies such as HTTP, DNS, TCP/UDP, SOAP, JSON and REST

• Good understanding of networking protocols and cybersecurity best practices in cloud environments

• Familiarity with success measurement (SLI/SLO/SLA)

• Experience with PowerShell and shell scripting is highly desirable

• Excellent problem-solving skills and the ability to troubleshoot complex issues quickly and effectively

• Able to find opportunities for improvement and tackle them without external direction

• Ability to "think outside of the box" and find creative solutions to operational problems

• Dedication to collaboration, "teaching others to fish"-style knowledge sharing, and cross-training

• Excellent communication skills

• Ability to operate in a fast-paced environment

• Self-motivated and willing to learn

EEO/AA Employer/Minority/Female/Disability/Veteran/Sexual Orientation/Gender Identity

Tobacco-Free Hiring Practice: Visit Huntington's Career Web Site for more details.

Agency Statement: Huntington does not accept solicitation from third-party recruiters for any position.

Posting ID: 644830642 Posted: 2021-12-15 Job Title: Site Reliability Engineer