Job Details

Site Reliability Engineer Lead

  2026-04-30     Patterson-UTI Energy     Houston, TX
Description:

We are seeking a Site Reliability Engineer Lead to own and evolve the reliability, scalability, and operational excellence of cloud-native data platforms running primarily on Google Cloud Platform (GCP). This role supports data systems that ingest, process, and serve large volumes of operational data from oilfield and energy environments. The ideal candidate is a cloud-first SRE with deep GCP experience, strong Python engineering skills, and a track record of leading reliability initiatives for data-intensive systems.

Lead SRE practices for GCP-based data platforms

Design and own SLIs, SLOs, error budgets, and reliability metrics

Build and maintain cloud-native observability (monitoring, logging, alerting)

Lead incident response for production cloud systems and drive postmortems

Partner with data engineering and platform teams to design reliable architectures

Automate operational workflows using Python

Drive improvements in CI/CD, infrastructure as code, and deployment safety

Mentor engineers and set SRE best practices across the team

Required Knowledge, Skills, and Abilities:

7+ years in SRE, Cloud Platform Engineering, or DevOps

Strong hands-on experience with Google Cloud Platform, including:

GCP: GKE, Compute Engine, Cloud Storage, Pub/Sub (or equivalents)

Cloud Monitoring & Logging

BigQuery

Dataflow

Datastream

IAM and networking

Composer/Airflow

Kubernetes: deployment, scaling, reliability patterns

CI/CD: GitHub Actions, GitLab CI, or similar

Observability: GCP Cloud Monitoring, Logging

Experience supporting cloud-native data systems (batch and streaming)

Production experience with Python for automation, tooling, or services

Infrastructure as Code experience (Terraform strongly preferred)

Experience operating systems in 24/7 production environments

Minimum Qualifications:

Bachelor's degree in Business, Information Technology, Computer Science, or a related field.

5+ years of experience in Site Reliability Engineering, Cloud Platform Engineering, or DevOps

3+ years operating production workloads on Google Cloud Platform (GCP)

Prior technical leadership experience (lead engineer, tech lead, or ownership of reliability initiatives)

Ability to understand and speak English at a level of proficiency that allows the employee to issue, receive, and respond to both safety and operations-related directions in English

Preferred Qualifications:

Oil and Gas Industry knowledge

Technology/Digital Industry knowledge

The Evolving Oil Field Demands Evolving Service Providers

NexTier is a leading provider of integrated completions that employs sustainable practices and equipment to support our customers' ESG goals while accelerating production in the most demanding US land basins.

