Staff Infrastructure Architect
Location: Palo Alto, CA
Duration: 5-Month Contract
Pay Rate: $63.50–$68.50/hr
Position Overview
We are seeking a Staff Infrastructure Architect to lead the design, implementation, and optimization of large-scale cloud infrastructure and reliability platforms. This role will focus on building scalable, secure, and highly reliable distributed systems across hybrid and multi-cloud environments.
The ideal candidate will provide architectural leadership across cloud infrastructure, networking, automation, and SRE practices, while partnering with platform, security, and engineering teams to drive modernization and operational excellence.
Key Responsibilities
Cloud Architecture & Platform Engineering
- Architect reliable, scalable, and secure cloud systems across public, private, hybrid, and multi-cloud environments.
- Lead the design and optimization of cloud infrastructure platforms, including containerized environments and virtual machine infrastructure.
- Define infrastructure architecture standards to support modern microservices and distributed systems.
- Drive adoption of Infrastructure-as-Code (IaC) and GitOps practices for automated infrastructure delivery.
Governance, Security & Reliability
- Establish cloud governance frameworks, security standards, and cost optimization strategies.
- Partner with service owners to define SLIs, SLOs, and SLAs to ensure system reliability and performance.
- Champion security best practices across infrastructure, networking, and platform services.
- Lead incident management reviews, root cause analysis, and reliability improvements.
Automation, DevOps & Observability
- Reduce operational overhead through automation and platform engineering best practices.
- Define and promote CI/CD standards, governance models, and platform engineering workflows.
- Implement observability platforms using metrics, logging, and distributed tracing.
- Improve system performance and reliability through data-driven monitoring and analytics.
Cloud Networking & Infrastructure
- Architect and implement cloud networking solutions across multi-cloud and hybrid environments.
- Manage connectivity, routing, and networking infrastructure including VPCs, DNS, load balancing, VPNs, and inter-cloud connectivity.
- Design and maintain peering and transit infrastructure, routing policies, and cloud networking services.
Cross-Functional Collaboration & Leadership
- Partner with Cloud, Platform, Security, and AI/ML teams to deliver resilient infrastructure solutions.
- Mentor engineers and influence technical direction across infrastructure and platform teams.
- Support cloud migration, modernization initiatives, and enterprise platform evolution.
Required Qualifications
- 10+ years of experience in Cloud Architecture, Site Reliability Engineering, or Platform Engineering.
- 7+ years of hands-on experience with public cloud platforms, including strong experience with Google Cloud Platform (GCP).
- Proven expertise operating large-scale distributed infrastructure systems.
- Deep experience with container orchestration (Kubernetes), Docker, and microservices architectures.
- Strong background in cloud networking, including VPCs, DNS, routing, load balancing, VPNs, and security groups.
- Experience with service mesh technologies (e.g., Istio) including traffic management, observability, and security features.
Infrastructure Automation & DevOps
- Expertise with Infrastructure as Code tools such as Terraform, CloudFormation, AWS CDK, or Crossplane.
- Experience implementing GitOps workflows using Argo CD.
- Hands-on experience building and managing CI/CD pipelines (Jenkins, GitHub Actions, Argo CD).
Observability & Reliability
- Experience implementing observability platforms using tools such as Prometheus, Grafana, Splunk, or New Relic.
- Strong understanding of OpenTelemetry or distributed tracing systems at scale.
Programming & Systems
- Strong programming experience in Python, Go, JavaScript, or TypeScript for automation and platform tooling.
- Advanced scripting with Bash and Python.
- Strong knowledge of Linux/Unix systems, networking protocols, and security frameworks.
Additional Requirements
- Experience designing and operating enterprise SaaS infrastructure in cloud environments.
- Experience supporting 24/7 operational environments, including incident response and post-incident analysis.
- Bachelor's degree in Computer Science, Engineering, Mathematics, Physics, or a related technical field (or equivalent experience).
- Strong communication and collaboration skills with the ability to influence cross-functional teams.
Preferred Experience
- Experience integrating AI-driven engineering tools and automation into infrastructure workflows.
- Exposure to large language model (LLM) based tooling for operational intelligence and developer productivity.
- Experience supporting AI/ML infrastructure platforms.
...