Site Reliability Engineer

PulsePoint
Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Cuba, Dominican Republic, Ecuador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Peru, Puerto Rico, Paraguay, Uruguay

Description

Job Opportunity: Senior Linux Systems Engineer
Location:
Open to candidates in Latin America, India, and EU/EE, provided you can work East Coast U.S. hours.
Responsibilities:
Ensure the reliability and scalability of our multi-datacenter and hybrid Linux environments.
Manage large-scale Linux infrastructure to maximize uptime.
Conduct performance and reliability testing, including reviewing configurations, software choices/versions, and hardware specifications.
Innovate and enhance our technology stack with creative solutions.
Participate in capacity management for core systems and services, application analysis, and performance and security tuning.
Provide operational support for systems and build automation to address root causes, aiming to automate responses to all non-exceptional service conditions.
Develop strategies for permanent solutions to critical production incidents.
Maintain documentation, build tools, and create alerts to identify and resolve infrastructure reliability issues.
Proactively identify system anomalies.
Collaborate with the security team on new initiatives and ongoing changes.
Who You Are:
Collaboration is in your DNA. You understand that when the team succeeds, you succeed.
You are eager to grow your skills and learn new technologies, sharing your insights with the team.
You appreciate both big-picture perspectives and fine details, and you are capable of both strategic thinking and diving into complex systems.
You are a proactive problem solver, driven to fix unreliable infrastructure.
You stay updated with security best practices and implement them consistently.
Requirements:
At least 4 years of relevant experience.
Strong understanding of Linux (CentOS and Rocky Linux in production).
Deep knowledge of the Puppet stack (roles & profiles, Hiera, PuppetDB).
Experience with Foreman.
Proficient with Git and able to resolve merge conflicts.
Experience with Jenkins CI.
Experience managing SQL/NoSQL databases (MySQL, PostgreSQL, MongoDB, Elasticsearch, Redis, Memcached).
Ability to work with Cassandra database clusters from installation to maintenance.
Experience with scalable infrastructure monitoring solutions such as Icinga, Prometheus, ELK, Graphite.
Strong scripting and automation skills (Ruby, Python, Bash).
Understanding of networking concepts (TCP/IP stack, DNS, PKI, CDN, load balancing).
Experience with on-prem/bare metal server operations.
Knowledge of virtualization solutions like KVM.
Storage configuration experience (NetApp, EMC).
Experience with container technologies (Docker, containerd).
Diverse experience with IT security best practices in the SRE context.
Willingness and ability to work East Coast U.S. hours (9 am-6 pm EST).
Bonus Skills (Not Required):
Knowledge of Kubernetes (K8s) and its ecosystem.
Experience in AdTech or High-Frequency Trading.
Hands-on experience with cloud platforms (AWS, GCP).
If you meet the qualifications and are ready for a new challenge, we would love to hear from you!
