Site Reliability Engineer

  • PulsePoint
  • Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Cuba, Dominican Republic, Ecuador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Peru, Puerto Rico, Paraguay, Uruguay
Description

Job Opportunity: Senior Linux Systems Engineer

Location:

Open to candidates in Latin America, India, and EU/EE, provided you can work East Coast U.S. hours.

Responsibilities:

  • Ensure the reliability and scalability of our multi-datacenter and hybrid Linux environments.
  • Manage large-scale Linux infrastructure to maximize uptime.
  • Conduct performance and reliability testing, including reviewing configurations, software choices/versions, and hardware specifications.
  • Innovate and enhance our technology stack with creative solutions.
  • Participate in capacity management for core systems and services, application analysis, and performance and security tuning.
  • Provide operational support for systems and build automation to address root causes, aiming to automate responses to all non-exceptional service conditions.
  • Develop strategies for permanent solutions to critical production incidents.
  • Maintain documentation, build tools, and create alerts to identify and resolve infrastructure reliability issues.
  • Proactively identify system anomalies.
  • Collaborate with the security team on new initiatives and ongoing changes.

Who You Are:

  • Collaboration is in your DNA. You understand that when the team succeeds, you succeed.
  • You are eager to grow your skills and learn new technologies, sharing your insights with the team.
  • You appreciate both the big picture and the fine details, and you can think strategically as well as dive into complex systems.
  • You are a proactive problem solver, driven to fix unreliable infrastructure.
  • You stay updated with security best practices and implement them consistently.

Requirements:

  • At least 4 years of relevant experience.
  • Strong understanding of Linux (CentOS and Rocky Linux in production).
  • Deep knowledge of the Puppet stack (roles & profiles, Hiera, PuppetDB).
  • Experience with Foreman.
  • Proficient with Git and able to resolve merge conflicts.
  • Experience with Jenkins CI.
  • Experience managing SQL/NoSQL databases (MySQL, PostgreSQL, MongoDB, Elasticsearch, Redis, Memcached).
  • Ability to work with Cassandra database clusters from installation to maintenance.
  • Experience with scalable infrastructure monitoring solutions such as Icinga, Prometheus, ELK, Graphite.
  • Strong scripting and automation skills (Ruby, Python, Bash).
  • Understanding of networking concepts (TCP/IP stack, DNS, PKI, CDN, load balancing).
  • Experience with on-prem/bare metal server operations.
  • Knowledge of virtualization solutions like KVM.
  • Storage configuration experience (NetApp, EMC).
  • Experience with container technologies (Docker, containerd).
  • Diverse experience with IT security best practices in the SRE context.
  • Willingness and ability to work East Coast U.S. hours (9 am-6 pm EST).

Bonus Skills (Not Required):

  • Knowledge of Kubernetes (K8s) and its ecosystem.
  • Experience in AdTech or High-Frequency Trading.
  • Hands-on experience with cloud platforms (AWS, GCP).

If you meet the qualifications and are ready for a new challenge, we would love to hear from you!
