Senior Infrastructure Monitoring Engineer
About the Role:
We are seeking an experienced and detail-oriented Senior Infrastructure Monitoring Engineer to join our growing MSP team. In this client-facing role, you will be responsible for managing infrastructure monitoring, backup systems, and remote management tools across a diverse client base. You’ll play a critical role in keeping systems stable, secure, and backed up, delivering proactive support and technical leadership across a range of technologies.
Key Responsibilities:
* Oversee monitoring and alerting infrastructure across multiple client environments, ensuring alerts are efficiently triaged and addressed via ticketing systems.
* Administer and support backup and disaster recovery solutions including Acronis, Datto, and Azure Backup, with regular validation and recovery testing.
* Configure, maintain, and optimize Remote Monitoring and Management (RMM) tools such as ConnectWise Asio, including scripting, alert policies, patch management, and automation workflows.
* Tune monitoring platforms to reduce noise and increase issue visibility.
* Analyze logs, metrics, and performance data to proactively identify and remediate infrastructure issues.
* Collaborate closely with the secure networks, service desk, and technical teams for escalations and high-priority incident resolution.
* Build and maintain client-specific dashboards and reports reflecting system health, performance, and backup status.
* Document procedures, standards, and best practices for monitoring, backup, and RMM administration.
* Participate in on-call rotation and assist with after-hours incident response as needed.
Requirements:
* Familiarity with administering and configuring RMM platforms such as ConnectWise Automate, N-able, or Kaseya.
* Experience managing monitoring tools in multi-tenant environments.
* Strong hands-on experience with backup solutions, including Acronis, Datto, and Azure Backup.
* Working knowledge of Windows Server and Linux, cloud platforms (especially Azure), and networking fundamentals.
* Scripting experience (PowerShell, Bash, or Python) for task automation and custom RMM scripting.
* Strong troubleshooting and communication skills, with the ability to balance work across multiple client environments.
Nice to Have:
* Experience with PSA tools (e.g., ConnectWise Manage, Autotask).
* Certifications such as Microsoft Azure Administrator, CompTIA Server+/Network+, or vendor-specific RMM certifications.
* Understanding of compliance standards (HIPAA, SOC 2, GDPR) and how they relate to monitoring and backups.