Metrist
About Metrist
Metrist was a startup, shut down in late 2023, that monitored the performance and availability of the apps that software depends on. Ryan was co-founder and Chief Technology Officer, responsible for the product roadmap and technical direction. Dave was Director of Engineering, responsible for technology choices and for managing most of the engineering team.
As a technology, Metrist has two major components, both of which were made open source in 2023.
Monitors
The monitors are automated testing tools designed to verify the functionality and performance of various cloud services, including Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) offerings. They perform a series of operations that mimic real-world usage, such as creating, reading, updating, and deleting resources specific to each service. By executing these operations and measuring their success and speed, the monitors can quickly detect any issues or performance degradation across a wide range of cloud platforms. This allows system administrators and DevOps teams to proactively identify and address problems before they impact end-users. The monitors are implemented in different programming languages to best suit each service, and they use standardized protocols for consistent reporting. Additionally, they include cleanup procedures to remove test resources, ensuring that the monitoring process doesn't leave behind unnecessary data or incur ongoing costs.
Source code is available on GitHub.
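The monitor pattern described above (run a sequence of create, read, update, and delete steps, time each one, and always clean up) can be illustrated with a minimal Python sketch. This is not taken from the Metrist source: `StepResult`, `run_monitor`, and the in-memory `store` that stands in for a cloud service are all hypothetical names chosen for illustration, and a real monitor would call the service's actual API and report results over the monitoring protocol.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool
    seconds: float
    error: str = ""


def run_monitor(
    steps: List[Tuple[str, Callable[[], None]]],
    cleanup: Callable[[], None],
) -> List[StepResult]:
    """Run each step in order and time it; cleanup runs even on failure."""
    results: List[StepResult] = []
    try:
        for name, step in steps:
            start = time.perf_counter()
            try:
                step()
                results.append(StepResult(name, True, time.perf_counter() - start))
            except Exception as exc:
                elapsed = time.perf_counter() - start
                results.append(StepResult(name, False, elapsed, str(exc)))
                break  # later steps depend on earlier ones, so stop here
    finally:
        cleanup()  # remove test resources so nothing is left behind
    return results


# Hypothetical stand-in for a cloud service: an in-memory key-value store.
store: Dict[str, str] = {}


def create() -> None:
    store["probe"] = "v1"


def read() -> None:
    if store["probe"] != "v1":
        raise RuntimeError("unexpected value")


def update() -> None:
    store["probe"] = "v2"


def delete() -> None:
    del store["probe"]


def cleanup() -> None:
    store.clear()


results = run_monitor(
    [("create", create), ("read", read), ("update", update), ("delete", delete)],
    cleanup,
)
for r in results:
    print(f"{r.name}: ok={r.ok} seconds={r.seconds:.6f}")
```

Putting the cleanup call in a `finally` block is one way to ensure test resources are removed even when a step fails partway through the sequence.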
Backend and Web UI
The backend is a monitoring and alerting system that tracks the performance and availability of cloud services and APIs. It continuously collects telemetry data and error information from monitored services, and it scrapes the official status pages of major cloud providers such as AWS, Azure, and GCP. This data is processed in real time to maintain up-to-date snapshots of service statuses. When significant changes or issues are detected, the system generates alerts and dispatches them through channels such as Slack, email, and webhooks, based on user subscriptions. The backend is built with scalability and fault tolerance in mind, using a distributed architecture that can operate across multiple servers. It employs event sourcing to maintain a reliable record of all system events, which protects data integrity and makes it possible to reconstruct system state if needed. The system also provides a range of interfaces for user interaction, including a Slack bot for status checks and subscription management. Additional features include Twitter monitoring for service-related hashtags, API token management for secure access, and user and account management. Together, these components give users timely, accurate information about the health and performance of their critical cloud services and APIs.
Source code is available on GitHub.
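To make the event sourcing and alerting flow described above more concrete, here is a minimal Python sketch under stated assumptions. It is not the Metrist implementation: `StatusReported`, `Backend`, `rebuild_snapshot`, the status strings, and the `"example-api"` service are hypothetical, and alert delivery is reduced to appending to an in-memory list instead of calling Slack, email, or a webhook. The sketch shows an append-only event log, a snapshot derived from it, alerts sent to subscribed channels when a status changes, and the snapshot being reconstructed by replaying the log.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class StatusReported:
    service: str
    status: str  # assumed values for illustration: "up", "degraded", "down"


class Backend:
    def __init__(self) -> None:
        self.log: List[StatusReported] = []           # append-only event log
        self.snapshot: Dict[str, str] = {}            # latest status per service
        self.subscriptions: Dict[str, Set[str]] = {}  # service -> channel names
        self.sent: List[Tuple[str, str]] = []         # (channel, message) outbox

    def subscribe(self, service: str, channel: str) -> None:
        self.subscriptions.setdefault(service, set()).add(channel)

    def record(self, event: StatusReported) -> None:
        """Append the event, update the snapshot, and alert on a status change."""
        self.log.append(event)
        previous = self.snapshot.get(event.service)
        self.snapshot[event.service] = event.status
        if previous is not None and previous != event.status:
            message = f"{event.service}: {previous} -> {event.status}"
            for channel in sorted(self.subscriptions.get(event.service, ())):
                self.sent.append((channel, message))


def rebuild_snapshot(log: List[StatusReported]) -> Dict[str, str]:
    """Reconstruct the current state purely by replaying the event log."""
    snapshot: Dict[str, str] = {}
    for event in log:
        snapshot[event.service] = event.status
    return snapshot


backend = Backend()
backend.subscribe("example-api", "slack")
backend.subscribe("example-api", "email")
for status in ["up", "up", "down"]:
    backend.record(StatusReported("example-api", status))

print(backend.snapshot)
print(backend.sent)
print(rebuild_snapshot(backend.log) == backend.snapshot)
```

Because the snapshot is only ever derived from events in the log, replaying the log from the start yields the same state, which is the property that lets an event-sourced system recover its state after a failure.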
Technologies
Conclusion
Metrist is just one of the businesses where Dave and Ryan worked together to build a brilliant engineering team and software products.