DevOps is the latest trend, and many are jumping on the bandwagon. Job listings for DevOps Engineers and DevOps Directors are a dime a dozen. The problem is, DevOps isn’t a job. It’s a philosophy – a way to do business. And, contrary to popular belief, DevOps is not new. As a way of thinking and doing business, it has been used by progressive organizations for years.
Done right, DevOps breaks down the traditional silos typically associated with IT. Instead of sequestering your systems admins in the basement with the servers, it gets them collaborating with other departments. DevOps brings development in line with operations in the same way that agile brings development in line with business. This requires operations to be part of the development cycle starting with the spec phase, where technologies are chosen, with the intention of delivering the most efficient system on which to run the application.
Benefits of DevOps include:
Better uptime percentages
Greater awareness of application demands and health
Closer collaboration and more transparency between departments
Monitoring and automation are one way DevOps improves uptime. They also shorten the time to resolve an issue or outage: once you determine which metrics predict problems, you can alert on them and address them proactively, before an outage actually occurs.
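As a rough illustration of alerting on a metric that predicts trouble, here is a minimal Python sketch. It assumes disk usage is the predictive metric and that a straight-line projection is good enough; the function names, the sample readings, and the 24-hour warning window are all hypothetical, and a real setup would pull readings from whatever monitoring tool you already run.

```python
def hours_until_full(samples, capacity):
    """Estimate hours until a resource fills up, from (hour, used) samples.

    Uses a straight line through the oldest and newest sample. Returns None
    when there is too little data or usage is flat or shrinking.
    """
    if len(samples) < 2:
        return None
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 == t0:
        return None
    rate = (used1 - used0) / (t1 - t0)  # units consumed per hour
    if rate <= 0:
        return None
    return (capacity - used1) / rate


def should_alert(samples, capacity, warn_hours=24):
    """Alert when the projected time to full is inside the warning window."""
    remaining = hours_until_full(samples, capacity)
    return remaining is not None and remaining <= warn_hours


# Hypothetical readings: disk usage in GB, sampled at hour 0 and hour 10.
readings = [(0, 50), (10, 70)]
print(hours_until_full(readings, capacity=100))  # 15.0
print(should_alert(readings, capacity=100))      # True
```

The point is the ordering: the alert fires while there are still hours left to act, rather than after the disk is already full.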
System automation has been around even longer than distributed computing, with the ability to script the installation of an application, configuration of files, or patches, or <insert your favorite repeatable task here>. Scheduling with cron or at has allowed these tasks to happen, unattended, at any hour of the day. As long as you put some sort of reporting into your script, you could get the results as soon as the task is finished, good, bad or otherwise.
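To make the "put some sort of reporting into your script" idea concrete, here is a small Python sketch of a task wrapper that always produces a one-line report, whether the task succeeds or fails. The function name and the stand-in command are hypothetical; swap in the install, configuration, or patch step you actually want to automate.

```python
import subprocess
import sys
from datetime import datetime, timezone


def run_task(command):
    """Run one command and return a one-line report plus its exit code."""
    started = datetime.now(timezone.utc).isoformat(timespec="seconds")
    result = subprocess.run(command, capture_output=True, text=True)
    status = "OK" if result.returncode == 0 else "FAILED"
    # Keep the last line of output so the report hints at why a task failed.
    output = (result.stdout + result.stderr).strip().splitlines()
    detail = output[-1] if output else ""
    report = f"{started} {status} rc={result.returncode} {' '.join(command)} {detail}"
    return report.rstrip(), result.returncode


# Hypothetical stand-in for a real task such as applying a patch.
report, code = run_task([sys.executable, "-c", "print('patch applied')"])
print(report)
```

Saved as a script, this could be scheduled with a crontab entry such as `0 2 * * * /usr/bin/python3 /opt/scripts/nightly_patch.py` (a made-up path) to run at 02:00 every day. Cron typically mails anything the job prints to the job's owner, so the printed report is what lands in your inbox.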
Automation also reduces the risk of human error. It’s amazing how common it is for a systems admin to mistakenly run a command (typically rm -rf *) and erase half a system, then need to learn quickly how to restore from a backup.
DevOps will also improve awareness and monitoring. Traditionally, most monitoring tools measured only system-level metrics: CPU utilization, memory consumption, disk, network, etc. By monitoring specific metrics within your application itself (either through direct operations, such as response time for an API query, or through monitoring of log files), you’re able to uncover potential failures before they cause an outage. This approach also lets you tailor your on-call escalation policy so that alerts go straight to the people who can actually troubleshoot and solve the problem, instead of to someone who follows a runsheet and then pages out the appropriate resources. This leads to happier staff, who know they aren’t going to get called in the middle of the night for something they can’t help with, and ultimately to peace of mind that issues will be handled more quickly by the people with the right expertise.
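Here is one way that might look in Python: time an application-level operation, raise an alert if it is too slow, and route the alert to the team that owns the component. The component names, team names, and thresholds are invented for illustration, and the routing table stands in for whatever your paging tool uses.

```python
import time
from dataclasses import dataclass


@dataclass
class Alert:
    component: str
    message: str


# Hypothetical ownership map: which team can actually fix which component.
ON_CALL = {
    "search-api": "search-team",
    "billing-db": "database-team",
}
DEFAULT_ON_CALL = "operations"


def timed_call(operation):
    """Return how long one application operation takes, in seconds."""
    start = time.perf_counter()
    operation()
    return time.perf_counter() - start


def check_latency(component, operation, threshold_seconds):
    """Return an Alert if the operation is slower than the threshold, else None."""
    elapsed = timed_call(operation)
    if elapsed > threshold_seconds:
        message = f"response took {elapsed:.3f}s, limit {threshold_seconds:.3f}s"
        return Alert(component, message)
    return None


def route(alert):
    """Pick the team to page for an alert, falling back to operations."""
    return ON_CALL.get(alert.component, DEFAULT_ON_CALL)


# Simulated slow API query; a real check would call the application itself.
alert = check_latency("search-api", lambda: time.sleep(0.05), threshold_seconds=0.01)
if alert:
    print(f"page {route(alert)}: {alert.component} {alert.message}")
```

Keeping the ownership map next to the check is a deliberate choice in this sketch: the person who gets paged is decided by who can fix the component, not by who happens to be reading the runsheet.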
By combining automation technology with monitoring, you can focus on the exact areas where improvements can be made. For instance, you can size your instances just right, then add or remove them from a cluster based on any metric you choose (e.g. CPU utilization, response time to a particular operation, or time of day). Not only does this keep your application running optimally, it also allows you to maximize ROI on your instances, so you’re only using the computing power that you need at any given time.
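The scaling decision itself can be very small. The Python sketch below assumes a single metric with one scale-up and one scale-down threshold, and cluster size limits of 1 and 10; the function name and all of the numbers are hypothetical, and actually adding or removing instances is left to your cloud provider or orchestration tooling.

```python
def desired_instances(current, metric_value, scale_up_at, scale_down_at,
                      minimum=1, maximum=10):
    """Return the instance count the cluster should move to.

    Adds one instance when the metric is above scale_up_at, removes one
    when it is below scale_down_at, and otherwise keeps the current size.
    The result is always kept between minimum and maximum.
    """
    if metric_value > scale_up_at:
        target = current + 1
    elif metric_value < scale_down_at:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


# Hypothetical policy: scale on average CPU utilization (percent).
print(desired_instances(current=3, metric_value=85, scale_up_at=75, scale_down_at=25))  # 4
print(desired_instances(current=3, metric_value=10, scale_up_at=75, scale_down_at=25))  # 2
```

The gap between the two thresholds keeps the cluster from flapping: a metric sitting between 25 and 75 leaves the instance count alone.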
Collaboration & Transparency
Breaking down the traditional silos in IT (typically categorized as development, IT operations, and QA) means that many issues that have historically been dealt with reactively can be addressed proactively and, in some cases, avoided completely.
Involving your operations staff in the entire software life-cycle means the application no longer has to be shoehorned into an existing, rigid environment. Instead, application and environment can be built together, giving your developers the ideal environment in which to run their application. And with total transparency into the technology requirements, the operations team won’t end up having to upgrade a critical piece of infrastructure in a hurry before a deployment.
The biggest gain comes with total responsibility. Instead of each team being responsible only for its piece of the pie, everyone is responsible for the entire application, which means everyone works towards the common goal rather than focusing on their own part and signing off until the next project.
Challenges & Conclusion
Obviously, this all sounds great – but implementing DevOps isn’t an easy task. Breaking down barriers between groups that have traditionally worked separately takes time and collaboration. It will also require the leadership to accept responsibility for the entire application instead of one single component, which is daunting. Unfortunately, the best learning experiences come from failure, so you might have some growing pains while learning which metrics actually mean the application is going to have an issue. Ultimately, over time, you’ll get better at predicting issues and solving them before they happen, and you’ll be faster to resolution when issues do occur.
All in all, implementing DevOps is worthwhile, and while it definitely isn’t an easy task, the cost savings and peace of mind that come with a finely honed system more than make up for the hardships along the way.