Email isn’t the only automation candidate. As we already mentioned, anything we do periodically and consistently should make us think about automation; we just need to identify what those things are in our environment. For instance, if our website has hundreds of images that need to be optimized, optimizing them one by one doesn’t scale. On the other hand, optimizing only a couple of images lets us easily compare different services and the compression ratios they achieve. We can then choose the best solution where it matters, even when the process cannot be automated.
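Such a comparison can itself be a tiny script: compute the compression ratio each service achieves on the same sample images and pick the winner. A minimal sketch — the service names and byte counts below are hypothetical placeholders, not real measurements:

```python
def compression_ratio(original_bytes: int, optimized_bytes: int) -> float:
    """Fraction of the original size saved by optimization (0.0 to 1.0)."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return 1 - optimized_bytes / original_bytes

def best_service(results: dict[str, tuple[int, int]]) -> str:
    """Pick the service with the highest compression ratio.

    `results` maps a service name to (original, optimized) sizes in bytes
    measured on the same sample image.
    """
    return max(results, key=lambda s: compression_ratio(*results[s]))

# Hypothetical measurements from optimizing the same sample image:
sample = {
    "service_a": (120_000, 80_000),  # saved about a third
    "service_b": (120_000, 60_000),  # saved half
}
print(best_service(sample))  # → service_b
```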
If there is a high probability that a server component will fail, we could take precautions and add some redundancy, whose cost corresponds to the frequency and severity of the problem. Once the component eventually fails, the redundant one takes over its functions automatically, with zero downtime for the clients whose data resides on that server.
Another example is website usage statistics. Few businesses aren’t constantly checking their numbers and using them to adjust their strategies. The danger is that too much attention goes to the numbers and too little to the actions that actually produce them. Some analysis is important, but too much of it can be paralyzing.

The first step is to recognize how often we are doing this. If we check the statistics four times a day and it takes us 20 minutes in total, then this is probably an automation candidate. Next, we could try to understand what data we are actually looking at. A program that plots visually appealing charts has to take its data from somewhere, and the “no program without data” idea leads us to the access logs. We can inspect what they contain, find where the data most important to us appears, and see how it is formatted.

Each access log is accessible through FTP, so we could type our credentials once and then connect and disconnect as often as we need by executing code rather than by clicking on GUI controls. Once the connection is established, we retrieve the files in binary mode, copy their contents to separate files, and close the connection. A separate script can then parse the files and extract the data we need, using regular expressions. It can also plot this data, creating a separate time series for each variable of interest within each logging period. Finally, another script can be responsible for presenting these plots, accessible through a simple browser bookmark. This script can also do some simple sorting and filtering, so that, for instance, only plots with data from the last year are shown in reverse chronological order. Now, any time we click the bookmark, the log file for the last month is refreshed and the plot we see is updated as well.
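The retrieval and parsing steps can be sketched in Python. The `fetch_log` helper uses the standard `ftplib` module (the host, credentials, and paths are hypothetical placeholders), and the parser assumes a common Apache-style access log format — a real log may need an adjusted regular expression:

```python
import re
from collections import Counter
from ftplib import FTP

def fetch_log(host, user, password, remote_path, local_path):
    """Retrieve an access log over FTP in binary mode — connecting and
    disconnecting by code rather than by clicking GUI controls."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        with open(local_path, "wb") as f:
            ftp.retrbinary(f"RETR {remote_path}", f.write)

# Matches the date and request path in an Apache-style access log line.
LOG_RE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]+\]\s+"(?:GET|POST|HEAD)\s+(?P<path>\S+)'
)

def daily_hits(log_lines):
    """Count requests per day — one time-series variable of interest."""
    hits = Counter()
    for line in log_lines:
        m = LOG_RE.search(line)
        if m:
            hits[m.group("day")] += 1
    return hits

sample = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '203.0.113.6 - - [01/Jan/2024:11:30:00 +0000] "GET /about.html HTTP/1.1" 200 256',
    '203.0.113.5 - - [02/Jan/2024:09:15:00 +0000] "POST /contact HTTP/1.1" 302 128',
]
print(daily_hits(sample))  # two hits on 01/Jan, one on 02/Jan
```

The resulting counts are exactly the kind of per-period time series a separate plotting script could render.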
All without having to log in to a control panel, seek the icon of the log-analysis software, choose a subdomain for which we want statistics, dig through potentially irrelevant data, or stare at a design we are not happy with. The automated solution gives us the flexibility to organize and present the data in the way that makes the most sense for our concrete case. What it can’t do, however, is generate data from sources that aren’t present in the logs; for that we would need to write our own statistical software. The result can look something like this:
Using unit tests to automate software testing is another area where a lot can be gained, but it often requires learning additional libraries, tools, and thought patterns. Sometimes the cost of automation is too high, so we shouldn’t blindly assume that just because automation is possible, it will pay off in every situation.
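As a minimal illustration with Python’s built-in `unittest` module — the `normalize_path` helper is a hypothetical function standing in for whatever code is worth testing:

```python
import unittest

def normalize_path(path: str) -> str:
    """Collapse duplicate slashes and strip a trailing slash
    (a hypothetical helper used only to demonstrate the tests)."""
    while "//" in path:
        path = path.replace("//", "/")
    return path.rstrip("/") or "/"

class NormalizePathTests(unittest.TestCase):
    def test_collapses_duplicate_slashes(self):
        self.assertEqual(normalize_path("/a//b///c"), "/a/b/c")

    def test_strips_trailing_slash(self):
        self.assertEqual(normalize_path("/a/b/"), "/a/b")

    def test_root_is_preserved(self):
        self.assertEqual(normalize_path("/"), "/")

if __name__ == "__main__":
    unittest.main(exit=False)
```

Once written, such tests run automatically on every change — the periodic, consistent activity of manual re-checking is exactly what they replace.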