
The importance of being in control of your data and having a healthy database environment


It can be easy to forget important things when we're busy. In business, one of the most essential things is data. It is crucial to monitor the database regularly and to maintain the health of its environment, so keep a close eye on both and make scheduled check-ups a habit.

New data is constantly being written to the database tables, causing them to grow larger and larger. If you don't control their growth, it can lead to performance issues over time, such as longer loading times for users, scripts, API requests, and robots. Growing data also puts pressure on other resources such as CPU, memory, and disk.

 

A Health Check on the Database

As the tables in the databases grow larger and larger, the importance of controlling their size and growth rate increases. For table size and row count, it’s easy to follow the growth by, for example, checking DATA_LENGTH and TABLE_ROWS in INFORMATION_SCHEMA.TABLES. This gives you a feel for the current situation and for whether there are any anomalies. For example, is it reasonable to have close to 100 million rows in one table? Can the size be justified because the data is actually used, or should a clean-up be considered? The question of what row count is reasonable becomes even more interesting for tables holding frequently used data; MySQL’s slow query log will give you good information about how large tables affect performance (more about this further down).

Regarding the growth rate, you can extract data from the above table at regular intervals and graph it. That way you get a feel for whether the growth rate increases when, for example, new functionality is introduced.
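As a minimal sketch, assuming MySQL and a placeholder schema name (“netadmin” below), the current situation and a growth history could be captured like this; note that TABLE_ROWS is only an estimate for InnoDB tables:

```sql
-- List the largest tables in the schema by data size and row count.
-- 'netadmin' is a placeholder schema name; substitute your own.
SELECT TABLE_NAME,
       TABLE_ROWS,                                   -- estimate for InnoDB
       ROUND(DATA_LENGTH  / 1024 / 1024, 1) AS data_mb,
       ROUND(INDEX_LENGTH / 1024 / 1024, 1) AS index_mb
FROM   INFORMATION_SCHEMA.TABLES
WHERE  TABLE_SCHEMA = 'netadmin'
ORDER  BY DATA_LENGTH DESC
LIMIT  20;

-- Snapshot the same figures into a history table on a schedule
-- (cron, a MySQL event, or similar) to graph the growth rate over time.
CREATE TABLE IF NOT EXISTS table_size_history (
  sampled_at  DATETIME,
  table_name  VARCHAR(64),
  table_rows  BIGINT,
  data_bytes  BIGINT
);

INSERT INTO table_size_history
SELECT NOW(), TABLE_NAME, TABLE_ROWS, DATA_LENGTH
FROM   INFORMATION_SCHEMA.TABLES
WHERE  TABLE_SCHEMA = 'netadmin';
```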

To ensure the health of the database tables, it’s also wise to verify that, e.g., CHECK TABLE is actually run against them on a regular basis. You certainly want to avoid taking backups of corrupt or broken tables.
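What that verification could look like, as a small sketch (the table names here are hypothetical examples):

```sql
-- Verify table integrity before taking a backup; a healthy table
-- returns a row with Msg_type 'status' and Msg_text 'OK'.
CHECK TABLE subscriptions;

-- Several tables can be checked in one statement:
CHECK TABLE subscriptions, invoices, tickets;
```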

If you find this tricky, feel free to discuss it with us at Netadmin. We’re careful to ensure that your system's database health is good – both for functional purposes and user experience!

 

Health Checks from an Environmental and Performance Perspective

Apart from the database itself, it’s also important that the database operates in an environment where the surrounding factors and resources are healthy and regularly monitored. Under-dimensioned resources can otherwise hurt performance. Some examples of things that are sensible to monitor:

  • CPU usage/load average: It’s interesting to watch the trends for CPU/loadavg to determine whether the available CPU capacity is enough for the growing amount of data handled by the system.

  • Hard drive usage: Monitoring disk usage is certainly very important; when the disk storage is full, many functions, both in applications and in the OS, will likely have trouble operating normally.

  • Hard drive I/O: This depends somewhat on the storage setup. Does the disk keep up with reads and writes comfortably, or does it barely complete the task? It may also be interesting to see what SMART says about the drives.

  • Memory usage: How does memory usage look when the most memory-demanding functions are running? Is it borderline?

  • Network usage: This may also be a point of interest. Monitor it to discover anomalies in the amount of data transferred per unit of time (especially when unusually little data is being transferred). This can be very useful information when troubleshooting.

  • IIS: How do the response times look? Are there requests that have been hanging unreasonably long and should be investigated? Do those requests correlate with issues in the database, perhaps during certain hours of the day?

 

The Performance Perspective

Large database tables, especially frequently used ones, can have a big effect on the perceived performance of the system. As mentioned under the previous subheading, it’s definitely a good idea to keep the tables at a healthy size as a precaution. However, it can be difficult to know where to draw the line: how many rows or gigabytes are reasonable? Monitoring table locks can help you figure out which tables need to be reviewed, and perhaps cleaned up, to attain better performance. Table locks are normal in MySQL, but there are limits to what’s reasonable in terms of locking and waiting times. To get a good overview of the lock situation, you can choose between different methods, depending on the level of detail you need. Some alternatives are (a combined sketch follows the list):

  • You can create your own surveillance with, for example, “SHOW FULL PROCESSLIST” and handle the results as you wish.

  • Fetch the global status value “Slow_queries” (which is a counter) on a periodic schedule to see trends and peaks, for example how many queries per hour or day exceed the specified “long_query_time”.

  • Activate MySQL’s built-in slow query log, which by default logs queries that run for more than 10 seconds (the “long_query_time” setting). Remember that this logger can, to a certain extent and depending on the queries, affect performance when activated.
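As a combined sketch of the alternatives above, using standard MySQL statements (the threshold is just an example, and a changed global long_query_time only applies to new connections):

```sql
-- 1) Ad-hoc surveillance: what is running right now, and what is waiting?
SHOW FULL PROCESSLIST;

-- Table-lock counters show how often queries had to wait for a lock;
-- compare Table_locks_waited against Table_locks_immediate over time.
SHOW GLOBAL STATUS LIKE 'Table_locks%';

-- 2) Poll the Slow_queries counter on a schedule and diff consecutive
--    values to see trends and peaks per hour or day.
SHOW GLOBAL STATUS LIKE 'Slow_queries';

-- 3) Activate the built-in slow query log (can also be set in my.cnf).
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 10;  -- seconds; 10 is the default
```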

If you would like a good compilation of MySQL’s slow query log, take a look at Percona’s “pt-query-digest”, which gives you a good overview of the worst performance thieves, ranked with a point system. If you look at the log and notice that locks occur every day at 2 PM, you may be able to trace them to a customer-specific scheduled task/job. If possible, it’s better to schedule jobs outside office hours to minimize the effect on system users. In cases where it’s not possible to reschedule jobs, it may be time to review the size of the tables that are causing the long locks. When you’ve analyzed the tables and determined that there is no more data that can be cleaned out, there are some options to consider:

  • Can the functionality that’s causing the heavy database queries be rewritten, or optimized and adjusted to present and future data volumes?

  • Can the locks be traced to non-product-related tables? In that case, a review of the indexes could be helpful.

  • Is it time to consider a Master/Slave setup for the database? If the functionality behind the locks can be redirected to a Slave database, this will have a big, positive effect on the Master database. Another huge advantage of the M/S setup is that you can run database backups without affecting the production environment. On the other hand, an M/S setup requires some maintenance, but the advantages often outweigh the disadvantages when used in the right way.
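For illustration, a minimal sketch of pointing a Slave at an existing Master with classic (pre-MySQL 8.0) replication syntax; the host, user, password, and log coordinates are placeholders, and the Master is assumed to already have binary logging and a unique server_id configured:

```sql
-- Run on the Slave. All connection details below are placeholders.
CHANGE MASTER TO
  MASTER_HOST     = 'master.example.com',
  MASTER_USER     = 'repl',
  MASTER_PASSWORD = '...',
  MASTER_LOG_FILE = 'mysql-bin.000001',
  MASTER_LOG_POS  = 4;

START SLAVE;

-- Verify replication health; Seconds_Behind_Master should stay near zero.
-- (\G is a mysql client formatting directive.)
SHOW SLAVE STATUS\G
```

Once the Slave keeps up, heavy read-only jobs and backups can be pointed at it instead of the Master.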

 

Conclusion

The purpose of this article is to remind you how important it is that your data and its surrounding environment stay healthy, since they are such a crucial part of your business. That means keeping the system responsive with healthy datasets, but also taking precautions against possible future issues.

Perhaps you don’t need to start working on all the different controls at once; if you aren’t monitoring much today, it’s better to begin with the most important ones. Some monitoring and scheduled, regular controls are better than none at all. A strong recommendation is also to do more than just monitor: it’s obviously a great advantage to receive alerts when specified thresholds are exceeded as well! And verify regularly that the monitoring and alerts actually work.

An Application Management commitment at Netadmin can help you with much of the above, as well as many other important improvements and preventative measures. Please contact us at Netadmin if you’re interested in help with the things described here, amongst other related things.

 

 
