Server maintenance

Server maintenance involves keeping a server updated and running so that a computer network can operate smoothly. Properly maintaining a server is usually the task of a network administrator, and it is vital to the performance of the network. Without regular maintenance, application software on a network of any size usually will not run as well as expected. In some cases, the network may even fail partially or completely.

Maintaining a server requires a network administrator to conduct preventive maintenance. Essentially, this means that the administrator must review the server’s performance as well as any potential security risks and backup protocols at regular intervals. As part of this, he or she typically ensures that automated system monitoring utilities are installed and appropriately configured. These utilities often come with the server’s hardware package.

Even with automated utilities in place, a solid server maintenance plan requires several basic steps. First, the administrator usually conducts a thorough examination of the network, including checking server log files, hard disk space, folder permissions, and redundancy. This review typically also includes checking temperature-monitoring applications to ensure the server does not get too hot. If a machine overheats, vital equipment like the central processing unit, memory, or motherboard can be damaged.
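
As a rough illustration of two of the manual checks above (scanning log files and reviewing folder permissions), the following Python sketch may help. The function names, the keyword list, and the sample log lines are assumptions made for this example; they do not come from any particular monitoring utility, and a real review would likely use whatever tooling ships with the server.

```python
import os
import stat


def find_error_lines(log_lines, keywords=("error", "critical", "fail")):
    """Return the log lines that contain any keyword (case-insensitive).

    The keyword list is an assumption for this sketch; adjust it to the
    log format your services actually produce.
    """
    hits = []
    for line in log_lines:
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append(line.rstrip("\n"))
    return hits


def is_world_writable(path):
    """Return True if the 'other' write bit is set on the path (POSIX)."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IWOTH)


if __name__ == "__main__":
    # Hypothetical log excerpt used only to demonstrate the filter.
    sample = [
        "Jan 01 00:00:01 host service: started\n",
        "Jan 01 00:05:12 host kernel: I/O error on device\n",
    ]
    for line in find_error_lines(sample):
        print("needs review:", line)

    # Check the current directory as a harmless demonstration target.
    print("world-writable:", is_world_writable("."))
```

A simple keyword filter like this will produce false positives, so it is best treated as a way to shortlist lines for a human to read rather than as an alerting system.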

Server Maintenance Tips

1. Verify your backups are working. Before making any changes to your production system, be sure that your backups are working. You may even want to run some test recoveries if you are going to delete critical data. While focused on backups, you may want to make sure you have selected the right backup location.

2. Check disk usage. Don’t use your production system as an archival system. Delete old logs, emails, and software versions no longer used. Keeping your system free of old software limits security issues. A smaller data footprint means faster recovery should a disk fail. If your usage is exceeding 90% of disk capacity, either reduce usage or add more storage. If your partition reaches 100%, your server may stop responding, database tables can become corrupted, and data can be lost.

3. Check RAID Alarms. If you are using RAID (and you should be), check that your RAID’s error notification system is configured properly and works as expected. Common RAID levels such as a two-disk RAID 1 mirror or RAID 5 tolerate only a single disk failure. If you miss a RAID notification and a second disk fails before the first is replaced, a simple disk replacement could turn into a catastrophic failure.

4. Update your OS. Updates for Linux systems are released almost daily. Many of these fix important security issues. At rackAID, we update systems daily (sometimes even more frequently). If you do not have a management service or auto-updates enabled, be sure to review your OS for any critical security updates. Get on the mailing list for your OS so you know when critical security patches are released. If you have a kernel update, you will need to reboot your server unless you use a tool like Ksplice.

5. Update your Control Panel. If you are using a hosting or server control panel, be sure to update it as well. Sometimes this means updating not only the control panel itself, but also software it controls. For example, with WHM/cPanel, you must manually update PHP versions to fix known issues. Simply updating the control panel does not also update the underlying Apache and PHP versions used by your OS.
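
Some of the checks in tips 1 to 3 can be partly automated. The Python sketch below is one possible approach, not a definitive implementation. It assumes a single backup file whose age and size are meaningful, a Linux host using software RAID (md) that exposes its status in /proc/mdstat, and the 90% disk threshold from tip 2. All function names, the 24-hour freshness window, and the example backup path are hypothetical. Hardware RAID controllers usually need the vendor's own tools instead.

```python
import os
import re
import shutil
import time

DISK_WARN_PERCENT = 90  # threshold taken from tip 2


def disk_percent_used(path="/"):
    """Percentage of the filesystem holding `path` that is in use.

    This is used/total, so it can differ slightly from what `df` reports.
    """
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def disk_needs_attention(percent_used, threshold=DISK_WARN_PERCENT):
    """True when usage exceeds the warning threshold."""
    return percent_used > threshold


def backup_is_fresh(path, max_age_hours=24, now=None):
    """True if the backup file exists, is non-empty, and was modified
    within the last `max_age_hours`. The 24-hour default is an assumption.
    This does not replace a real test recovery.
    """
    if not os.path.isfile(path):
        return False
    info = os.stat(path)
    if info.st_size == 0:
        return False
    now = time.time() if now is None else now
    return (now - info.st_mtime) <= max_age_hours * 3600


def degraded_md_arrays(mdstat_text):
    """Return names of md arrays whose member map (e.g. [U_]) shows a
    missing disk. Expects the text of /proc/mdstat."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name_match = re.match(r"^(md\w+)\s*:", line)
        if name_match:
            current = name_match.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if status and current and "_" in status.group(1):
            degraded.append(current)
    return degraded


if __name__ == "__main__":
    percent = disk_percent_used("/")
    print(f"disk used: {percent:.1f}%")
    if disk_needs_attention(percent):
        print("warning: reduce usage or add storage")

    # Hypothetical backup location; replace with your own.
    print("backup fresh:", backup_is_fresh("/var/backups/site.tar.gz"))

    if os.path.exists("/proc/mdstat"):
        with open("/proc/mdstat") as handle:
            print("degraded arrays:", degraded_md_arrays(handle.read()))
```

A script like this could be run from a scheduler and wired to email or chat alerts, but it only supplements the RAID controller's own notifications and periodic test restores.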
