This Learn On Demand Pro Series is part of a Career Path: Become a System Administrator
The learning objective of this virtual lab is to show learners how to protect their data on Linux through file archiving on the command line. This lab introduces the tar and md5sum commands. Tar, which stands for “tape archive,” is both a Linux command and a file format. The tar command creates an archive usually referred to as a “tarball.” The md5sum command is a checksum utility that uses the MD5 algorithm to compute a 128-bit checksum of a given file and prints it as 32 hexadecimal characters.
Understand the scenario: In this hands-on lab, you will take on the role of a Linux system administrator responsible for the content of a file server. You will use the tar utility to back up and compress files. You will also use the md5sum utility to verify the integrity of the file archive.
Understand the environment: You will use a default installation of CentOS 7 Linux with the Server with GUI package group installed. Non-privileged accounts will be created for you. You will be guided through the process of adding software if necessary.
Use tar and gzip to back up and compress files:
To back up and compress files, you will:
- Use the command line to create several working files and directories.
- Use tar and gzip to create a backup archive of these files.
- Use tar to append files to this archive.
- Use various tar options to create a second type of archive.
- Verify that each archive is smaller than the combined size of the original files.
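The steps in this list can be sketched roughly as follows. This is an illustration under assumptions, not the lab's actual exercise: the directory and file names are hypothetical, and GNU tar and gzip are assumed. Note that tar can only append (`-r`) to an uncompressed archive, so the sketch appends first and compresses afterward.

```shell
#!/bin/sh
# Sketch only: directory and file names below are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"

# Create a few highly compressible working files.
mkdir -p data/docs
for i in 1 2 3; do
    awk -v n="$i" 'BEGIN { for (j = 0; j < 2000; j++) print "sample line", n }' \
        > "data/docs/file$i.txt"
done
echo "late addition" > extra.txt

# Archive first, append, then compress: tar -r cannot update a compressed archive.
tar -cf backup.tar data
tar -rf backup.tar extra.txt
gzip -c backup.tar > backup.tar.gz

# A second type of archive: create and gzip-compress in one step with -z.
tar -czf backup2.tar.gz data extra.txt

# Compare the combined size of the originals with each compressed archive.
orig_bytes=$(cat data/docs/*.txt extra.txt | wc -c)
gz1_bytes=$(wc -c < backup.tar.gz)
gz2_bytes=$(wc -c < backup2.tar.gz)
echo "originals: $orig_bytes bytes"
echo "backup.tar.gz: $gz1_bytes bytes, backup2.tar.gz: $gz2_bytes bytes"
```

The sample files are deliberately repetitive so that the size reduction is easy to see; data that is already compressed, such as JPEG images, would shrink far less.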
Check the integrity of the backup job using md5sum:
To verify the file archive integrity, you will:
- Learn about the md5sum utility.
- Use the md5sum command to verify the integrity of the two archives and ensure none of the archived files are corrupted.
- Use the md5sum command to verify that the two archives contain the same data.
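One possible way to carry out these checks is sketched below, assuming GNU tar and the GNU coreutils md5sum; the file names are hypothetical. Two archives built in different ways can differ byte for byte (for example, in their gzip headers) even when they hold the same files, so the sketch hashes the archived data streamed out of each archive rather than the archive files themselves.

```shell
#!/bin/sh
# Sketch only: file names are hypothetical; assumes GNU tar and GNU md5sum.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir data
echo "alpha" > data/a.txt
echo "beta" > data/b.txt

# Build two archives of the same files in two different ways.
tar -cf one.tar data
gzip -c one.tar > one.tar.gz
tar -czf two.tar.gz data

# Record each archive's checksum, then re-check the files against the record.
md5sum one.tar.gz two.tar.gz > archives.md5
check_output=$(md5sum -c archives.md5)
echo "$check_output"

# gzip -t tests that each compressed stream is readable end to end.
gzip -t one.tar.gz two.tar.gz

# Compare archived data, not archive bytes: -O streams every member's
# contents to stdout, and md5sum hashes that stream.
sum_one=$(tar -xzOf one.tar.gz | md5sum | cut -d ' ' -f 1)
sum_two=$(tar -xzOf two.tar.gz | md5sum | cut -d ' ' -f 1)
echo "one: $sum_one"
echo "two: $sum_two"
```

The stream comparison relies on both archives storing their members in the same order, which holds here because both were created from the same directory in the same way.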
Restore data using tar:
You will restore the files archived in step one using tar and gzip by:
- Deleting the files you created.
- Using tar and gzip to unpack the archive.
- Using md5sum to validate the restored files.
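A restore of this kind might look like the following sketch. The file names are hypothetical, and GNU tar, find, and md5sum are assumed. Checksums are recorded before the originals are deleted so that the restored files can be checked against them.

```shell
#!/bin/sh
# Sketch only: file names are hypothetical; assumes GNU tar, find, and md5sum.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p data/docs
echo "quarterly report" > data/docs/report.txt
echo "meeting notes" > data/notes.txt

# Record a checksum for every file before backing up.
find data -type f -exec md5sum {} + > before.md5

# Back up, delete the originals, then restore from the archive.
tar -czf backup.tar.gz data
rm -rf data
tar -xzf backup.tar.gz

# Validate each restored file against the checksums recorded earlier.
restore_check=$(md5sum -c before.md5)
echo "$restore_check"
```

Because `md5sum -c` exits with a nonzero status when any file fails its check, the same command can also gate later steps in a backup script.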
After completing the “Back Up and Restore Files Using Tar in Linux” challenge, you will have accomplished the following:
- Bundled files using tar.
- Compressed the tarball using tar and gzip.
- Verified the integrity of the archive using md5sum.
- Restored files from a backup.
- Compared the original and backup files using md5sum.
Understanding the tar utility and how to back up and restore your files is important for any IT career. Some of the ways you will use tar in your everyday job are:
- Back up and restore important files, both yours and your users’.
- Save important space on your servers.
- Transfer files between disks easily and efficiently.
- Manage your users’ files.
- Manage the storage on your local disks and on your servers.
Tar is the most commonly used file archiver in Linux, so it is important for you to be comfortable with this utility. Archives created with the tar command record various file system attributes, such as names, time stamps, ownership, file access permissions, and directory organization. This makes tar a powerful tool, as it backs up not only your files but also important metadata about them. The tar command can also work with several archive formats and compression programs, providing flexibility to meet your backup needs. Tar by itself bundles files without shrinking them; paired with a compressor such as gzip, it produces a compressed archive that saves space on your servers and makes moving your files between disks simple and efficient.
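To see the metadata stored in an archive, the archive can be listed verbosely instead of extracted. The sketch below assumes GNU tar, and the directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: names are hypothetical; assumes GNU tar.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir project
echo "secret" > project/config.txt
chmod 640 project/config.txt

tar -czf project.tar.gz project

# -t lists members instead of extracting them; -v adds each member's
# permissions, owner/group, size, and modification time.
listing=$(tar -tzvf project.tar.gz)
echo "$listing"
```

The listing should show the file with the `-rw-r-----` permissions set by `chmod 640`, which indicates that the mode was stored in the archive alongside the data.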
Md5sum is used to verify the integrity of a file. Because backing up and restoring files can occasionally lead to corrupted files, you should always verify both the tar archive and the restored files. Printing a file’s checksum gives you a value to compare against a checksum recorded earlier. When comparing two files that should be identical, such as the original file and the restored-from-backup file, the checksums should match. If they do not, the files differ, and the integrity of the copy cannot be confirmed. In this way, md5sum can be used to verify the integrity of a file copy or backup restoration. This verification is a critical step in backing up and restoring your data and your users’ data.
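The comparison described above can be illustrated with a small sketch, assuming the GNU coreutils md5sum. The file names and the `sum_of` helper are hypothetical, and the corruption is simulated by appending bytes to the copy.

```shell
#!/bin/sh
# Sketch only: file names and the sum_of helper are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
echo "payroll data" > original.txt
cp original.txt restored.txt

# Print only the checksum field of md5sum's output for one file.
sum_of() { md5sum "$1" | cut -d ' ' -f 1; }

orig_sum=$(sum_of original.txt)
good_sum=$(sum_of restored.txt)

# Simulate corruption by changing the copy, then hash it again.
echo "stray bytes" >> restored.txt
bad_sum=$(sum_of restored.txt)

if [ "$orig_sum" = "$good_sum" ]; then echo "intact copy: checksums match"; fi
if [ "$orig_sum" != "$bad_sum" ]; then echo "altered copy: checksums differ"; fi
```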
Other Challenges in this Series:
- GUIDED CHALLENGE - Backup and Restore Files with cpio in Linux
- ADVANCED CHALLENGE - Can You Inventory Storage Space in Linux?