SAVIOR.EXE: A general solution for backing up DOS saves

ThumbKnock

Work in Progress

This is a design document specifying the functionality of a program that is not finished. I've implemented a lot of the functionality described below, but not all of it, and there are probably a lot of bugs. I can make an alpha version available if anyone is interested in trying it out, but I wouldn't recommend distributing it with a game yet.

Introduction

I've been working on a general, automatic solution to the problem of backing up and restoring save and config files for 0MHz-style VHD games for the AO486 MiSTer FPGA core. Here's a summary of how it works.

A DOS program called SAVIOR.EXE is added to AUTOEXEC.BAT or RUNGAME.BAT (along with MISTERFS.EXE), both before and after the game runs.

The first time you boot a VHD, this program makes a list of all the files in the VHD, along with their CRC-32 hashes. It stores this list as MANIFEST.VIO in the VHD. It then checks for previously backed-up save files and config files on a MiSTerFS drive and attempts to restore them.

Then, whenever you boot that same VHD and whenever you exit the game, SAVIOR.EXE searches the VHD for files not listed in the manifest and files with CRC-32 hashes that differ from those in the manifest, and it backs them up to a MiSTerFS drive.
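As a rough illustration of the manifest step, here is a minimal Python sketch (not the actual C source). It assumes the VHD's contents are visible as an ordinary directory tree, and it uses `zlib.crc32`, the common zip-style CRC-32; whether SAVIOR.EXE uses that exact CRC-32 variant is an assumption. The names `crc32_of_file` and `build_manifest` are hypothetical.

```python
import os
import zlib

def crc32_of_file(path, chunk_size=65536):
    """Return the CRC-32 of a file, reading it in chunks to limit memory use."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

def build_manifest(root):
    """Map each file's path (relative to root) to its CRC-32 hash."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = crc32_of_file(full)
    return manifest
```

Calling `build_manifest` on a directory returns a dictionary that plays the role MANIFEST.VIO plays in the VHD.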

Details

Usage Example

In AUTOEXEC.BAT:

Code: Select all

MISTERFS.EXE S /q
SAVIOR.EXE \ S:\SAVIOR\KEEN1
MISTERFS.EXE /u /q
CD KEEN1
CALL RUNGAME.BAT
CD ..
MISTERFS.EXE S /q
SAVIOR.EXE \ S:\SAVIOR\KEEN1
MISTERFS.EXE /u /q

Or in RUNGAME.BAT:

Code: Select all

MISTERFS.EXE S /q
SAVIOR.EXE \ S:\SAVIOR\KEEN1
MISTERFS.EXE /u /q
KEEN1.EXE
MISTERFS.EXE S /q
SAVIOR.EXE \ S:\SAVIOR\KEEN1
MISTERFS.EXE /u /q

This will cause SAVIOR.EXE to scan the VHD's root directory \ and create MANIFEST.VIO in the same directory, or load MANIFEST.VIO if it already exists. It will save backups to (and potentially restore them from) S:\SAVIOR\KEEN1.

Before and After

Why does SAVIOR.EXE run before the game starts and after the game exits?

  • It runs before the game so that it can restore any existing backups before you start playing. (So if you've downloaded a new VHD of the game, you can pick up where you left off.) Also, running before the game means your save and config files can get backed up eventually, even if you never cleanly exit the game.

  • It runs after the game so that save and config files are backed up as soon as possible after they're written. If you cleanly exit the game, you shouldn't have to start it again just to back up your data.

Backing Up Files

Save Files

A save file is a file that did not exist when the manifest was created. Save files are always backed up.

Configuration Files and Self-Modifying Executables

Some VHDs include files that are necessary for the game to function but can be modified by the game. The most common examples of this are probably configuration files, but there are others, such as Starflight's self-modifying executables. This is why SAVIOR.EXE checks CRC-32 hashes instead of just checking for files not listed in the manifest. By checking every file for changes, it can identify all the files that need to be backed up in order to restore the game's state.

SAVIOR.EXE could use modification dates and file sizes instead, but it's possible for the contents of a file to change without the date or size changing, and vice versa. For example, if a new VHD is released for a game, all of its files might have more recent modification dates than they do in the older VHD, even though they're the same files. (To be fair, CRC-32 hashes aren't 100% reliable either, but the probability of a changed file producing the same 32-bit hash is about one in 2^32, or roughly one in 4.3 billion.)
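The two categories described in this section (files missing from the manifest, and files whose hash differs from the manifest) can be sketched in Python as a comparison of two dictionaries. This is only an illustration under the assumption that both the manifest and the current scan map paths to CRC-32 values; `find_files_to_back_up` is a hypothetical name.

```python
def find_files_to_back_up(manifest, current):
    """Split the current files into new files and modified files.

    manifest: {path: crc32} recorded when the manifest was created
    current:  {path: crc32} computed from the VHD right now
    """
    # Save files: present now, but not listed in the manifest.
    new_files = sorted(p for p in current if p not in manifest)
    # Config files and similar: listed in the manifest, but the hash changed.
    modified = sorted(
        p for p in current if p in manifest and current[p] != manifest[p]
    )
    return new_files, modified
```

Files that appear in both dictionaries with equal hashes are left alone.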

Archive Bit

In FAT filesystems, each file has an "archive bit" that can be set or unset. DOS sets a file's archive bit whenever a file is written to. Backup programs can clear a file's archive bit after making a backup copy of that file. The archive bit thus allows backup programs to detect files that have been changed since the last backup.

When SAVIOR.EXE backs up a file, or when it determines that a file's CRC-32 hash matches the hash in the manifest, it clears the file's archive bit.

If you run SAVIOR.EXE with the /a option, it will skip files with unset archive bits when determining whether a file needs to be backed up. This means it won't bother computing their CRC-32 hashes, which will save time.

It should be safe to use the /a option in most cases, but if a game messes with its own archive bits for some reason, then this option might cause SAVIOR.EXE to skip a file that should be backed up.

VHD authors who use the /a option might want to clear the archive bit of all the files in their VHDs before distributing them. This will speed up the initial boot if they also bundle a manifest.
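The archive-bit shortcut can be sketched as follows. In FAT directory entries the archive attribute is bit 0x20 of the attribute byte; the function names and the boolean standing in for the /a option are hypothetical, and this Python sketch only models the decision, not the DOS calls that read or write attributes.

```python
ATTR_ARCHIVE = 0x20  # archive bit in a FAT attribute byte

def needs_hash(attributes, use_archive_option):
    """Decide whether a file's CRC-32 must be computed.

    With the /a behaviour enabled, a file whose archive bit is clear is
    assumed unchanged since the last run and is skipped.
    """
    if use_archive_option and not (attributes & ATTR_ARCHIVE):
        return False
    return True

def clear_archive(attributes):
    """Return the attribute byte with the archive bit cleared."""
    return attributes & ~ATTR_ARCHIVE & 0xFF
```

A file that DOS has written to has the bit set again, so `needs_hash` returns `True` for it regardless of the option.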

Restoring Files

When SAVIOR.EXE runs, it checks to see if it should restore files. There are two conditions that will cause it to restore files.

  1. MANIFEST.VIO doesn't exist because the VHD hasn't been booted before. In this case, SAVIOR.EXE will first create a manifest, then attempt to restore files.

  2. The manifest exists (perhaps because the VHD author included one), but no files in the VHD are found to be new or modified. This would indicate a clean install of a game, and if you have files backed up, you'd probably want to restore them.
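Those two conditions reduce to a small decision function. The Python sketch below is only a restatement of the rules above; `should_restore` and its parameters are hypothetical names.

```python
def should_restore(manifest_exists, changed_file_count):
    """Return True when backed-up files should be restored.

    manifest_exists:    whether MANIFEST.VIO was found in the VHD
    changed_file_count: number of new or modified files found in the VHD
    """
    if not manifest_exists:
        # First boot: the manifest is created first, then a restore is attempted.
        return True
    # Manifest present and nothing new or modified: looks like a clean install.
    return changed_file_count == 0
```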

Resolving Conflicts

If a clean (freshly downloaded) VHD contains a configuration file, and the game modifies it at some point, and the modified version gets backed up, what should happen if the first VHD is replaced with a second clean VHD that includes a different version of the same config file? While SAVIOR.EXE is attempting to restore backed-up files, it will detect such a conflict and ask you whether you really want to restore your backup, which is based on an older version of the file.

When SAVIOR.EXE saves a backup, it also stores the CRC-32 hash of the original file (i.e., its hash from the manifest). That way, it can tell whether your backed-up copy is a customized version of the same file, which would not be a conflict, or a customized version of a different file, which would be a conflict.

If you choose to restore your config file, the backed-up CRC-32 hash (i.e., the one that was copied from the first VHD's manifest) and the filename are saved to INJECTED.VIO. Entries in this file act like entries in MANIFEST.VIO but supersede them. This means subsequent backups of the file will retain the original CRC-32 hash from the first VHD's manifest, so you will be prompted again if you replace the VHD again (even if the file in the third VHD is the same as the file in the second VHD).
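The conflict rule can be sketched in Python like this. It assumes MANIFEST.VIO and INJECTED.VIO are both modelled as dictionaries from path to CRC-32, and it assumes that a backed-up file with no manifest entry in the new VHD (a plain save file) is never a conflict; all function names are hypothetical.

```python
def is_conflict(backup_original_crc, vhd_manifest_crc):
    """A backup conflicts when it was derived from a different original file."""
    if vhd_manifest_crc is None:
        # Assumption: files the VHD does not ship (save files) cannot conflict.
        return False
    return backup_original_crc != vhd_manifest_crc

def record_restore(path, backup_original_crc, injected):
    """Remember the hash the restored file was originally based on."""
    injected[path] = backup_original_crc

def original_crc_for_backup(path, manifest, injected):
    """INJECTED.VIO entries supersede MANIFEST.VIO entries."""
    return injected.get(path, manifest.get(path))
```

Because `original_crc_for_backup` prefers the injected entry, a later backup keeps the first VHD's hash, which is why a third VHD triggers the prompt again.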

Injecting Files

If you want to force SAVIOR.EXE to restore a file, you can inject it into the VHD by placing it in a special MiSTerFS directory (e.g. S:\SAVIOR\MYGAME\INJECT). The file will be moved into place in the VHD next time it's booted. Its hash (computed upon injection) and name will be stored in INJECTED.VIO.

One interesting way this could be used is to inject alternate config files, such as for MT-32 support. Instead of distributing two whole copies of a game with different sound settings, a VHD author could package one VHD with Sound Blaster support enabled, plus a separate download that contains only an MT-32 version of the config file and unzips to games/AO486/shared/SAVIOR/MYGAME/INJECT/SOUND.CFG.

MT-32 users would just have to download and unzip both files. SAVIOR.EXE would move the MT-32 config file into place the first time it ran.
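A minimal Python sketch of the injection step is shown below. It treats the INJECT directory and the VHD as ordinary directory trees and returns a dictionary standing in for the entries written to INJECTED.VIO; `inject_files` is a hypothetical name, and the use of `zlib.crc32` is an assumption about the CRC-32 variant.

```python
import os
import shutil
import zlib

def inject_files(inject_dir, vhd_root):
    """Move every file under inject_dir into the same relative path in vhd_root.

    Returns {relative_path: crc32}, with each hash computed at injection time.
    """
    injected = {}
    for dirpath, _dirnames, filenames in os.walk(inject_dir):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, inject_dir)
            dst = os.path.join(vhd_root, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            with open(src, "rb") as f:
                crc = zlib.crc32(f.read()) & 0xFFFFFFFF
            shutil.move(src, dst)  # the file leaves INJECT once it is in place
            injected[rel] = crc
    return injected
```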

Built-In Manifests

VHD authors may include a manifest in their VHD images. You can force SAVIOR.EXE to generate a manifest (and do nothing else) by running it without a destination path.

Manifests can contain extra information about each file, such as a flag that means "this is known to be a static file, not a save or config file that we might want to back up, so don't bother computing its hash." Using this flag will make SAVIOR.EXE run more quickly. But this is just an optimization step and isn't mandatory.

Comments are also allowed in the manifest file. VHD authors can use comments to explain what various files are for. (Example: if the VHD contains a trainer, that file can be labeled as a custom addition to the VHD.) Any part of a line that follows a semicolon is considered a comment.
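To illustrate the comment rule, here is a Python sketch of parsing one manifest line. Only the semicolon rule comes from the description above. The line layout used here (path, hexadecimal CRC-32, optional `S` flag for static files) is a made-up format for illustration and is not the real MANIFEST.VIO format; `parse_manifest_line` is a hypothetical name.

```python
def parse_manifest_line(line):
    """Parse 'PATH CRCHEX [S] ; comment' into (path, crc, is_static).

    Returns None for blank lines and comment-only lines.
    NOTE: the field layout is hypothetical; only ';' comments are from the spec.
    """
    text = line.split(";", 1)[0].strip()  # drop everything after a semicolon
    if not text:
        return None
    fields = text.split()
    path = fields[0]
    crc = int(fields[1], 16)
    is_static = len(fields) > 2 and fields[2].upper() == "S"
    return path, crc, is_static
```

An entry flagged as static would let the scanner skip hashing that file entirely.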

Shared Directory Layout

The MiSTerFS shared directory for a particular game is specified as the second path passed to SAVIOR.EXE (e.g. S:\SAVIOR\KEEN1). Within that directory, you'll find these subdirectories:

  • BACKUP (used for storing the config and save files themselves)

  • CRC (used for storing the CRC-32 hashes copied from the manifest)

  • INJECT (files placed here will be injected into the VHD next time it's booted)

Each subdirectory internally uses the same tree structure, which mirrors the structure of the VHD. So all of the following files would correspond to each other:

  • C:\DATA\CONFIG.DAT (the config file in the VHD image)

  • S:\SAVIOR\MYGAME\BACKUP\DATA\CONFIG.DAT (backup of the config file itself)

  • S:\SAVIOR\MYGAME\CRC\DATA\CONFIG.DAT (stores this file's CRC-32 copied from the manifest)

  • S:\SAVIOR\MYGAME\INJECT\DATA\CONFIG.DAT (will overwrite C:\DATA\CONFIG.DAT upon next boot)
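The mirroring between the VHD and the three subdirectories can be sketched in Python with the standard `ntpath` module, which handles DOS-style paths on any host OS. `shared_paths` is a hypothetical helper name.

```python
import ntpath

def shared_paths(vhd_path, shared_dir):
    """Map a file path in the VHD to its BACKUP, CRC, and INJECT counterparts."""
    _drive, rest = ntpath.splitdrive(vhd_path)
    rel = rest.lstrip("\\")  # path relative to the VHD's root directory
    return {
        sub: ntpath.join(shared_dir, sub, rel)
        for sub in ("BACKUP", "CRC", "INJECT")
    }
```

For example, `shared_paths(r"C:\DATA\CONFIG.DAT", r"S:\SAVIOR\MYGAME")["BACKUP"]` gives `S:\SAVIOR\MYGAME\BACKUP\DATA\CONFIG.DAT`.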

Building

This application was developed using the Open Watcom 1.9 C compiler. Use WMAKE.EXE to build it.

Limitations

SAVIOR.EXE runs in real mode so the executable can be smaller. This means it can currently only keep track of 3200 files due to RAM limitations.

If SAVIOR.EXE runs after the CPU has been throttled by SYSCTL.EXE, then computing the CRC-32 hashes of all the files on the VHD might take a long time. On the other hand, games that require a low CPU speed probably don't have a lot of big files.

A typical SAVIOR.EXE setup relies on MiSTerFS, which requires DOS 5.00 or greater.

SAVIOR.EXE only supports short (8.3) filenames.

CRC-32 is fast, which is good given the speed of the AO486 core, but it's not the best hashing algorithm. Its 32-bit output means it will produce more false matches than a longer hash would.

AmintaMister

This is a fantastic design document and a reliable solution for DOS game backups. I'm looking forward to the first implementation!

mrchrister

Great idea!
A DOS saves backup solution is much needed and I've been toying with an idea to transfer saves as well.
The big drawback of doing it on the DOS side is obviously speed of accessing all files and building the database. Also, having a limitation of 3200 files is not ideal, but I'm not sure how often we would hit that limit.
If we backed up saves on the Linux side of MiSTer, we might be able to streamline the process more.

  1. Create a file list of the contents of the VHD (only needs to be done once per VHD) or sort by newest files first
  2. Check if any files have been added
  3. Copy those files somewhere safe
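The three steps above can be sketched in Python as follows. This assumes the VHD's contents are already reachable as a directory on the Linux side, which is an open question in this thread, and that the destination lies outside that directory; `list_files` and `copy_added_files` are hypothetical names.

```python
import os
import shutil

def list_files(root):
    """Step 1: return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found

def copy_added_files(root, known_files, dest):
    """Steps 2 and 3: find files not in known_files and copy them to dest."""
    added = sorted(list_files(root) - set(known_files))
    for rel in added:
        target = os.path.join(dest, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(os.path.join(root, rel), target)  # keeps timestamps
    return added
```

Note that this only catches added files; files that the game modifies in place would need a content or timestamp comparison as well.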

Looking forward to checking out your solution once it's ready, and if you run into any roadblocks, I'm happy to help with a Linux script to achieve the same thing.

Bas

Dumb question maybe, but how is this conceptually different from using an already existing backup solution from the days of DOS? I love the idea at first glance, but it does not look all that different from existing backup tools to me. I must admit that I only just skimmed your design though. A potential orange flag I'm seeing is relying on accuracy of timekeeping in DOS, which may get you bitten by Y2K or inaccurate timekeeping due to wonky host system date/time. Having SAVIOR do its own bookkeeping for the order-in-time of files could maybe fix this. You don't need to know the exact time when a file was written, you just need to know which one was there first... but that would then take X amount of time at first boot of a VHD.

I'm planning a feature for DOSContainer that would push/pull deltas (that get zipped) from the VHD like a sky fairy from Linux rather than interfere with the DOS environment itself. That's still pretty far off though, as I'm still working on an accurate self-contained FAT12/16 implementation to enable this at all.

ThumbKnock

Thanks for the feedback!

mrchrister wrote: Mon Apr 29, 2024 1:12 am

The big drawback of doing it on the DOS side is obviously speed of accessing all files and building the database.

It's true that the AO486 core is comparatively slow. The md5sum Linux program running on MiSTer generates MD5 hashes about twelve times as fast as SAVIOR.EXE generates CRC-32 hashes: about 20 MB/sec vs 1.67 MB/sec. But if VHD authors use the /a option or use the manifest's skip flag for large files that obviously won't change (e.g., DOOM.WAD), then I don't think speed will really be an issue. Most of the files that actually need to be checked are small.
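For a sense of scale, the arithmetic behind those figures can be written out in Python. The two throughput constants are the numbers quoted above; the 10 MB file size in the usage note is a made-up example, not a measurement, and the function names are hypothetical.

```python
LINUX_MD5_MB_PER_SEC = 20.0    # md5sum on MiSTer Linux, as quoted above
SAVIOR_CRC_MB_PER_SEC = 1.67   # SAVIOR.EXE CRC-32 on AO486, as quoted above

def speed_ratio():
    """How many times faster the Linux-side hash is than the DOS-side one."""
    return LINUX_MD5_MB_PER_SEC / SAVIOR_CRC_MB_PER_SEC

def hash_seconds(size_mb, mb_per_sec=SAVIOR_CRC_MB_PER_SEC):
    """Estimated time to hash size_mb megabytes at the given throughput."""
    return size_mb / mb_per_sec
```

`speed_ratio()` comes out at just under 12, and `hash_seconds(10)` suggests that a hypothetical 10 MB file would take about 6 seconds on the DOS side, which is why skipping large static files matters.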

mrchrister wrote: Mon Apr 29, 2024 1:12 am

Also, having a limitation of 3200 files is not ideal, but I'm not sure how often we would hit that limit.

I did build a version of SAVIOR.EXE that ran in protected mode and had a much larger hash table for file entries. It wasn't difficult, and it seemed to work, but it made the executable about twice as big. I guess that might be fine for some cases, so maybe there could be two versions of the program for different sizes of games. But I doubt any DOS games have over a thousand files? They tend to have a few dozen or less. CD images don't need to be checked, of course, so only floppy or HDD images would count.

mrchrister wrote: Mon Apr 29, 2024 1:12 am

If we backed up saves on the Linux side of MiSTer, we might be able to streamline the process more.

Are you talking about a script that does a backup/restore on all the games at once, or would you make a script for each game that runs the backup/restore and the game?

I did consider doing all of this in Linux... But I'm not too familiar with the MiSTer version of Linux, and I didn't see a way to mount VHDs. (It doesn't come with guestfish/guestmount, for example.) If that's easy to fix, then maybe it would make sense to do it that way.

There might be pros and cons to each approach. Running the backups in DOS is slower, but it's all self-contained, and accessing the filesystem is easy because you're doing it from the inside. The backup/restore happens every time, no matter how you start the game. SAVIOR.EXE even works in DOSBox if you mount a directory at the drive letter that SAVIOR.EXE is configured to use. (Though AO486 VHDs don't work in all versions of DOSBox.)

On the other hand, you have to add an extra program and some batch file lines to every game you want to use it with, and MISTERFS.EXE doesn't work in older versions of DOS. I do like the idea of using period-appropriate DOS versions, and I saw that Bas is working on an image builder that will allow this. It'd be a shame to have a backup solution that only worked on newer games.

Maybe this is worth exploring some more.

Bas wrote: Mon Apr 29, 2024 2:38 pm

Dumb question maybe, but how is this conceptually different from using an already existing backup solution from the days of DOS? I love the idea at first glance, but it does not look all that different from existing backup tools to me. I must admit that I only just skimmed your design though.

No, that's a fair question. Halfway through this project, I remembered that the archive bit existed, and I thought, "Could my whole program be replaced by XCOPY?" :lol: But there are a couple of features that make SAVIOR.EXE special. First, a generic DOS backup program would back up all files with the archive bit set on its first run, so if VHD authors weren't careful to clear every file's archive bit, it would back up unnecessary files. Second, the way SAVIOR.EXE keeps track of a backed-up file's original CRC-32 hash means it can be smarter about restoring files. Sure, you could just always ask the user if they want to overwrite while restoring files, but SAVIOR.EXE can avoid asking when it knows that the backed-up file doesn't conflict with the file on the VHD.

Bas wrote: Mon Apr 29, 2024 2:38 pm

A potential orange flag I'm seeing is relying on accuracy of timekeeping in DOS, which may get you bitten by Y2K or inaccurate timekeeping due to wonky host system date/time. Having SAVIOR do its own bookkeeping for the order-in-time of files could maybe fix this. You don't need to know the exact time when a file was written, you just need to know which one was there first... but that would then take X amount of time at first boot of a VHD.

My program doesn't rely on time at all; it uses CRC-32 hashes (and optionally the archive bit) to figure out which files have and haven't changed.

Your "which one was there first" idea sounds interesting, but I'm not sure I understand. How would you determine which file was newer without reliable timestamps?

Bas wrote: Mon Apr 29, 2024 2:38 pm

I'm planning a feature for DOSContainer that would push/pull deltas (that get zipped) from the VHD like a sky fairy from Linux rather than interfere with the DOS environment itself. That's still pretty far off though, as I'm still working on an accurate self-contained FAT12/16 implementation to enable this at all.

DOSContainer seems cool! I've been meaning to check it out.

mrchrister

ThumbKnock wrote: Mon Apr 29, 2024 5:27 pm

Are you talking about a script that does a backup/restore on all the games at once, or would you make a script for each game that runs the backup/restore and the game?

I did consider doing all of this in Linux... But I'm not too familiar with the MiSTer version of Linux, and I didn't see a way to mount VHDs. (It doesn't come with guestfish/guestmount, for example.) If that's easy to fix, then maybe it would make sense to do it that way.

I was thinking all games at once, yes.
I haven't looked at it in detail, but I recall that flynnsbit's Top300 updater modifies VHDs directly from Linux:
https://raw.githubusercontent.com/flynn ... 00_Pack.sh

Update: I believe this is the actual script: https://github.com/flynnsbit/Top300_upd ... updater.sh

Bas

Mtools should work on MiSTer.

Bas

Oh, and another thing. As long as you don't support long filenames or DR-DOS, there are lots of unused bytes in the directory entries FAT uses to store file metadata. You could try to pick a byte there to flag as your very own attribute, or even fit a 32-bit timestamp in there. MS and IBM DOS will be none the wiser until you hit version 7.
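As a sketch of that idea in Python: a classic FAT12/16 directory entry is 32 bytes, with 10 reserved bytes at offsets 12 to 21 between the attribute byte and the time field. The code below stores a 32-bit value in the first four of those bytes. The choice of offset 12 is arbitrary and hypothetical, as are the names `set_tag` and `get_tag`; this works on a raw entry held in memory and does not touch a real disk image.

```python
import struct

# name(8) ext(3) attr(1) reserved(10) time(2) date(2) cluster(2) size(4)
DIR_ENTRY = struct.Struct("<8s3sB10sHHHI")
TAG_OFFSET = 12  # hypothetical choice: first 4 of the 10 reserved bytes

def set_tag(entry, value):
    """Return a copy of a 32-byte directory entry with a 32-bit tag stored."""
    if len(entry) != DIR_ENTRY.size:
        raise ValueError("a FAT directory entry is 32 bytes")
    return entry[:TAG_OFFSET] + struct.pack("<I", value) + entry[TAG_OFFSET + 4:]

def get_tag(entry):
    """Read the 32-bit tag back out of the reserved bytes."""
    return struct.unpack_from("<I", entry, TAG_OFFSET)[0]
```

The name, attribute, time, date, cluster, and size fields are untouched, so the entry still unpacks normally with `DIR_ENTRY.unpack`.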
