In the late '70s floppy drives were too expensive for most personal computer owners, so we stored our programs and data on audio cassettes like this one:
Thirty years later, in a nostalgic streak, I unearthed my old cassettes (including the “D” one above) and tried to read them with a modern TRS-80 emulator. Many didn't read at all, and after inspecting the audio, I found various kinds of damage, such as lost pulses, overlaid hum, and muted areas. The emulators try to faithfully reproduce the hardware's behavior, which is fine for clean cassettes, but to read these damaged ones I had to take a different approach.
The audio format is a sequence of pulses:
I've documented the format here.
I first wrote a Java program to pick through the audio and try to guess where the pulses were. This worked fine, but whenever the program failed to properly detect a pulse, I had no easy way to visualize how it had failed.
Not for the first time, I remembered that I should default to writing a web app. I rewrote it all in TypeScript and included interactive visualizations, a TRS-80 emulator (to run the decoded program), a Basic decoder, and a disassembler.
One of the fun features is that the selection is synchronized across the various views. You can click on a byte in the hexdump, a keyword in the Basic program, a line in the disassembly, or a pulse in the audio view, and the other views will show the corresponding data. This is useful when the Basic program gets corrupted at some point. You can click the first corrupted keyword to jump to that section of the audio.
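To illustrate the idea, here is a simplified sketch of how views can share one selection. It is not the actual code from the project: `SharedSelection`, `ByteOrigin`, and `byteAtFrame` are made-up names, and it assumes the decoder records, for every decoded byte, the range of audio frames that byte came from.

```typescript
// Hypothetical names throughout; this is an illustration, not the project's API.

/** Where one decoded byte came from in the audio, in sample frames (assumed bookkeeping). */
interface ByteOrigin {
    firstFrame: number;
    lastFrame: number;
}

type SelectionListener = (byteIndex: number | undefined) => void;

/** Holds the selected byte index and notifies every subscribed view when it changes. */
class SharedSelection {
    private byteIndex: number | undefined = undefined;
    private readonly listeners: SelectionListener[] = [];

    /** Register a view (hexdump, Basic listing, disassembly, audio). */
    subscribe(listener: SelectionListener): void {
        this.listeners.push(listener);
    }

    /** Called by whichever view the user clicked in. */
    select(byteIndex: number | undefined): void {
        if (byteIndex === this.byteIndex) {
            return; // Nothing changed, so skip redundant redraws.
        }
        this.byteIndex = byteIndex;
        for (const listener of this.listeners) {
            listener(byteIndex);
        }
    }
}

/** Translate a click in the audio view back to the byte covering that frame, if any. */
function byteAtFrame(origins: ByteOrigin[], frame: number): number | undefined {
    const index = origins.findIndex(o => frame >= o.firstFrame && frame <= o.lastFrame);
    return index === -1 ? undefined : index;
}
```

Each view subscribes once and scrolls to or highlights its own rendering of the selected byte. Keying everything on the byte index is a design choice made for this sketch: it gives the audio, hexdump, Basic, and disassembly views a common unit to agree on.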
The program aggressively tries to decode the correct pulses, filtering out buzzing, making various educated guesses, adapting to changes in volume, and searching around if necessary. I'm pretty happy with it! It has really only failed on audio that even a human looking at the waveform can't decipher.
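As one example of what filtering out a low-frequency buzz can look like, here is a textbook first-order high-pass filter. This is a generic technique shown as a sketch, not necessarily what the decoder does; the function name `highPass` and the cutoff value in the usage note are assumptions chosen for illustration.

```typescript
/**
 * First-order (RC-style) high-pass filter in discrete form.
 * Slow-moving content such as mains hum is attenuated, while
 * short, sharp pulses pass through mostly intact.
 * Assumption: samples are signed 16-bit PCM; cutoffHz is illustrative.
 */
function highPass(samples: Int16Array, sampleRate: number, cutoffHz: number): Float64Array {
    const out = new Float64Array(samples.length);
    if (samples.length === 0) {
        return out;
    }
    const rc = 1 / (2 * Math.PI * cutoffHz); // filter time constant
    const dt = 1 / sampleRate;               // time between samples
    const alpha = rc / (rc + dt);
    out[0] = samples[0];
    for (let i = 1; i < samples.length; i++) {
        // Keep the change between neighboring samples, decay everything else.
        out[i] = alpha * (out[i - 1] + samples[i] - samples[i - 1]);
    }
    return out;
}
```

Calling `highPass(samples, 44100, 500)` on a 44.1 kHz recording would strongly reduce a 60 Hz hum while leaving a one-sample spike at most of its original height. The right cutoff depends on the recording, so treat 500 Hz as a starting guess.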
The regression suite draws pulses with annotations:
and explains in English how it made each of its decoding decisions:
Looked for pulse at 891 and found it at 892, which is within the search radius of 26. Range 20,640 is greater than pulse threshold 3496, start -130 < 12,234, and end 145 < 12,234. Range 375 is less than or equal to noise threshold 1748.
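A rough sketch of the kind of logic behind such an explanation follows. It is a simplification with assumed names (`findPulse`, `PulseResult`), not the decoder's real code. It assumes "range" means the difference between the largest and smallest sample in the search window, and that the noise threshold is half the pulse threshold, which matches the 1748 and 3496 above. The "start" and "end" checks from the output are left out because their exact meaning isn't spelled out here.

```typescript
type PulseKind = "pulse" | "silence" | "ambiguous";

interface PulseResult {
    kind: PulseKind;
    frame: number;       // frame of the largest sample in the search window
    range: number;       // max minus min within the search window
    explanation: string; // English description of the decision
}

/** Look for a pulse near expectedFrame, searching searchRadius frames on either side. */
function findPulse(samples: Int16Array, expectedFrame: number,
                   searchRadius: number, pulseThreshold: number): PulseResult {
    // Assumption: noise threshold is half the pulse threshold.
    const noiseThreshold = pulseThreshold / 2;
    const begin = Math.max(0, expectedFrame - searchRadius);
    const end = Math.min(samples.length - 1, expectedFrame + searchRadius);

    let min = Infinity;
    let max = -Infinity;
    let frame = expectedFrame;
    for (let i = begin; i <= end; i++) {
        if (samples[i] > max) {
            max = samples[i];
            frame = i;
        }
        if (samples[i] < min) {
            min = samples[i];
        }
    }
    // An empty window (expected frame past the end of the audio) counts as silence.
    const range = begin <= end ? max - min : 0;

    if (range > pulseThreshold) {
        return { kind: "pulse", frame, range, explanation:
            `Looked for pulse at ${expectedFrame} and found it at ${frame}, ` +
            `which is within the search radius of ${searchRadius}. ` +
            `Range ${range} is greater than pulse threshold ${pulseThreshold}.` };
    }
    if (range <= noiseThreshold) {
        return { kind: "silence", frame, range, explanation:
            `Range ${range} is less than or equal to noise threshold ${noiseThreshold}.` };
    }
    return { kind: "ambiguous", frame, range, explanation:
        `Range ${range} is between noise threshold ${noiseThreshold} ` +
        `and pulse threshold ${pulseThreshold}.` };
}
```

Building the explanation string at the same moment as the decision keeps the two from drifting apart, and a regression suite can then compare or display the strings directly.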
Once decoded, the program can be exported in various formats, including a Basic listing and a CAS file, an industry-standard format that most emulators accept.
Try it now in your browser. You can click the “C-1-1” button to try the first program on my old “C” tape.
The source code is available on GitHub.