Dan Hockstein and Mari Allison are 2022 Junior Fellows in the Digital Collections Management and Services Division (DCMS) working under the mentorship of Kate Murray.
Over the course of our Junior Fellowship this summer, we have focused on a variety of streams of work around the Library of Congress’ Sustainability of Digital Formats website. The site contains an extensive list of commonly used file formats, wrappers, and encodings. There are thousands of these created by legacy equipment and software that present challenges in identification, preservation, and use. Among these is the file format produced by the now-defunct word processing platform WordStar.
WordStar’s History
The WordStar format is the default proprietary plain-text format for a word processing platform of the same name. The initial version of WordStar was first published by MicroPro International for Digital Research, Inc’s CP/M operating system. Subsequent releases of WordStar’s early versions were ports for microcomputers and their operating systems – for example, Tandy’s LDOS-5, the Epson PX-8, the Osborne 1, and the Apple ][.

Microsoft’s MS-DOS operating system became a platform for WordStar’s wide adoption, beginning with version 3.0 of the program. At this point, MicroPro International also began to splinter into various companies through staffing changes, some of which created direct competition with WordStar. Through our research, it became clear that the program changed hands several times, intersected with and borrowed from other pieces of software, and followed a complicated pathway that produced several kinds of output files that could all be called “WordStar” files. As a result, the structure of WordStar files did not follow a streamlined, linear trajectory of updates and versioning either – changes were fairly drastic. In the first few versions of WordStar, the 8th bit of ASCII characters, usually reserved to extend the character set, was instead used to store print and formatting information. This limited cross-compatibility with other word processors and was later changed with the release of WordStar 5.0.
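As a rough illustration of the 8th-bit behavior described above, the sketch below clears the high bit of every byte to recover readable 7-bit ASCII from an early WordStar document. The function name and the sample bytes are made up for illustration. It assumes a pre-5.0 file in which the high bit carries only formatting information, and it simply discards that information rather than interpreting it.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear the 8th (high) bit of every byte, leaving 7-bit ASCII."""
    return bytes(b & 0x7F for b in data)


# Made-up sample in which two bytes have the high bit set:
# 0xEF is "o" (0x6F) plus the high bit, 0xE4 is "d" (0x64) plus the high bit.
sample = b"Hell\xef worl\xe4"
print(strip_high_bit(sample).decode("ascii"))
```

This is only a way to make the text legible for inspection; a preservation workflow would likely keep the original bytes alongside any cleaned copy.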
The introduction of additional word processing programs, such as Microsoft Word and Apache OpenOffice, reduced WordStar’s market share. The software is now hosted and available for paid download, but is no longer developed or maintained by its original owners.
Quite a few WordStar files exist within the Library’s collections, and their unique properties make them more complex than similar or more modern text formats. In order to document these files and assist with their preservation and access, we have been tasked with creating a Format Description Document, or FDD, for the WordStar file format. This research is still in progress, but will be published at WordStar File Format Family.
The WordStar Community
One of the most fun aspects of doing research on WordStar was seeing the passionate community of writers who still swear by the software. For example, George R.R. Martin still exclusively uses WordStar 4.0 for DOS to write A Song of Ice and Fire, the book series that inspired the Game of Thrones TV series. Hobbyists keep discussion alive in online forums, post guides on how to set up a DOS machine or emulator to run WordStar, and modify Microsoft Word to include the same key command shortcuts as WordStar. Through reading these posts, we came to understand what people love about the WordStar application. It was the first word processor that was able to render the document on screen, formatted almost exactly as it would appear when printed. The efficient command keys, used to navigate menus and perform operations such as Print or Save, are another favorite application feature, vividly explained in this post about a man teaching his 9-year-old daughter how to use them. The command keys are distinct from dot commands, which aren’t just application features but are present in the actual text file. Dot commands are visible on the screen during the editing process but become formatting information when the document is rendered into a printed page.
While it was wonderful to explore WordStar’s online community, the unofficial nature of the information, often taking the form of blog posts on someone’s personal website, somewhat complicated our research. When writing FDDs, the Library prefers to use primary sources, and at times, we were hesitant about linking out to personal sites. In order to verify information, we tried to cross-reference several personal websites and news articles.
Identifying WordStar Files
Utilizing a modern graphical user interface to access, organize, and name files is quite different from creating content on the early microcomputers that many WordStar files were created on. While these systems did have directory structure, it was not always represented visually by folders, and operating systems did not necessarily associate filetypes with applications. Generally, much more was left to the end user.
Because of this, identifying the WordStar files in our collection was an unexpected challenge. By modern conventions, most people use file extensions, or the 2-4 characters following the “.” in a file name, as a high-level but imprecise way to identify a format at a glance - “.docx” for Microsoft Word documents, “.mp3” for MP3 files, or “.csv” for Comma Separated Values. Some WordStar files may follow similar standards with .ws and optionally .ws2, .ws3, etc. depending on the version of WordStar, but other WordStar files may have very different extensions. The WordStar Reference Manual from 1983 (p. 1-12) states: “The most useful file name is one that helps you remember the file contents… For example, you might add .LET after each letter file name, .REP after each report or .912 to indicate that September 12 was the last editing session.” Because creators were free to follow whatever convention they liked, it is difficult to verify at a glance that WordStar was the software used to create a given file. At an institution with many word processing documents dating back to the ’70s, this poses an issue!
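To make the limits of extension-based identification concrete, here is a minimal sketch that flags file names ending in .ws, .ws2, .ws3, and so on. The function name and the sample file names are hypothetical, and the pattern only encodes the convention described above; it is not a rule that WordStar itself enforced.

```python
import os
import re

# Matches ".ws" optionally followed by version digits (".ws2", ".ws3", ...).
WS_EXTENSION = re.compile(r"\.ws\d*", re.IGNORECASE)


def has_wordstar_extension(filename: str) -> bool:
    """Return True if the extension looks like .ws, .ws2, .ws3, etc."""
    _, ext = os.path.splitext(filename)
    return WS_EXTENSION.fullmatch(ext) is not None


# Hypothetical file names, including the manual's .LET/.REP/.912 style.
for name in ["REPORT.WS", "chapter1.ws3", "SMITH.LET", "BUDGET.REP", "NOTES.912"]:
    print(name, has_wordstar_extension(name))
```

Files named in the style the manual suggests are missed entirely by a check like this, which is why the extension can only ever be a first hint.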
Examining a file for signature information is a more consistent way to identify formats. A signature is a piece of embedded metadata used to identify a filetype, often found in the header or footer of the file.

Many of WordStar’s different versions have their own unique file signature, which greatly increases the complexity of trying to identify any given file. Signature information for WordStar 5.0, 6.0, 7.0 and 2000 is currently available in the UK National Archives’ PRONOM registry, but signatures for other versions have not yet been identified.
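The general idea of a signature check can be sketched in a few lines. The example below deliberately uses the well-known "%PDF-" signature found at the start of PDF files as a stand-in, because this post does not reproduce the WordStar byte sequences; those would need to be taken from a registry such as PRONOM. The function name and sample data are illustrative only.

```python
def signature_at(data: bytes, signature: bytes, offset: int = 0) -> bool:
    """Return True if `signature` appears in `data` at byte `offset`."""
    return data[offset:offset + len(signature)] == signature


# Stand-in signature: PDF files begin with the bytes "%PDF-".
# A real WordStar check would substitute a registry-supplied sequence.
PDF_SIGNATURE = b"%PDF-"

print(signature_at(b"%PDF-1.7\n...", PDF_SIGNATURE))
print(signature_at(b"Dear Sir,\r\n", PDF_SIGNATURE))
```

Because each WordStar version may need its own sequence and offset, a practical tool would loop over a table of such entries rather than test a single one.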
We are also looking into unique features of the format that could serve as identifiers. One example is symmetrical sequences, which first appeared in WordStar 5.0. Symmetrical sequences serve as tags that enclose extra information like font color and footnotes. They follow a defined byte structure, and the opening and closing tags are both marked by a dedicated control character, 1DH. Theoretically, this distinguishing feature can be a way to identify files from WordStar 5.0 and above, but it’s not as consistent as a file signature, nor is it as easy to automate checking.
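A very rough version of this check might look like the sketch below, which pairs up successive 1DH bytes and treats the presence of at least one pair as a weak hint. All names here are hypothetical. The sketch assumes the enclosed data never contains 1DH itself and does not parse the internal byte structure of a sequence, so it can produce false positives on other binary files.

```python
SYM_MARKER = 0x1D  # control character that opens and closes a symmetrical sequence


def find_marker_pairs(data: bytes) -> list[tuple[int, int]]:
    """Pair successive 0x1D bytes as (open_offset, close_offset)."""
    offsets = [i for i, b in enumerate(data) if b == SYM_MARKER]
    # Pair the 1st marker with the 2nd, the 3rd with the 4th, and so on;
    # an unmatched trailing marker is ignored.
    return list(zip(offsets[0::2], offsets[1::2]))


def may_be_wordstar_5_plus(data: bytes) -> bool:
    """Weak hint only: at least one pair of 0x1D markers is present."""
    return len(find_marker_pairs(data)) > 0


# Made-up sample: ordinary text with one marker-delimited span in the middle.
sample = b"Some text\x1dXXXX\x1d more text"
print(find_marker_pairs(sample), may_be_wordstar_5_plus(sample))
```

A stricter check would validate the bytes between each pair against the documented sequence layout, which is part of what makes this harder to automate than a signature match.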
When researching WordStar, we had to maintain a balance between technical and contextual information. To create a holistic format description, it’s important to describe the history and adoption of WordStar’s many versions while also providing data about byte sequences and ASCII encodings. Combining our research skills and technical knowledge to uncover more about WordStar was an extremely rewarding process. We uncovered and better understood a history of early word processing documentation, unearthed some great graphic design, and created new resources to sustain digital formats into the future!
Final Reflections
Mari: I’ve learned a lot about conducting file format research through this whole process. It’s been really fun to dive down some of the rabbit holes, from studying the full format specification and picking out useful identification information to reading blog posts and interviews from science fiction and fantasy authors. Sometimes it’s difficult to know if I’m going too deep on esoteric information, but I’ve learned through interactions with the greater file format and digital preservation community that every detail is valued. It feels incredible to contribute to a resource that will be used both within the Library, and also by outside researchers.
Dan: It’s been a great experience to discover the inner workings of what makes a file, all while contributing to an important body of knowledge for the digital preservation community. Knowing that our research may make identification and access of WordStar files easier for future researchers, and being able to contribute to Library resources, has allowed me to build new skills while also making a lasting impact as a Fellow. It also made me even more interested in early computing and technology.
Comments (14)
I was an early user of Word*Star and the famous cursor diamond using my K-PRO dual-floppy luggable computer. I even earned some extra cash typing papers in college (1980s) using that setup and my 300baud modem. Great memories!
You may want to look at WordTsar http://wordtsar.ca
Terrific overview of the topic and the issues faced by students (and preservers!) of content in digital form. Thank you!! for the helpful discussion. As an old duffer, it is reassuring to see the next generation engage these important matters. It was not the subject for this blog but many readers will wonder, “gee, WordStar files are carriers/containers for texts — and the text is the important asset, not the formatting — so, for future researchers, ought an archive migrate that text forward into a carrier/container format that is judged to be good for, um, the next 25 years?” (Of course, an archive might want to “freeze” and retain the WordStar file just in case the migration was imperfect.) Alas, that question is a bit of a puzzle too.
Dan and I really appreciate all the comments so far! We love hearing about personal experiences with WordStar and community projects.
We also welcome discussion about digital preservation challenges! There are a lot of different considerations that will vary between individual archives.
I loved WordStar. I am not a computer buff. I used WordStar when I first started using word processing software. I was writing text that included translations from other languages and was able to format the documents myself using diacritics and other special characters that aided in translation and transposition.
I could write small lines of code to support these efforts and see what it did in the text. I held out with WordStar as long as I could until Microsoft’s Word programs made it almost impossible to continue with WordStar.
Now, not being a computer techie, I must adapt my writing and need for innovation to Microsoft’s increasing overriding of individual innovations in the software. I do not have the time or expertise to try to override Microsoft’s interpretation of a user’s need to create formatting that meets my needs.
I am happy to see your efforts and wish you much success.
Thank you for this post! This is such a timely one for me as I am working with a collection full of these files. Unfortunately the files don’t have typical WS magic numbers and all of the file extensions have been changed (and many of them are sequences of numbers) but some of the header information has retained the version of WS. Typical conversion scripts for WS to Word haven’t worked, which I assume is because these files lack the typical features of a WordStar file. Trying to emulate these to see if that works better. Will update with my result!
I am the man who taught his nine-year-old daughter a few WordStar command keystrokes. Thank you for mentioning my post.
WordStar was my sole writing tool from 1982 to 1990, during which time I worked first as a freelance translator and then as a full-time technical writer. Every writing program that I used after that, I customized to make WordStar-like, but I kept WordStar around. My daughter turned nine in 1998.
By then I was also using open-source software. A WordStar-like open-source text editor is my main writing tool today.
Having done a good number of hex dumps over the years, I can imagine the difficulty of programmatically identifying WordStar “document” files from versions earlier than WordStar 5.0. It can be argued that such files are not strictly “plain text,” since (as you mention) eight-bit values are used extensively in them.
I like to imagine that I might be able to help with the creation of an FDD. I am certainly willing to try.
Paradoxically, I became a die-hard WordStar fan because I was dissatisfied with WordStar’s command keystroke assignments. In learning how to change them, I learned why WordStar relied so much on Ctrl-A through Ctrl-Z. I changed only a few keystrokes, and assumed that some other product would come along that used all the same keystrokes in ways more to my liking.
Not only did I not see anything that came close, those keystrokes were all or mostly dead in everything I saw until I started using open-source software.
[Quote] It was the first word processor that was able to render the document on screen, formatted almost exactly as it would appear when printed. [Unquote]
As WordStar was my introduction to computers, I took that aspect for granted! The as-you-work on-screen keystroke help, the dynamic menus (the exact functional equivalent of drop-downs, before drop-downs existed), and the typing-zone-only keyboard command set are what impressed me most from the beginning.
To this day, I have seen nothing that comes close in all those regards.
I love(d) WordStar!! It was the first program (version 3.3) that I paid for, from the little that I made as an enlisted USAF member in the late 1980s!! (Well, after I upgraded my DOS to 2.0 on my Compaq Portable). As a touch-typist, it was a perfect program!!!!!
WordStar was the first word processor I used back in the days of DOS 2.10. I was the first person in my high school to write their papers on a computer and a word processor.
My English teacher actually tried to ding my grade because the letters were “mal-formed” by the 9-pin dot-matrix printer, and I had to appeal to the administration to have that reversed. The teacher couldn’t argue her case too hard – the school was sending notes home that were printed on the same model printer!
I loved WordStar! I graduated to it from a dedicated hardware word processor and used it to type my PhD dissertation. As a touch typist, I had a very fast typing speed (it was easy to overrun the monitor cache). I was sad to be forced into the mouse world of Microsoft Word, which caused my typing speed to crash to a crawl. Moving into LaTeX has provided a bit of deja vu for this geezer.
I started with WordStar in 1984. I was a beta tester for WordStar 7. When our firm switched to WordPerfect 5.1, I wrote a set of macros that mimicked WordStar (there were even pull down menus like WordStar). I then moved in-house and started using Word. I wasn’t able to mimic the drop-down menu structure using Word macros, but to this day I use the WordStar cursor diamond and other control-key commands to work in Word. I still believe the cursor diamond and using the keyboard is the most efficient way to type.
WordStar-created files often have “^Z”, ‘1A’x, at the end of the document to fill out the sector. This is true in “non-document” mode, used to create files intended for a compiler, such as Fortran, Assembly, C, and others. WordStar allowed you to customize handling of Tab settings and other parameters for writing in specific coding languages in non-document mode. I still use WordStar for writing code; it is very fast for going through thousands of lines of code and opening split-screen for comparing and copying code.
I started using WordStar in the early 1980s and still use it to this day, running it inside of VDOS on my PC. The WS control diamond and place markers provide me far higher typing and editing speeds than in Word, which I also use daily. I copy/paste text back and forth as necessary.
I use Word macros that emulate some WS commands, but they’re imperfect and don’t create the same speedy editing environment I have in WS.
Also – WS lets me run simple math calculations within the text, on the fly. Very handy. Word never did that.
As a friend has said, “WordStar was a Lear Jet. Word is a locomotive.”
I started using WS in the School of Journalism at Middle Tennessee State University in 1983 when we converted our IBM Selectric newswriting labs to Northstar computers using the CP/M operating system. We loaded WS from the command line, and students had to bring their own 5.25″ floppy disks.
It took three hours (sometimes more) to teach the students how to format their disks, load the WS program and start to write stories. I still have a copy of the user manual I created (“Turn the computer on by pushing the red button at the top right side of the computer.” “DO NOT hit the Enter key at the end of a line, except when starting a new paragraph.”)
The news labs have long-ago been converted to “that other program which shall not be named”, but WS 7.0D, along with still-operating OmniKey keyboards, are alive and well in my office and at home.