TL;DR: Valve could save the world about two petabytes of storage just by deduplicating the contents of their localization/translation files, and it gets worse from there.
I've recently become annoyed as I've realized just how bloated many modern games have become. This was a moderate problem 5-10 years ago, but it has gotten completely out of control since then. As I was making a short list of games I've recently considered installing but skipped due to their size (4GB for a top-down low-poly low-texture platformer? 30GB for an RTS? 83GB for an MMORPG?) I found myself installing a 117MB update for Steam itself. In the software-bloat mindset, this prompted an immediate "wait, what?", which led to this post.
First, I headed over to `/Applications/Steam.app` (yes, I am on a Mac, but [almost?] everything below is platform-agnostic except for the paths). Here we have a total of 4.6MB, made up of 3.3MB for the binary, 1.1MB for a crash reporting library, and 0.3MB for translations. I'm not going deeper into the binary or library; let's say those are reasonably sized and necessary/useful. That brings us to the translations. Translations are great! I'm definitely not saying they shouldn't be there, or that they should be less thorough. However, I'm going to be specific about this problem, because it becomes more important later. The English strings appear number_of_languages times. Do we really need 26 copies of every string to be translated, including:
%appname% has changed where it stores game content from '~/Documents/Steam Content' to '~/Library/Application Support/Steam/SteamApps'. You have game files in the original location, and Steam was not able to move the files because files already exist at the new location. You may need to move the files manually, or delete the old files and download your games again. Continue anyway?
The labels for the strings appear number_of_languages*2-1 times, making 51 copies of "SteamBootstrapper_InsufficientDiskSpace". Finally, "[english]" appears a whopping 900 times, once for every string in every non-English language. A simple .tar.gz of this folder cuts the size by 92%. Assuming there's some good reason the files aren't compressed, the content could still be deduplicated and the necessary duplication handled in whatever code loads them, which might be a 60% reduction in size. I am hopeful that all of this is compressed for the download, but that doesn't solve the problem of it wasting space on disk, which gets more important later.
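For anyone who wants to put a number on that duplication, here is a minimal sketch of one way to measure it. It assumes each entry in a translation file is a quoted pair on one line (`"Label" "Text"`, with `"[english]Label" "English text"` repeated in the non-English files); that layout, the `duplication_report` name, and the sample file names are assumptions for illustration, not something taken from Valve's loader.

```python
import re
from collections import Counter

# Assumed line shape (not confirmed here): "Label" "Text", where non-English
# files also carry "[english]Label" "English text" for every string.
PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"[ \t]+"((?:[^"\\]|\\.)*)"')


def duplication_report(files):
    """files maps a file name to its decoded text.

    Returns (total_bytes, redundant_bytes) for the string values, where
    redundant_bytes counts every copy of a value beyond its first.
    """
    counts = Counter()
    for text in files.values():
        for _label, value in PAIR.findall(text):
            counts[value] += 1
    total = sum(len(v.encode("utf-8")) * n for v, n in counts.items())
    redundant = sum(len(v.encode("utf-8")) * (n - 1) for v, n in counts.items())
    return total, redundant


if __name__ == "__main__":
    # Hypothetical two-language sample, not real Steam content.
    sample = {
        "example_english.txt": '"Greeting" "Hello"\n',
        "example_french.txt": '"Greeting" "Bonjour"\n"[english]Greeting" "Hello"\n',
    }
    print(duplication_report(sample))  # (17, 5)
```

This only counts duplicated values; the repeated labels and "[english]" prefixes would add to the redundant total.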
Next, I went to `~/Library/Application Support/Steam/`. I realized there's a lot of user-specific stuff in there, so I created a new user to launch Steam once and get a clean slate. That took 202MB of downloads and left 757MB used on disk. I will mention here that installing apps and games for just one user, and requiring the installation effort and disk space to be duplicated for a second user, is a terrible paradigm when it's not explicit and intentional, but that's a different rant for a different post. Let's dive into that 757MB, and I'll skip any parts that I have no objections to.
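A per-directory breakdown like the one that follows can be reproduced with `du`, or with a few lines of Python. This is a minimal sketch, not the method used for the numbers in this post; `size_by_top_level` is a made-up helper name, and it sums apparent file sizes rather than allocated disk blocks, so its totals may not match `du` exactly.

```python
import os


def size_by_top_level(root):
    """Return {entry_name: bytes} for each direct child of root.

    Sums apparent file sizes (st_size) without following symlinks, so the
    totals can differ from the allocated-block figures that `du` reports.
    """
    totals = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    total += os.lstat(os.path.join(dirpath, name)).st_size
            totals[entry.name] = total
        else:
            totals[entry.name] = entry.stat(follow_symlinks=False).st_size
    return totals


if __name__ == "__main__":
    root = os.path.expanduser("~/Library/Application Support/Steam")
    if os.path.isdir(root):
        sizes = size_by_top_level(root)
        # Largest entries first, in MB.
        for name, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
            print(f"{size / 1e6:8.1f}MB  {name}")
```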
168MB for `Steam/Steam.AppBundle/Steam/Contents/MacOS/package/` which contains a bunch of archives that *seem* to match up to extracted content in the other directories. Does `package/tenfoot_fonts_all.zip.vz.SOMEHASH` contain a copy of `tenfoot/resource/fonts/*`? If the Steam client can use these files to repair itself in case of damaged or lost files, that's great. If it can't, then why are they still there?
147MB for Chromium Embedded Framework. I can't blame Valve here; the fault for not sharing libraries like this with other applications and keeping them in a central place falls on the shoulders of Apple and Microsoft and the environments they have developed. I am potentially giving Google or Apple a bit of side eye here for 14MB of translations, 200-500kB per language, depending on who is to blame for the contents of the *.lproj/locale.pak files, and how much space is wasted in there.
137MB for the tenfoot (Big Picture) interface:
63MB for `tenfoot/images/`. I just deleted 500 words of detail about this directory upon realizing I hadn't covered even 10% of it. In short: TGAs that should be PNGs, PNGs that should be JPEGs, including some that appear to already have JPEG artifacts, PNGs that should be vector graphics, and so on.
31MB for `tenfoot/localization/`, which has the same data duplication problems as the translations in Steam.app, writ large. 177856 copies of "[english]" and 26 copies of:
Intended to be used with a dual-stage trigger setup, Hip Fire allows a quick pull of the trigger to engage the click without engaging the threshold. A slower pull or hover will engage the threshold action. This allows for actions such as iron-sights to be set on the threshold and fire on the click, while still allowing a quick pull to click to fire without entering iron-sights. Additionally, once the click has been hit, the threshold won't be engaged until the trigger has been released outside of the threshold range, allowing it to be primed for additional clicks. Relaxed Hip Fire mode is a larger window before engaging the threshold, allowing a slower pull of the trigger to avoid it. This means the Threshold action will be slightly less responsive when intentionally trying to engage.
11MB for `tenfoot/sounds/`, most of which is ambient background MP3s, but also a bunch of WAV sound effects despite the evident support for MP3s.
That's all for Big Picture; back to the top-level Steam contents.
36MB for translations in `public/`, again with all the duplication in the previously mentioned translation/localization files.
20MB for `graphics/`, which is full of TGA files that should be PNGs and/or procedurally generated, most notably the 4.2MB `music_details_mask.tga` that is just a checkerboard on top of a radial gradient, and `clienttexture*.tga`, which are just gradients on a flat background.
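To make the "procedurally generated" suggestion concrete, here is a minimal sketch that builds a checkerboard over a radial gradient in a few lines and then compresses the raw pixels with zlib (DEFLATE, the same compression method PNG uses). It does not reproduce the actual mask: the 256x256 size, the 16-pixel cells, the grayscale format, and the darkening rule are all made up for illustration, and the comparison assumes the TGA stores its pixels uncompressed, which I have not verified for these files.

```python
import math
import zlib


def checkerboard_gradient(size=256, cell=16):
    """Build raw 8-bit grayscale pixels: a radial gradient with a checkerboard on top."""
    center = (size - 1) / 2
    pixels = bytearray()
    for y in range(size):
        for x in range(size):
            dist = math.hypot(x - center, y - center)
            # Brightest in the middle, fading to black at the edge of the inscribed circle.
            value = max(0, 255 - int(dist * 255 / center))
            if (x // cell + y // cell) % 2:
                value //= 2  # darken alternating cells
            pixels.append(value)
    return bytes(pixels)


raw = checkerboard_gradient()
packed = zlib.compress(raw, 9)
print(f"raw: {len(raw)} bytes, deflated: {len(packed)} bytes")
```

The function body is the whole "asset", so shipping code like this instead of the pixels is the extreme end of the savings; a real PNG encoder would also apply per-row filters before DEFLATE, which tends to help further on gradients.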
The rest of the smaller directories contain many more examples of poor image format selection, duplicated translated strings, etc, which I won't further enumerate here. If Valve ever decides to address the problems above, they will probably solve the smaller instances of the same problems as a side effect.
Having gotten this far, I still haven't answered my original "wait, what?". Almost everything I've just described seems mostly static: images and sounds and text files and third party libraries. The binaries and first party libraries are actually pretty small. I can't point to 117MB worth of content that I expect to have changed in the update today, let alone the ~300MB that I would expect from the compression ratio evident in the original client download.
http://store.steampowered.com/news/38412/ has the changelog, where I see a lot of what should be code/library/driver changes, none of which touch the pieces of the client that seem to take up tens to hundreds of MB of space each. Maybe I'll find some insight into the update size. Maybe I'll go do this same teardown on those three games, or at least two, seeing as how I don't even have 83GB of free disk space right now.