We shipped a bug in Starbrew Station that took us nearly a week to track down. A player would start a new game in save slot 2, and the game would silently load slot 1's data instead. No error. No crash. Just a fully furnished station that shouldn't exist yet, and a very confused player wondering why their "new game" had 40 hours of progress on it.
This is the story of a one-line bug that taught us a lot about state synchronization in save systems. If you are building a multi-slot save system in Godot -- or any engine, really -- we hope this saves you the same headache.
The Bug Report
The report came in through our Discord. A playtester had been running a long session on slot 1, returned to the main menu, and started a new game on slot 2. Instead of a fresh start, they got their slot 1 station -- same resources, same upgrades, same crew. Saving and reloading made it worse: now slot 2's file on disk contained slot 1's data, effectively cloning the save.
Our first instinct was to blame the save file I/O. Maybe FileAccess was writing to the wrong path, or our JSON serialization was pulling from a stale buffer. We added logging around every file read and write. The paths were correct. The data being written was correct. The bug was upstream of the file system entirely.
Tracking Down the Root Cause
Starbrew Station uses a UnifiedSaveService autoload that manages multiple save slots. The architecture is straightforward: a config file stores which slot is active, and the service reads that config on startup and whenever the active slot is queried. Here is a simplified version of the pattern we were using:
# UnifiedSaveService.gd (BUGGY VERSION)
var _active_slot: int = 1
var _config_loaded: bool = false
func set_active_slot(slot: int) -> void:
_active_slot = slot
_save_config_to_disk()
# BUG: we forgot to set _config_loaded = true here
func get_active_slot() -> int:
if not _config_loaded:
_load_config_from_disk() # reads the slot number from disk
_config_loaded = true
return _active_slot
func _load_config_from_disk() -> void:
var file = FileAccess.open("user://save_config.json", FileAccess.READ)
if file:
var data = JSON.parse_string(file.get_as_text())
_active_slot = data.get("active_slot", 1)
file.close()
func _save_config_to_disk() -> void:
var file = FileAccess.open("user://save_config.json", FileAccess.WRITE)
file.store_string(JSON.stringify({"active_slot": _active_slot}))
file.close()
Do you see the problem? When set_active_slot(2) is called, it correctly updates _active_slot to 2 in memory and writes that value to disk. But it never sets _config_loaded = true.
So the very next time anything calls get_active_slot(), the guard clause if not _config_loaded evaluates to true. The service dutifully reloads from disk. Now, you might think: "But we just wrote slot 2 to disk, so the reload should return 2." And in a simple test, it does.
The real problem emerged because of timing. In our game flow, the sequence looked like this:
- Player picks slot 2 from the menu.
set_active_slot(2)is called. - The scene transition begins. During scene teardown, an autosave triggers on the old scene.
- The autosave calls
get_active_slot(), finds_config_loaded == false, and reloads from disk. - The disk read returns 2 -- but because this is happening during the transition, there is a brief window where a different config write (from the teardown of the old session's managers) can race and overwrite the config back to 1.
- Now
_active_slotis 1 again in memory, and all subsequent operations target slot 1.
This was a classic state synchronization bug. The in-memory value and the on-disk value went out of sync, and a reload from disk at the wrong time pulled in stale data. The _config_loaded flag existed precisely to prevent redundant disk reads -- but because we never set it after an explicit slot change, it was acting as a trap door back to the old state.
The Second Bug: Manager State Bleed
While investigating, we found a second related issue. When a player returned to the main menu, our 50+ manager autoloads were not being reset. They still held references, counters, and cached data from the previous session. So even if the slot number was correct, starting a "new game" would inherit stale state from the old one.
In an idle game like Starbrew Station, managers track everything: resource counts, upgrade tiers, crew assignments, recipe unlocks, event timers. If even one manager retains old data across a session boundary, the player gets a corrupted experience.
The Fix
The fix for the first bug was a single line. When setting the active slot, we mark the config as loaded immediately, because we are the source of truth -- there is no need to reload what we just set:
# UnifiedSaveService.gd (FIXED VERSION)
func set_active_slot(slot: int) -> void:
_active_slot = slot
_config_loaded = true # we just set it; no need to reload from disk
_save_config_to_disk()
By setting _config_loaded = true atomically with the slot change, any subsequent call to get_active_slot() will return the in-memory value immediately, without touching the disk. The race condition disappears because there is no longer a window where a reload can be triggered.
For the second bug, we added an explicit reset step when the player returns to the main menu:
# MainMenu.gd
func _on_return_to_menu() -> void:
GameManager.reset_all_managers()
_show_slot_selection()
# GameManager.gd
func reset_all_managers() -> void:
ResourceManager.reset()
CrewManager.reset()
UpgradeManager.reset()
RecipeManager.reset()
EventManager.reset()
StationManager.reset()
# ... all other managers
print("[GameManager] All managers reset for new session")
Each manager's reset() method clears its internal state back to defaults -- zeroing counters, clearing dictionaries, disconnecting any session-specific signal connections. This ensures that when a new game starts, it truly starts fresh, regardless of what happened in the previous session.
Lessons Learned
This bug was small in terms of code but large in terms of impact. Here are the takeaways we are carrying forward:
- Treat setter/getter pairs as a contract. If your setter modifies a value, any related cache or guard flags must be updated in the same operation. A setter that changes state without updating the "is loaded" flag violates the contract that the getter relies on.
- In-memory state is the source of truth after a write. Once you have explicitly set a value, there is no reason to reload it from disk. The disk is a persistence layer, not an authority layer. If your code can re-read a stale value from disk after an explicit in-memory write, you have an implicit race condition.
- Session boundaries are state boundaries. Returning to a main menu is a full session reset. Every manager, every cache, every singleton that holds session-specific data needs to be wiped. We now treat "return to menu" as equivalent to "soft restart the game."
- Autosave is a hidden caller. Autosave systems run on timers or scene lifecycle hooks, which means they can fire during transitions when you least expect it. Any function that autosave calls must be safe to invoke at any point in the game lifecycle, including during teardown.
-
Log your slot reads, not just your writes. We had logging on every save and load, but we were not logging which slot the system thought it was targeting. Adding a log line to
get_active_slot()would have immediately revealed the stale read.
A Simple Test to Prevent Regression
We added a GDScript test that validates the setter/getter contract directly:
# test_save_service.gd
func test_set_active_slot_does_not_reload_from_disk() -> void:
var service = UnifiedSaveService.new()
service.set_active_slot(3)
# Corrupt the disk value to prove we are not reading it
var file = FileAccess.open("user://save_config.json", FileAccess.WRITE)
file.store_string(JSON.stringify({"active_slot": 99}))
file.close()
# get_active_slot should return 3, not 99
assert(service.get_active_slot() == 3,
"get_active_slot must return in-memory value after set_active_slot")
This test writes a deliberately wrong value to disk after calling set_active_slot(). If the getter ever reloads from disk instead of trusting the in-memory value, the assertion fails immediately. It is a simple, targeted test that would have caught this bug before it shipped.
Final Thoughts
Save systems are deceptively complex. They sit at the intersection of file I/O, state management, scene lifecycle, and player expectations. A bug in your rendering pipeline is visible immediately. A bug in your save system might silently corrupt data for days before anyone notices.
If you are building a multi-slot save system in Godot, we would encourage you to think carefully about the contract between your in-memory state and your on-disk state. Treat every write as establishing a new source of truth, and make sure your read paths respect that. And always, always reset your managers on session boundaries.
The fix was two lines of code and one new function call. The lesson was worth a week of debugging.