UPDATE for webdev nerds: We realized we meant “403 Forbidden” rather than “401 Not Authorized”.
Here’s a little information leak we noticed and fixed some months (ok, the better part of a year — blush!) after we publicly launched in late 2011:
Say our user Alice has two goals. One is her book reading goal which she wants to share with her book club, beeminder.com/alice/reading. She also has a super embarrassing private goal, beeminder.com/alice/lessporn.
Now say that her co-worker, Noah, learns about her book goal and wants to see what else Alice is beeminding. Nosy Noah can start poking around and trying Beeminder URLs. If we give a 404 Not Found when he navigates to beeminder.com/alice/kick-more-puppies, that’s all well and good. But then if we give a not-allowed response when Noah guesses beeminder.com/alice/lessporn, we’ve just given away that goal’s existence. So even though that naughty Noah can’t view the goal, he knows it is there.
Our solution is to give a not-allowed error for any random beeminder.com/alice/stuff URL that Noah tries (unless it’s a real and public goal, of course). That way Noah can’t collect information on what Beeminder goals Alice has just by trying URLs and seeing if they’re not-allowed or not-found.
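For the webdev nerds, here's a minimal sketch of that rule in Python. It is not our actual code: the Goal class, the GOALS table, and the status_for function are all made up for illustration, and a real app would look goals up in a database and return a full response rather than a bare status code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Goal:
    owner: str
    slug: str
    public: bool


# Hypothetical in-memory stand-in for the goal database.
GOALS = {
    ("alice", "reading"): Goal("alice", "reading", public=True),
    ("alice", "lessporn"): Goal("alice", "lessporn", public=False),
}


def status_for(viewer: Optional[str], owner: str, slug: str) -> int:
    """Return the HTTP status code for viewer requesting /owner/slug."""
    goal = GOALS.get((owner, slug))
    # Only a real goal that is public, or that belongs to the viewer, is shown.
    if goal is not None and (goal.public or viewer == owner):
        return 200
    # Private goals and nonexistent goals deliberately share this one branch,
    # so the response can't reveal which of the two cases applied.
    return 403


print(status_for("noah", "alice", "reading"))            # real and public
print(status_for("noah", "alice", "lessporn"))           # real but private
print(status_for("noah", "alice", "kick-more-puppies"))  # doesn't exist
```

The thing to notice is that there's no separate not-found branch at all. Noah's last two requests take the same code path, so as long as the response body is identical too, he has nothing to compare.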
All of which was obvious from the start to anyone with an ounce of Slytherin in them. Hence, Slytherin 404s, which is what we propose calling this little security best practice from now on.
404 Not Found vs 401 Not Authorized vs 403 Forbidden
In one sense it doesn’t matter whether you say “not found” to everything or “not allowed” to everything. The point is not to tip off the bad guys by doing something different depending on whether the URL exists.
Interestingly the HTTP spec itself, when talking about what to do when someone tries to access forbidden resources, says:
“If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.”
This is pretty quibbly but we disagree with the spec on that. It should be a “Not Authorized” (or “Forbidden”) that’s used for both not-authorized and not-found if you want to conceal the existence of hidden stuff. You’re not authorized to even know if it exists! Giving a 404 is technically a lie. Humans may sometimes need to lie to fully conceal sensitive information but computers should never need to.
If you want to see this in the wild, check out my own possibly nonexistent goal (assuming you’re logged in to Beeminder, otherwise it will redirect you to do so):
beeminder.com/d/truthtelling
(We decided to go with “403.5 Not Found!” to avoid the case where a Nazi goes to beeminder.com/alice/jewhiding, jumps to the conclusion that such a goal exists, and storms Alice’s house before following the link and reading this blog post.)
UPDATE: I want to add that this isn’t pure philosophy. Even pragmatically, I actually hate the lying. If you’ve mistyped your own username or are accidentally logged in to your roommate’s account or something it’s super confusing and frustrating to be given a 404 and start trying variations of the URL. Or even the brief moment of panic when you think something got obliterated. I strongly disagree with, for example, GitHub’s approach to Slytherin 404s for this reason.
PS: The other Slytherin solution to this in the case of Beeminder goals is for Alice to let all her goals be public but just give the embarrassing ones plausibly deniable codenames, like /lesspork. (Sorry to have just ruined that one! Even more sorry to anyone who may have had a for-real less-pork goal!)
Thanks to Sean Fellows for helpful discussion.