Several times over the past couple of years, in my role as a database consultant, I've come across very, very large databases, where a large percentage of the data is redundant. For instance consider the following two abbreviated tables
| Forms | |
| FormsID (PK) | nvarchar(255) |
| Hits | |
| HitsID (PK) | int identity(1,1) |
| FormsID (FK) | nvarchar(255) |
| Time | datetime |
Imagine that there are only a dozen forms values, each of them averaging about 30 characters in length (so 60 bytes or so, giving that it's unicode). If you have a million records in hits, that's 60MB just for the form value itself. If you have one hundred million records, and a dozen large FKs like this, well you get the picture. It vastly increases the amount of I/O to do searches in the Hits table, and even if Forms is indexed it's still much slower than it could be if Forms had an integer primary key.
While I personally wouldn't layout tables this way, it is an entirely credible and justifiable design - the designer simply decided to use a natural key rather than a surrogate key. Simplicity of design, and clarity of relationships when looking at the data, outweighed I/O concerns for this person/group. Such a design is not a question of normalization.
When you have a million+ records it suddenly becomes a concern, though. There are ways to refactor this design, including "normalizing" the original table a bit and hiding it behind a view, and then adding INSTEAD OF triggers on the view, however that is a leaky abstraction. SQL Server does not completely mimic a real table, and operations like INSERT FROM fail, not to mention oddities with @@IDENTITY and SCOPE_IDENTITY().
Given all of this, I would love if SQL Server had a behind the scenes method of collapsing redundant large field values into a hidden behind the scenes lookup table, similar to what Visual Studio does with string pooling. e.g. In this case it could replace FormID with a internal value to lookup against a tiny relational table. Obviously this should be manually configured, but it would be a relatively easy change that could tremendously improve a lot of existing database designs where a redesign isn't a priority, but I/O costs are onerous.
Apparently mySQL has something similar by way of enums, however it is a fixed set (what I'd like is that new values inserted into the table are automatically added into the behind the scenes set), and again there is some leakiness with the abstraction.