When to class / when to table
So, there’s a related issue in database design and in object oriented programming, and it surrounds the question of when do I break this type of data off into a new table / new class?
Fully normalized data is a pain in the ass to work with. You have to join to find anything you care about, and there’s a performance cost to doing joins, especially outer joins. On the other paw, fully denormalized data is also a pain in the ass to work with. It can be very expensive to search that enormous haystack for that tiny needle, ALTER TABLEs take forever to run, etc.
In the programming world, if you create too many object classes, it’s a royal pain in the ass to find anything, your executable size is going to go up, and unless you do a very good job of inheretince you’re going to be doing cut & paste coding every time you add a good feature. On the other paw, if you create too few object classes, you’re going to find they get large and cumbersome as you have to add many methods to them for the varying sorts of data they’re carrying. Again, maintainability goes down, readability goes down.
So, how do you decide when it’s time to tack on a table or a class? I don’t really know how I make this decision – there’s some sort of intuitive leap that happens inside my mind that says ‘now would be a good time for another table / another object’. Sometimes there are clear data bounderies – a map coordinate probably doesn’t belong in the same table as a phone number, because it’s a very different type of data. Sometimes external APIs suggest a path, because of the way their interfaces are defined. And so forth.
I don’t have a good answer to write down here yet. I’m still thinking about this. But if anyone wants to comment, I’d be happy to hear your thoughts on the matter. (I think I have about 3 readers at this point, although my web traffic statistics would suggest that’s not correct)
July 11th, 2016 at 10:48 am
For database tables, I tend to think about what I would need search for or on. I probably split things in ways that school-trained programmers might think odd, but part of that is for security (i.e. I don’t store password hashes in the User table). I will also admit that I sometimes over-think the whole thing, and break the project down to “lowest common denominators”, before triggering my overwhelmed alarm and back-tracking to more sane groupings.
Sadly for me being able to think in Code is a seasonal ability. I can sometimes push into codethink for short periods out-of-season, but the result is not always a positive gain. Perhaps I will have a more coherent and helpful comment in a handful of months.