In SQL RDBMS land, people often talk about relations when they actually mean relationships. Sometimes the meaning is clear from the context, but that’s not always the case; so I thought it might be helpful to know which is which and why:
You may be surprised to learn that: relation = set = table. It’s just the mathematical term for it. The letter R in the RDBMS acronym therefore simply means that it’s a type of database that stores its data in tables. Nothing more.
A relationship, on the other hand, are about two or more tables having a field with common data (generally of integer type, if the schema is decently normalized) used for joins. In its simplest form that’s an implicit relationship, and the database doesn’t worry about it one way or another. You can make relationships explicit by defining foreign key constraints.
A join will work without a foreign key constraint, and a foreign key is still a foreign key without a constraint, it’s just that the server in that case wouldn’t care about you deleting a row from one table while a referring row in another table still exists. That’s what foreign key constraints guard against.
Hi!
Nice to see some of this discussion too sometimes – ususally, the techie blogs tend to favour the more practical issues.
Just want to point out that if you want to be pedantically picky (now who would want to to that ;-), a set != table.
A table is a logical data structure that you can use to hold a set: it is an implementation term. Also, a set has – by definition- no duplicate members. There’s nothing that stops us from having duplicates in a (database) table.
Also, a relationship does not have to be based on common fields. There is this exotic feature called the non-key relationship. For example, taxes in the Netherlands are divided into several ‘scales’. Each scale has an upper and a lower bound. A person is taxed according to the scale if the person’s income is in between the scale bounds.
So, if we would have a relational data model, we would have one table for the persons, with a column for their income. And we would have a table for the tax scales. Now it is clear that there is a relationship between these two tables, because we can relate the tax scale catergory from both the scale bounds and the person’s income. However, most of the incomes will fall somewhere between the scale boundaries – so there are lots of persons that do NOT have a field of data in common with either of the scale boundaries. That’s why it’s called a non-key relationship.
Another example of the non-key relationship is seen in the nested set model for handling hierarchies. Here too, it is clear that a particular node and it’s ancestors are related – else the model would not make much sense. At the same time, there is never any ancestor whose id is litarally equal to either the left or right pointers of the ancestor. So, no common data, but a relationship regardless.
Roland Bouman
PS – how about a forum for data modelling and database design issues? just a thougt, it sure would take some pressure of the newbie and general forums
Thanks for the additional insights.
Insisting that sets don’t contain duplicates would mean that we can’t call a result set a result set (since it can contain duplicates ;-)
How practical is that?
I’ll see if I can create a database modeling & design forum – good idea!
Well, once it has become a familiar term, not that practical I guess. But I thought the point of your blog was ‘…to know which is which and why.’, and in that context, it seemed a nice addition.
Anyway, that’s why I avoid the term relation too: chances are that someone will get you wrong.
I think it should be possible for most texts too just bluntly substitute the term ‘result set’ by the term ‘rows’ or ‘result rows’. (The term ‘table’ or ‘result table’ is more precise, but confusing since most people will immediately think of a ‘base table’.) It’s nice, informal, and totally correct as far as I can see.
Anyway, here’s another term used in many ways: ‘Query’. I tend to use this as a synonym for ‘outer sql select expression’ s, but it is also widely used to refer to DML statements, and sometimes even DDL (‘…my alter table query has got a syntax error…’). Of course, there’s almost never any confusion over this, and that’s probably why this term is used so often for all kinds of sql statements.