Skip to main content

Why table join doesn't work

Everything in the traditional relational database world is based on tables. If I have a table of Classes and a table of Students, I can have an association table that tells me who's in which class.

classIdclassNameinstructor
1Psych 101Dr. Jung

studentIdfirstNamelastName
1MollyRingwald
2AnthonyHall
3JoanCusack

Clearly, these two tables tell me nothing about which classes Molly is taking, nor which students have signed up for Psych 101.

An "association" table provides that information:
classIdstudentIdAssociation-specific notes
11
12late add
13

In the relational world, a LEFT JOIN gives us the student list for a given class:

classNameinstructorfirstNamelastNameAssn notes
Psych 101Dr. JungMollyRingwald
Psych 101Dr. JungAnthonyHalllate add
Psych 101Dr. JungJoanCusack

This approach is "pure" from a Chris Date perspective, but when we're talking about JSON data in response to a RESTful HTTP request, it seems quite burdensome. Rather than invent a tabular compression scheme that allows table cells to indicate "same as previous," we want to take advantage of the fact that JSON (and PHP) allows table cells to include arrays.

Instead of the final table as:
[
 { "className": "Psych 101", "instructor": "Dr. Jung", "firstName": "Molly", "lastName": "Ringwald", "Assn notes": "" },
 { "className": "Psych 101", "instructor": "Dr. Jung", "firstName": "Anthony", "lastName": "Hall", "Assn notes": "late add" },
{ "className": "Psych 101", "instructor": "Dr. Jung", "firstName": "Joan", "lastName": "Cusack", "Assn notes": "" }
]

We do it this way:
{ "className": "Psych 101", "instructor": "Dr. Jung",
   "students": [
    { "firstName": "Molly", "lastName": "Ringwald" },
    { "firstName": "Anthony", "lastName": "Hall" },
    { "firstName": "Joan", "lastName": "Cusack" }
    ]
}

CRUD (Create, Read, Update, Delete) operations on such a many-to-many database are interesting. Only delete needs to mutate the association table. In general, Classes and Students have non-overlapping CRUD. We can Create or Update a class or a student dealing only with that table. When we Delete from either table, we have to Delete from the association table, as well.

Read could be limited to a single table, but we have just demonstrated how we plan to use arrays to report relevant records from the secondary table. We can look at classes to see a list of students taking each class, or we can look at students to see lists of the classes they are taking.

We want to extend between primary and secondary tables in the same way, perhaps, in order to establish and drop these associations. There is no point updating an association, and reading is covered using arrays as described. So, we have Create and Delete.

Create is tantamount to "Record that this Student is now taking that Class." Delete is the same: "Record that this Student is no longer taking that Class." There is never an instance in which cancelling a class expels students from school entirely. We're only talking about creating or deleting this relationship.

We could indicate this using classId and studentId:

CREATE (a relationship between) classId, studentId.

This is akin to CRUD on the association table itself. As such, it involves the UI programmer too much in the underlying mechanism. Also, as a practical matter, the UI is usually looking at one master table or the other--a class and its students, or a student and their classes. To add a student or a class, we would have to magically know its id.

The solution is that, when looking at a Class record, the interface includes a StudentPicker that converts a student name entered into a concrete studentId. We already know the classId.

It seems reasonable to expose an asymmetric facility: You can add students to classes, but you can't add classes to a student. There is no technical reason why it can't work both ways, but the simplification is easier to explain and seems like common sense.  Looking at a class roster, a picker helps you find one of all the enrolled students. One student is the same as another--all people with names--so you begin entering the name and you find it. Classes, on the other hand, might be named oddly and are clustered around categories and fields of study. Molly can't just start typing "Brain Science" to find "Psych 101."

But, we conclude that our Association C--D takes two ids and perhaps an association note. There is no reason to specify names. However, the response to Create should be more verbose so that the UI can display information about, in this case, the new student. If we POST a new student, the response is the student record. Likewise, if we POST a new student in a class, we only provide the studentId, but the response should be the student record. A rich UI might display the photo of the student added to the class.





Comments

Popular posts from this blog

JSON/MySQL Schemas

As noted previously, there is a lot of overlap between the RDBMS world and the JSON world. Identifiers JSON is defined to allow identifiers of any kind of Unicode string, encoded in UTF-8 (or UTF-16, etc.). They begin and end with double quotation marks (U+0022), so included quotation marks and '\' must be escaped as \" or \\. Control characters (ASCII 0-x1F) must be escaped as well. In practice, JSON identifiers conform to ECMAScript standards . There are some 68  reserved keywords (function, import, for, if, else, and so on) that should not be used as identifiers. Unexpected reserved words include abstract, await, debugger, delete, finally, instanceof, super, synchronized, transient, volatile, and yield. The spec makes a distinction between identifiers and IdentifierNames (specifically, array keys), but why risk it? ECMAScript allows '$' and '_' anywhere in an identifier. Length or camelCasing are not part of the spec. As for length, there seems t...

Relational Updates

The pattern I've been focused on for some time is the classic 'students in classes' or 'passengers on flights' problem. The database has a table of students and a table of classes and an association table which joins them. That's basic stuff. The problem is how do you update a student to edit her list of classes, or update a class to edit its list of students? The relational integrity is not a problem--if you delete a class from the student, that student no longer appears in the class' list of students. The problem is your simple form gets very complicated. Instead of a form that displays the columns of a single row in a table, now we have three tables involved. Of course it's solvable, but we want it to be intuitive for the end user. Old time software punted the problem--you could edit the association table, or a class, or a student, but you couldn't add classes to a student. What we want is a list of classes on the student record. Each class ...