Relational Updates

The pattern I've been focused on for some time is the classic 'students in classes' or 'passengers on flights' problem. The database has a table of students and a table of classes and an association table which joins them. That's basic stuff.

The problem is how do you update a student to edit her list of classes, or update a class to edit its list of students? The relational integrity is not a problem--if you delete a class from the student, that student no longer appears in the class' list of students. The problem is your simple form gets very complicated. Instead of a form that displays the columns of a single row in a table, now we have three tables involved.

Of course it's solvable, but we want it to be intuitive for the end user. Old-time software punted on the problem--you could edit the association table, or a class, or a student, but you couldn't add classes to a student from the student's own record.

What we want is a list of classes on the student record. Each class has an 'X' control for deleting it, and an Add button brings up a list of thousands of possible classes that winnows as you type. I haven't got it working yet, exactly, but Don McCurdy's Tokenizer jQuery plugin seems up to the job.

The code for editing a student record is pretty straightforward:
  1. Read 3 db tables to produce a view.
    1. The view carries two class fields: a list of class_ids and a parallel list of class_names.
  2. Display the view with the list of class_names and let the user modify the fields.
    1. When the user deletes a class_name, delete the corresponding class_id.
    2. If the user wants to add a class, display the typeahead class picker and, if they choose one, add it to both the class_name list and the class_id list.
    3. If the user can't find a class, we can optionally let them add a new class right here. (It has to be lightweight--just a provisional placeholder with little more than a class name.)
  3. When the user clicks 'Update,' we post the view record with only the class_id list.
    1. Perhaps we include a list of class names to add (or we use Ajax to add the new class name and return an id). 
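
Here is a minimal sketch of steps 1 through 3, not the code I actually run. It uses Python's built-in sqlite3 as a stand-in for MySQL so it can run anywhere. The table names (students, classes, assoc), the column names, and the helper functions are all assumptions made up for illustration.

```python
import sqlite3

def make_demo_db():
    """Build a tiny in-memory database. All names and rows are made up."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE classes  (id INTEGER PRIMARY KEY, class_name TEXT);
        CREATE TABLE assoc    (student_id INTEGER, class_id INTEGER,
                               PRIMARY KEY (student_id, class_id));
        INSERT INTO students VALUES (1, 'Ada');
        INSERT INTO classes  VALUES (10, 'Algebra'), (11, 'Biology'), (12, 'Chemistry');
        INSERT INTO assoc    VALUES (1, 10), (1, 11);
    """)
    return conn

def load_student_view(conn, student_id):
    """Step 1: read the three tables into one record with two parallel lists."""
    name = conn.execute(
        "SELECT name FROM students WHERE id = ?", (student_id,)
    ).fetchone()[0]
    rows = conn.execute(
        """SELECT classes.id, classes.class_name
           FROM assoc JOIN classes ON classes.id = assoc.class_id
           WHERE assoc.student_id = ?
           ORDER BY classes.id""",
        (student_id,),
    ).fetchall()
    return {
        "id": student_id,
        "name": name,
        "class_ids": [r[0] for r in rows],
        "class_names": [r[1] for r in rows],
    }

def drop_class(view, class_name):
    """Step 2.1: deleting a name deletes the id at the same position."""
    i = view["class_names"].index(class_name)
    del view["class_names"][i]
    del view["class_ids"][i]

def add_class(view, class_id, class_name):
    """Step 2.2: a class chosen from the picker goes on both lists."""
    if class_id not in view["class_ids"]:
        view["class_ids"].append(class_id)
        view["class_names"].append(class_name)

conn = make_demo_db()
view = load_student_view(conn, 1)
drop_class(view, "Algebra")
add_class(view, 12, "Chemistry")

# Step 3: only the id list is posted back; the names were for display.
posted = {"id": view["id"], "class_ids": view["class_ids"]}
print(posted)
```

In a real page the two list edits would happen in the browser; they are written in Python here only to keep the sketch in one language.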
This is too complex, right? But the use case is compelling. Students and classes is a bad example, because money is involved in spinning up a class or becoming a student. Consider a list of people involved with recording an album. You learn that Derek Smalls helped with Revolver. You're not surprised that he's not in the people database, but you want to associate him with Revolver. That involves cancelling the edit of Revolver, going to people, adding Derek, and going back to Revolver, where Derek is now on the pick list. Chicken, meet egg.

So we make it easy to add to the people table, using Ajax or with a list of deferred adds. Using Ajax simplifies the post processing--we have a list of ids, not two lists. But using Ajax makes real changes when a user is just goofing around. If they add a name, or multiple names, and cancel the edit, the orphan goof names pile up, polluting the pick list for other users. The list of deferred adds is just names. The backend needs to add the names to get ids for them, then add those to the list of ids before processing the update.
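
The deferred-add variant might look something like the sketch below, again with sqlite3 standing in for MySQL. The schema (people, albums, assoc), the sample rows, and the function apply_album_update are hypothetical. The backend inserts the new names first to get ids, merges them into the posted id list, then diffs that list against the assoc table. Nothing is written until the user actually submits, so a cancelled edit leaves no orphan names behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE albums (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE assoc  (album_id INTEGER, person_id INTEGER,
                         PRIMARY KEY (album_id, person_id));
    -- Made-up demo rows.
    INSERT INTO people VALUES (1, 'John'), (2, 'Paul');
    INSERT INTO albums VALUES (7, 'Revolver');
    INSERT INTO assoc  VALUES (7, 1), (7, 2);
""")

def apply_album_update(conn, album_id, person_ids, new_names):
    """Apply a posted edit: existing ids plus deferred names that have no id yet."""
    with conn:  # commits on success, rolls back if anything raises
        wanted = set(person_ids)

        # Deferred adds: create each new person and collect the new id.
        for name in new_names:
            cur = conn.execute("INSERT INTO people (name) VALUES (?)", (name,))
            wanted.add(cur.lastrowid)

        # Diff the wanted ids against what the association table holds now.
        current = {
            row[0]
            for row in conn.execute(
                "SELECT person_id FROM assoc WHERE album_id = ?", (album_id,)
            )
        }
        conn.executemany(
            "DELETE FROM assoc WHERE album_id = ? AND person_id = ?",
            [(album_id, pid) for pid in current - wanted],
        )
        conn.executemany(
            "INSERT INTO assoc (album_id, person_id) VALUES (?, ?)",
            [(album_id, pid) for pid in wanted - current],
        )

# The user removed Paul and typed in a name that was not on the pick list.
apply_album_update(conn, 7, person_ids=[1], new_names=["Derek Smalls"])

credits = conn.execute(
    """SELECT people.id, people.name
       FROM assoc JOIN people ON people.id = assoc.person_id
       WHERE assoc.album_id = 7
       ORDER BY people.id"""
).fetchall()
print(credits)
```

The `with conn:` block is there because the people insert and the assoc changes should succeed or fail together, even at low concurrency.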

MySQL Views

Updating three tables (Students, Classes, Association) is easy, especially when you ignore transactions due to extremely low concurrency. (If you have lots of people updating your database simultaneously, good for you--hire one of them to add transactions to your database code.) I've always wondered whether it made sense to get SQL involved in three-way updates.

CREATE VIEW test AS
  SELECT class_id, person_id, role, people.id AS pid, last, first, classes.id, class_title
  FROM people, classes, assoc
  WHERE assoc.class_id = classes.id AND assoc.person_id = people.id;

SELECT * FROM test LIMIT 25;


The view is a table with columns: class_id, person_id, role, pid, last, first, id, class_title. The rows came back ordered by class_id here, but SQL guarantees no order without an ORDER BY clause. Each class has multiple rows, as does each person. Instead of displaying and updating a single row in a table, we need multiple rows to show the students in a class.
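
To make the "multiple rows per class" point concrete, here is a small sketch that builds the same view in SQLite (standing in for MySQL) and folds its rows into one roster per class. The column types and the sample rows are assumptions; only the view definition follows the one above.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people  (id INTEGER PRIMARY KEY, last TEXT, first TEXT);
    CREATE TABLE classes (id INTEGER PRIMARY KEY, class_title TEXT);
    CREATE TABLE assoc   (class_id INTEGER, person_id INTEGER, role TEXT);
    -- Made-up demo rows.
    INSERT INTO people  VALUES (1, 'Lovelace', 'Ada'), (2, 'Hopper', 'Grace');
    INSERT INTO classes VALUES (10, 'Algebra'), (11, 'Biology');
    INSERT INTO assoc   VALUES (10, 1, 'student'), (10, 2, 'student'),
                               (11, 1, 'student');
    CREATE VIEW test AS
      SELECT class_id, person_id, role, people.id AS pid, last, first,
             classes.id, class_title
      FROM people, classes, assoc
      WHERE assoc.class_id = classes.id AND assoc.person_id = people.id;
""")

# Ask for the order explicitly rather than relying on whatever comes back.
rows = conn.execute(
    "SELECT class_id, class_title, first, last FROM test ORDER BY class_id, last"
).fetchall()

# One view row per (class, person) pair; fold them into one list per class.
roster = defaultdict(list)
for class_id, class_title, first, last in rows:
    roster[(class_id, class_title)].append(f"{first} {last}")

for (class_id, class_title), names in roster.items():
    print(class_id, class_title, names)
```

Three association rows produce three view rows, and the edit form for a class has to be assembled from all the rows that share its class_id.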

When you edit a person and add or drop them from a class, the update is really an INSERT or a DELETE on the assoc table with potential other changes to the people table. The classes table doesn't change. Likewise, when you edit a class and add a person,

