So! My thought is that I can use this blog to verbalize what I'm doing and how, and hopefully this will help me clear my mind and focus. This is something I haven't done before (heck, I've only made 2 posts in this blog so far), but I'm hopeful this will also help to re-instill that nerdy sense of accomplishment and learning that I felt when I first got into computer science.
The nature of our data is such that it is stored in pairs of tables (per-image-table, per-object-table). Yesterday I wrote MYSql statements to merge 3 table-pairs into a single pair.
# Create a new database where the master tables will be stored
# Create a master per-image table called "im" with the same definition as the old one
# Add a column "TableNumber" to keep track of which table each row is from
# Update the primary key
# Populate the dataset
CREATE DATABASE af;
CREATE TABLE im LIKE imageScreen.plate1_Per_Image;
ALTER TABLE im DROP PRIMARY KEY;
ALTER TABLE im ADD COLUMN TableNumber INT;
ALTER TABLE im ADD PRIMARY KEY (TableNumber, ImageNumber);
INSERT INTO im SELECT *,0 FROM
imageScreen
.plate1_Per_Image;
INSERT INTO im SELECT *,1 FROM
imageScreen
.plate2_Per_Image;
INSERT INTO im SELECT *,2 FROM
imageScreen
.plate3_Per_Image;
ALTER TABLE im MODIFY COLUMN TableNumber INT FIRST;
# Do the same for per-object tables
CREATE TABLE ob LIKE
imageScreen
.plate1_Per_Object;
ALTER TABLE ob DROP PRIMARY KEY;
ALTER TABLE ob ADD COLUMN TableNumber INT;
ALTER TABLE ob ADD PRIMARY KEY (TableNumber, ImageNumber, ObjectNumber);
INSERT INTO ob SELECT *,0 FROM
imageScreen
.plate1_Per_Object;
INSERT INTO ob SELECT *,1 FROM
imageScreen
.plate2_Per_Object;
INSERT INTO ob SELECT *,2 FROM
imageScreen
.plate3_Per_Object;
ALTER TABLE ob MODIFY COLUMN TableNumber INT FIRST;
The next goal is to build this idea into a wx.Python GUI. A simple wizard seemed like the best way to break it down into the following steps from the user perspective:
- Connect to the source database. [host, dbname, user, pw]
- Choose which table pairs to include in the merge. [table name list]
- Choose a destination database to write master tables to. [dbname]
- Choose a prefix for your table names. eg: (prefix_per_image, prefix_per_object). [prefix]
At present, the tool can perform steps 1 & 2 listed above, but some challenges still remain. Namely:
- Handle the rare case of merging only per_image tables.
- Ensure that the destination database does not already contain tables with the same name as the output tables.
- Provide capability to remove these tables if they were created with this tool.
- Handle unpaired tables.
- Warn for unpaired "per_image" or "per_object" tables.
- What about metadata tables? eg: "plate_map"
No comments:
Post a Comment