Thursday, November 20, 2008

Classifier Milestones

Ladies and gents, I am officially absorbed. The Classifier development is seriously picking up momentum now, to the point where I'm going to need to slow down soon and do some serious refactoring to clean things up and document them better (i.e., at all). Soon I'll be implementing mundane application-wide alerts and validation checks, but for now, I want to take a moment to look back at the milestones reached in the past 50 days:
  • Subsets: users can now define subsets of their data via SQL in their properties file.
    • This allows actions like "Highlight cells with CDK knockdowns" and "Show me cells of phenotype-x from my positive controls."
  • Replaced drag-and-drop mechanism with one of my own which presents no bugs.
  • Training sets can be saved and loaded. Classifier also prompts for save on close if the current training set hasn't been saved already.
  • Overcame threading hurdles so image tiles may be loaded in a separate thread while the user does manual sorting.
  • Added many shortcuts to speed up interaction.
    • Double-click on image tiles: view full image.
    • Esc: closes image viewer.
    • Ctrl+a: select all tiles on current board.
    • Ctrl+d: deselect all tiles.
    • Ctrl+i: invert selection.
    • Delete: remove selection from board.
    • Up & down arrows: scroll board contents.
    • Ctrl+1,2,3...: Show/hide channels 1,2,3... in classifier and image viewer.
    • Double-click grid-row-label: Show image/images in row group.
  • Sorting boards can be added, removed, and named.
  • Connected the classifier backend so users can now train and fetch objects from multiple phenotypes.
  • Tested the image reader on 12- and 16-bit TIFs and Cellomics DIBs.
  • Changed the internal image representation to float32 numpy arrays in [0., 1.].
  • Added new colors to the channel-color mapping mechanism. Users can now choose to map image channels to red, green, blue, cyan, magenta, yellow, gray, and none/hidden.
  • Implemented scoring of phenotype hits and enrichments on a per-image basis.
    • Added a table view that can be sorted by column, can launch the image viewer from its rows, and can save its contents to CSV.
  • Users can now define groupings in their properties file (e.g., per-well, per-gene, per-plate), and use them to group their enrichment scores.
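For the curious, here is roughly what the float32 image representation and the channel-color mapping from the list above amount to. This is a minimal sketch, not the actual Classifier code: the names `COLORS`, `to_unit_float`, and `merge_channels` are made up for illustration, and I'm assuming raw pixels are simply divided by the maximum value for their bit depth.

```python
import numpy as np

# Hypothetical color table: each color name maps to RGB weights.
COLORS = {
    "red": (1, 0, 0),
    "green": (0, 1, 0),
    "blue": (0, 0, 1),
    "cyan": (0, 1, 1),
    "magenta": (1, 0, 1),
    "yellow": (1, 1, 0),
    "gray": (1, 1, 1),
    "none": (0, 0, 0),  # hidden channel contributes nothing
}


def to_unit_float(pixels, bit_depth):
    """Scale raw integer pixels to float32 values in [0, 1]."""
    max_value = np.float32(2 ** bit_depth - 1)
    return np.asarray(pixels, dtype=np.float32) / max_value


def merge_channels(channels, color_names):
    """Blend single-channel float images into one RGB image."""
    height, width = channels[0].shape
    rgb = np.zeros((height, width, 3), dtype=np.float32)
    for channel, name in zip(channels, color_names):
        weights = np.array(COLORS[name], dtype=np.float32)
        # Broadcast the (h, w, 1) channel against the 3 RGB weights.
        rgb += channel[:, :, np.newaxis] * weights
    # Overlapping colors can sum past 1, so clip back into range.
    return np.clip(rgb, 0.0, 1.0)


# Two tiny fake 12-bit channels, mapped to blue and green.
raw = np.array([[0, 4095], [2048, 1024]], dtype=np.uint16)
channel_a = to_unit_float(raw, bit_depth=12)
channel_b = to_unit_float(raw.T, bit_depth=12)
rgb = merge_channels([channel_a, channel_b], ["blue", "green"])
```

Keeping everything as floats in [0, 1] means the same blending code works no matter what bit depth the image started at, and hiding a channel is just a matter of mapping it to "none".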
We're almost there. Today I'm hoping to package the code for the first time with py2app and hand it over to our image assay developers to tinker with. With my todo-list growing at the rate that it is, I can only imagine how much work will lie ahead of me once their feedback comes in.