
Blog


    September 29

    National Science Foundation Informational Graphics Winner

    http://www.nsf.gov/news/special_reports/scivis/index.jsp?id=win2007

    The NSF 2007 Winners for Informational Graphics were announced recently...

    batflight_lg

    Modeling the Flight of a Bat
    Credit: Kenneth S. Breuer, David J. Willis, Mykhaylo Kostandov, Daniel K. Riskin, Jaime Peraire, David H. Laidlaw, Sharon M. Swartz

    And here's one from last year that I think is interesting...

    materials_lg

    Materials Informatics: Visualization of High Dimensional Combinatorial Data
    Credit: Matt Heying, Changwon Suh, Krishna Rajan, James Oliver, Iowa State University; Simone Seig, Wilhelm Maier, Universität des Saarlandes

    September 26

    Visualizing Gene Expression Data - Working with Unique Formations - Part II

    OK - so, I took these long skinny data sets that we talked about last time and laid them out in various ways, based on different metaphors.

    The idea is to leverage a layout to eliminate the scrolling and moving around that is necessary when data is long and skinny.

    One way to do it is to scale one side so that the formation becomes more rectilinear - like a square.

    So, in this case, we have 8 genes and 12,000 cell names - we scale the 8 up until that side starts to match the length taken up by the cell names.

    genesuare
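A minimal sketch of that scaling idea in Python, not the code behind the screenshot: if each cell name takes up one unit of length, stretching each gene by the ratio of the two counts makes the two sides match. The function name `square_scale` is made up for illustration.

```python
def square_scale(n_genes: int, n_cells: int) -> float:
    """Return the stretch factor for the gene axis that makes the layout square.

    Assumes every cell name occupies one unit along the long axis.
    """
    if n_genes <= 0 or n_cells <= 0:
        raise ValueError("both counts must be positive")
    return n_cells / n_genes


# 8 genes against 12,000 cell names, the counts used in this post.
print(square_scale(8, 12_000))
```

In practice you would probably stop short of a perfect square, since the stretched side only needs to get close enough to cut down on the scrolling.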

    Another way to look at this is to stack them up like books - so the 8 genes go vertical...

    genelibrary

    It still makes us scroll around a lot though...

    How about steps - this leverages perspective a little better than the book cases...

    genestairs

    Hmmm, I wonder what a curve might do to it?

    genewave 

    OK, that's a little strange looking - what about something a little more subtle?

    geneshallowwave

    It was at this visual that I realized something.  What if this were a loop - a donut?  The tall and skinny shape works in our favor no matter how long the data actually is.

    genecircle
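Here is a rough sketch of how the loop idea could be computed, assuming the long cell-name axis is wrapped around a circle and the genes are stacked as bands along the hoop's height. The function `ring_position` and its `radius` and `band_height` parameters are hypothetical names for illustration, not taken from the actual renderer.

```python
import math


def ring_position(cell_index, gene_index, n_cells, radius=1.0, band_height=0.1):
    """Map a (cell, gene) pair to an (x, y, z) point on a hoop.

    The long cell axis wraps around the circle; genes stack along y.
    radius and band_height are arbitrary illustration values.
    """
    angle = 2.0 * math.pi * cell_index / n_cells
    x = radius * math.cos(angle)
    z = radius * math.sin(angle)
    y = gene_index * band_height
    return (x, y, z)


# First cell, first gene sits at angle zero on the hoop.
print(ring_position(0, 0, 12_000))
```

Because the angle depends only on the fraction `cell_index / n_cells`, the hoop stays the same size whether there are 1,000 cell names or 12,000.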

    So, it's not totally obvious what I'm up to here, but if you rotate this hoop and zoom in you get this kind of a view...

    generadial

    Now, I can put in a wall behind the front part of the hoop and conceal the back.

    Next time, I'll show you the finished product.

    September 25

    Visualizing Gene Expression Data: Long and Skinny Data Sets - Part I

    Anirban and I are looking at some Breast Cancer Gene Data from NCI (Public Domain).  It's a tall and skinny data set - only 4 rows wide and 12,900 high.  Every time I deal with extremely long data (either horizontal, like a timeline, or vertical like this expression data) it turns out to be difficult.  Lots of scrolling and zooming in the context (overview) and always a struggle to provide proper detail.

    Here's a quick look at the data set...

    genedataset

    Basically we want to create a matrix of cell names to genes, with the expression value at each intersection.  These expressions can be all kinds of things, but typically they represent how well the two "go" together - also known as binding.
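To make the matrix idea concrete, here is a small Python sketch. It assumes the data arrives as (cell name, gene, expression) records, which is a guess about the layout and not something confirmed by the data set shown below. The function name and the sample records are placeholders.

```python
def build_expression_matrix(records):
    """Pivot (cell_name, gene, expression) records into a cell-by-gene matrix.

    Returns (cell_names, genes, matrix) where matrix[i][j] is the expression
    for cell_names[i] and genes[j], or None when no record exists.
    """
    cell_names = sorted({cell for cell, _, _ in records})
    genes = sorted({gene for _, gene, _ in records})
    row_of = {cell: i for i, cell in enumerate(cell_names)}
    col_of = {gene: j for j, gene in enumerate(genes)}
    matrix = [[None] * len(genes) for _ in cell_names]
    for cell, gene, expression in records:
        matrix[row_of[cell]][col_of[gene]] = expression
    return cell_names, genes, matrix


# Placeholder records, not values from the NCI data set.
records = [
    ("cell_A", "gene_1", 0.25),
    ("cell_A", "gene_2", 0.75),
    ("cell_B", "gene_1", 1.0),
]
cell_names, genes, matrix = build_expression_matrix(records)
for name, row in zip(cell_names, matrix):
    print(name, row)
```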

    My first shot was to do a typical 3D landscape of this data using DirectX.  So here's the total width of the dataset (by cellname) and height by Gene.  You can see a small detail window here that helps you pick out the details.  You can also see a Moire pattern starting in these small boxes as they alias against the screen resolution.

    geneoverview
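A quick back-of-the-envelope check shows why the small boxes alias. The 1280-pixel viewport width below is an assumption for illustration (the post does not give the actual resolution), and `cells_per_pixel` is a made-up helper name.

```python
def cells_per_pixel(n_cells: int, viewport_px: int) -> float:
    """Return how many data cells have to share one screen pixel."""
    if viewport_px <= 0:
        raise ValueError("viewport width must be positive")
    return n_cells / viewport_px


# 12,900 entries along the long axis, drawn across an assumed 1280 px viewport.
ratio = cells_per_pixel(12_900, 1280)
print(f"{ratio:.2f} cells per pixel")
```

Any time that ratio is well above 1, several boxes land on the same pixel and the overview can only show a sampled version of the data, which is where the Moire pattern comes from.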

    From a functional perspective we want to be able to look at the actual, min, max, median and mean of these values.  I also allow a user to flick some sliders and filter the actual points between two settings (as seen below)...this is showing the points between 0.6 and 1.0.  All the points are actually there, I'm just turning visibility off for the points that don't make the cut.

    genepoints
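The summary values and the slider filter can be sketched in a few lines of Python. This is only an illustration of the idea, not the DirectX code behind the screenshots: the function names are hypothetical and the sample values are placeholders. As in the description above, the points are never removed, only flagged as visible or hidden.

```python
from statistics import mean, median


def summarize(values):
    """Return the min, max, median and mean of a list of expression values."""
    return {
        "min": min(values),
        "max": max(values),
        "median": median(values),
        "mean": mean(values),
    }


def visibility_mask(values, low, high):
    """Return one flag per point: True if it lies within [low, high].

    The data itself is untouched; a renderer would skip drawing the
    points whose flag is False.
    """
    return [low <= v <= high for v in values]


# Placeholder values, not taken from the real data set.
values = [0.25, 0.5, 0.75, 1.0]
print(summarize(values))
print(visibility_mask(values, 0.6, 1.0))
```

Keeping a mask instead of deleting points means the sliders can be moved back and forth without reloading or rebuilding the data.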

    So, my next adventure is... how do you handle this data better than this?

    Ooops - it's been a while...

    Sorry for the massive delay in blogging.  I've had a number of changes going on in life and blogging took a back seat for a while.

    First up, I took a new job.  I'll be working with Microsoft's Life Science team, focused on Pharma/Life Science companies in the East.  So far I really love this new role - it's right up my alley.  Lots of visualization and lots of new and interesting problems to solve.  I will attempt to keep the visualizations somewhat generic so that they can be used in a wide array of industries, but don't be too surprised if the data looks strangely chemical in nature :)

    Secondly, Microsoft has allowed me to teach a Programming Course to some High School students at a local, private Christian Academy.  The Students are great, very sharp, and we are all having a great deal of fun in the class.  I've wanted to teach for over 15 years, but the timing never seemed right.  Microsoft is a great company for letting me do this and it's one of the things that keeps me there.

    Lastly, I'm doing a lot of work with a gentleman named Anirban Ghosh, an employee of InfoSys Life Science.  He's one of the nicest and smartest guys you will ever meet.  In fact, every time I talk to him I feel like I'm going back to college to get a degree in Chemistry or Biology.  He is insanely smart and we are doing some great work together.