Panmo: Features

Incorporating recent advances in statistical computing,
computer graphics, machine learning, and user interface design,
Panmo has the following features:
- Tools to generate information-rich visualizations: scatterplot, 2D histogram, boxplot, table plot, histogram, barplot,
variable resolution bivariate plot, Tukey sum-difference plot, profile plot (or parallel coordinate plot), X-rayed profile plot, 3D point cloud rotation, various types of trellis graphs, and browse.
- Powerful numerical tools: SOM (Self-Organizing Map), K-means, hierarchical clustering,
principal component analysis, classification and regression trees,
sign test, minimal spanning tree planing, multivariate Smirnov test, various profile searching tools, and
lowess smoothing. All these numerical tools present their results
graphically and are seamlessly integrated with the intuitive user
environment via dynamic graphics.
- Reporting tools: PostScript/PDF files of publication/production
quality and PNG images can be generated with a single click.
The whole flow of a data analysis and the user's train of thought can
be recorded and published on the internet easily.
- Multi-window interactive dynamic graphics
- Focusing and linking
- Graphical query formulation
- The most advanced graphics to cope with overstriking
- Logical zooming
- On-line context help
- Object-oriented data representation
- Three layers of user interface
- Smart menu system
- Inspection
- HCS image module
- HCS impressionist density plot module
The combined power of Panmo's tools and operations working together
is multiplicative, rather than additive
as in traditional informatics systems. More details about the above features follow.

Multi-window interactive dynamic graphics

Visualizing big and complex data sets using a single display window
is difficult at best.
Multiple windows are used in Panmo
to provide users with simultaneous multiple views into data space.
Each display window serves as a 2-way communication link
between the system and users:
the system shows data in display windows and
users can look at a display window and,
by interacting with graphical symbols in the display window,
issue commands to the system.

Focusing and linking

To display complicated information,
like that contained in a big and complex data set,
a common instinct is to draw a plot that is equally complicated,
such as presenting the data as a tableau of Chernoff faces.
Attempts at such dense encoding are seldom successful.
It is usually more effective to construct
a number of simple, easy to understand plots,
each focused clearly on a particular aspect of the underlying data.
Each plot conveys partial information about the data.
Panmo integrates the information in multiple plots
into a coherent image of the data as a whole
by linking the contents of individual plots.
Painting is one of many interactive techniques available in Panmo
to link contents in plots.
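
One way to picture painting is as a single paint state shared by every plot, keyed by observation. The sketch below is only an illustration under that assumption. The class names `LinkedPlots` and `RecordingView` and the gene identifiers are hypothetical and are not part of Panmo.

```python
class LinkedPlots:
    """Hypothetical registry that shares one paint state across several views."""

    def __init__(self, observation_ids):
        # Every observation starts unpainted (None).
        self.colors = {obs_id: None for obs_id in observation_ids}
        self.views = []

    def register(self, view):
        self.views.append(view)

    def paint(self, obs_ids, color):
        for obs_id in obs_ids:
            self.colors[obs_id] = color
        # Every registered view redraws from the same shared state.
        for view in self.views:
            view.redraw(self.colors)


class RecordingView:
    """Stand-in for a plot window; it only records what it would draw."""

    def __init__(self, name):
        self.name = name
        self.last_drawn = {}

    def redraw(self, colors):
        self.last_drawn = dict(colors)


genes = ["g1", "g2", "g3", "g4"]          # made-up identifiers
linked = LinkedPlots(genes)
som_view = RecordingView("SOM output")
pca_view = RecordingView("3D principal components")
linked.register(som_view)
linked.register(pca_view)

# Painting two genes (say, from the SOM view) recolors them in every view.
linked.paint(["g2", "g3"], "red")
```

Because both views read the same dictionary, a point painted in one plot carries its color into all the others.
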
The following 2 plots based on the yeast microarray data
illustrate the basic idea.

[Figure: Panel A, SOM output. Panel B, 3D rotation of the yeast data plotted on the first 3 principal components.]

A set of genes with a gradual increase in expression level
during diauxic shift and a relatively stable expression level
during sporulation in the SOM output was painted red.
You can now see where they fall in the 3D space spanned
by their first 3 principal components.
This linkage allows us to relate the cluster assignment
of a gene to its location in the data space
(and vice versa).

Acknowledgement: The data used for the above 2 plots are from Spellman et al. (1998) and are available at the Yeast Cell Cycle Analysis Project website at Stanford University.

Graphical query formulation

Plots in traditional data analysis systems
only serve as passive, one-way communication
links from the system to the user.
There are few ways for users to interact with these systems
through plots.
Panmo considers components of plots
as visual representations of underlying entities
(e.g., cells, credit card customers, data sets, etc.),
which opens entirely new possibilities for interaction.
With such an arrangement,
users can look at a plot and,
by interacting with graphical symbols in the plot,
initiate the retrieval of data from the underlying database.
Analysis routines to be applied to the data
can be selected graphically.
The results of the analysis or data retrieval can again be in
the form of plots and ready for further graphical manipulation.
With Panmo,
if you see any pattern of interest in a plot,
you can retrieve the data generating the pattern immediately.
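
As a rough illustration of the idea, a rectangle dragged over a scatterplot can be translated into a filter on the underlying records. The function name `rectangle_query` and the sample records below are assumptions made for this sketch, not Panmo's interface.

```python
def rectangle_query(records, x_key, y_key, x_range, y_range):
    """Return the records whose plotted position falls inside the rectangle."""
    x_lo, x_hi = sorted(x_range)
    y_lo, y_hi = sorted(y_range)
    return [
        rec for rec in records
        if x_lo <= rec[x_key] <= x_hi and y_lo <= rec[y_key] <= y_hi
    ]


# Made-up records standing in for rows of the underlying database.
cells = [
    {"id": 1, "area": 10.0, "intensity": 0.2},
    {"id": 2, "area": 42.0, "intensity": 0.9},
    {"id": 3, "area": 45.0, "intensity": 0.8},
    {"id": 4, "area": 80.0, "intensity": 0.1},
]

# The user sweeps out a rectangle around a visible cluster.
selected = rectangle_query(cells, "area", "intensity", (40.0, 50.0), (0.7, 1.0))
selected_ids = [rec["id"] for rec in selected]
```

The selected records can then be handed to any analysis routine or used to draw a new plot.
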
Panmo's way of graphical query formulation is especially useful
when there are patterns that are apparent in plots
but are tedious or difficult to describe with a textual query language.

The most advanced graphics to cope with overstriking

Scatterplots are the method of choice for displaying the distribution
of points in two dimensions.
They are used to discover patterns
such as holes, outliers, modes, and association
between the two variables.
A common problem is overstriking,
the overlap on the plotting surface of glyphs
representing individual observations.
Overstriking can create a misleading impression
of the data distribution.
As a simple measure to cope with overstriking,
the default plotting glyphs in
Panmo's scatterplots are unfilled circles. If there is only partial overlap and no exact overlap,
using an unfilled circle as the plotting glyph makes
individual points easier to distinguish. Filled circles do not share this property and can be
very misleading, as illustrated by the following plots.

Panel A and Panel B display 40 points each.
Panel A only reveals points not buried by other points and
does not give any sense of density.
A glance at Panel B alerts viewers that many
points lie close together.
Panel C is a composite of Panel A and Panel B for better comparison.
You'll have to make your web browser repeat animation continuously
to properly see Panel C.
As a more sophisticated measure to cope with overstriking
in scatterplots, the novel variable resolution bivariate plots (or varebi plots for short) in Panmo deal with the problem
by mixing the display of a density estimate with the display of
individual observations. The idea is to determine the display format
by analyzing the actual amount of overstriking on the screen.
Thus, the display format will depend on
the sample size, the distribution of the observations,
the size and shape of individual icons, and the size of the window.
It may change automatically when the window is resized.
Varebi plots reveal detail wherever possible,
and show the overall trend when displaying detail is not feasible.
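
The source does not give the varebi algorithm, so the following is only a minimal sketch of the general idea under simple assumptions: the screen is divided into glyph-sized cells, and a cell is drawn as a density summary only when more points land in it than can be shown individually. The function name `choose_display`, the `max_per_cell` threshold, and the sample points are all hypothetical.

```python
from collections import defaultdict


def choose_display(points, window_px, glyph_px, max_per_cell=1):
    """Split points into those drawn individually and cells drawn as density.

    points:    list of (x, y) with both coordinates in [0, 1)
    window_px: (width, height) of the drawing area in pixels
    glyph_px:  edge length of the square area one glyph occupies
    """
    width, height = window_px
    cells = defaultdict(list)
    for x, y in points:
        # Map data coordinates to the glyph-sized screen cell they land in.
        col = int(x * width) // glyph_px
        row = int(y * height) // glyph_px
        cells[(col, row)].append((x, y))

    individual, density = [], {}
    for cell, members in cells.items():
        if len(members) <= max_per_cell:
            individual.extend(members)      # room to show detail
        else:
            density[cell] = len(members)    # too crowded: show a count instead
    return individual, density


pts = [(0.105, 0.105), (0.113, 0.118), (0.127, 0.102), (0.805, 0.805)]
small_ind, small_dens = choose_display(pts, window_px=(100, 100), glyph_px=10)
large_ind, large_dens = choose_display(pts, window_px=(1000, 1000), glyph_px=10)
```

In the small window three of the points share one cell and are summarized, while in the large window every point gets its own cell, which mirrors the way the display format depends on the window size.
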
Here is an example comparing a scatterplot with a varebi plot (Data Credit: Professor P. Dee Boersma):

Here is an example of a varebi plot with different amounts of
drawing space (Data Credit: Professor P. Dee Boersma):

You'll have to make your web browser repeat animation continuously
to properly see these 2 plots.
If there is no serious overstriking,
the appearance of a varebi plot will approach that of a scatterplot
as the drawing area is increased.
Other areas where Panmo takes special care to deal with overstriking
are:
- Boxplots use jittering to alleviate the problem of overstriking.
[Figure: the traditional way to draw a boxplot, followed by Panmo's way to draw a boxplot.]
- X-ray like images of parallel coordinate plots reveal
the internal structure obscured by overstriking.
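
Jittering can be sketched in a few lines. The source does not say how Panmo jitters points, so the uniform horizontal offset, the function name `jitter_positions`, and the sample values here are assumptions chosen for illustration.

```python
import random


def jitter_positions(n_points, center, half_width, seed=0):
    """Return horizontal positions spread uniformly around a box's center."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    return [center + rng.uniform(-half_width, half_width) for _ in range(n_points)]


values = [2.0, 2.0, 2.0, 3.5, 3.5]   # tied values would overstrike without jitter
xs = jitter_positions(len(values), center=1.0, half_width=0.2)
plotted = list(zip(xs, values))      # (x, y) pairs to draw next to the box
```

Only the horizontal position is perturbed, so the vertical axis still shows each observation's true value.
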

Logical zooming

Zooming is one way to get a better look at a portion of the data
in a display window.
There are two types of zooming: geometric zooming and logical zooming.
Geometric zooming produces a blown up version of the region
in which each pixel in the source image is represented
by a small square of the same color.
Logical zooming produces a plot based on the actual observations
in a source region.
As a result, more details can be revealed.
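
The difference between the two kinds of zooming can be sketched with a tiny raster. This is an illustration only. The helper names `rasterize` and `geometric_zoom` and the sample points are hypothetical, and a pixel's "color" is represented by the number of points drawn in it.

```python
def rasterize(points, region, n_cols, n_rows):
    """Count points per pixel for the part of the data inside `region`."""
    x0, x1, y0, y1 = region
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        if x0 <= x < x1 and y0 <= y < y1:
            col = int((x - x0) / (x1 - x0) * n_cols)
            row = int((y - y0) / (y1 - y0) * n_rows)
            grid[row][col] += 1
    return grid


def geometric_zoom(grid, rows, cols, factor):
    """Blow up a block of pixels: each source pixel becomes a factor x factor square."""
    out = []
    for r in rows:
        expanded = [grid[r][c] for c in cols for _ in range(factor)]
        out.extend([list(expanded) for _ in range(factor)])
    return out


points = [(0.1, 0.1), (0.2, 0.3), (0.3, 0.1), (0.4, 0.4), (0.9, 0.9)]
overview = rasterize(points, (0.0, 1.0, 0.0, 1.0), 2, 2)   # 2 x 2 pixel overview

# Zoom into the first pixel, i.e. the region [0, 0.5) x [0, 0.5).
geo = geometric_zoom(overview, rows=range(0, 1), cols=range(0, 1), factor=4)
logical = rasterize(points, (0.0, 0.5, 0.0, 0.5), 4, 4)
```

The geometric result is a uniform 4 x 4 block copied from one pixel, whereas the logical result re-bins the four underlying observations and places each in its own pixel.
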
We'd like to zoom into the area bounded
by the blue rectangle.
This is the result of geometric zooming.
This is the result of logical zooming.
Logical zooming in Panmo is recursive, which means
users can zoom into a plot resulting from a previous logical
zooming again. The power of magnification and the area zoomed into
can be changed on the fly.

On-line context help

Users can press F1 or Shift-F1 to get context-sensitive help messages.
All help messages are displayed in a Netscape browser window.

Object-oriented data representation

Object-oriented data representation
allows easy mapping of conceptual entities in the problem domain
into computational entities in the system domain.

Three layers of user interface

The user interface of Panmo consists of
graphical direct manipulation, menus, and textual commands.
A textual user interface can be concise and accurate.
If a task needs to be carried out repeatedly,
this can be simplified by writing a script.
A menu based user interface is easy to master.
Graphical direct manipulation is good at taking advantage of patterns
that are apparent on the screen but are tedious or difficult
to describe in a textual command language.
Any one of the three alone is not enough
to gracefully handle all users' requests.
There is usually more than one type of user interface
suitable for a task;
Panmo lets users determine for themselves
the most convenient way to accomplish a task.

Smart menu system

Panmo knows what type of menu to use for a tool and only
includes relevant items in the menu.
This greatly speeds up work flow and reduces finger fatigue.

Inspection

Users can click any graphical icon
and get detailed information on the data it represents.

HCS image module

With this module, all plots are fully linked to the original scan images.
You can get the original scan images of the cells from any graphical
symbol in any plot. Data for any cells in a scan image can be instantly
retrieved and passed to any analytic function or used to make any plot.
For example, the following DNA profile plot was based on the data
grabbed out of Panel A; the green cells in the DNA profile plot were
grabbed out again to make Panel B. All these were done
with only a few mouse clicks.
[Figure: Panel A, DNA Profile, Panel B.]
For a more extensive demonstration of this HCS image module,
please take a look here.

HCS impressionist density plot module

Impressionist density plots provide a compact, eye-pleasing way to
compare the data from different treatments in HCS experiments.
For example:

For a more extensive demonstration of this HCS impressionist density
plot module, please take a look here.

3D point cloud rotation

You'll have to make your web browser repeat animation continuously to
properly see this plot.

Minimal spanning tree planing

Minimal spanning tree (MST) planing is also known as multivariate
planing. MST planing is similar in intent to
multidimensional scaling (MDS) onto a 2-dimensional plane,
which tries to find N points in the 2-dimensional plane such
that the inter-point distances in the plane match the inter-point
distances in high-dimensional space. MST planing is much
faster to compute than MDS because it does not require
nonlinear optimization. Planing 40,000 points is a perfectly
reasonable task.
Principal component analysis is just "poor man's MDS."
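
The source does not describe how planing lays the tree out in the plane, so no attempt is made to reproduce that step here. The sketch below only shows the structure the method starts from: a minimal spanning tree over high-dimensional points, built with Prim's algorithm. The function name `minimal_spanning_tree` and the sample points are assumptions, and this simple version takes time quadratic in the number of points.

```python
import math


def minimal_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, dist) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each point
    best_from = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Update the cheapest links using the newly added point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v], best_from[v] = d, u
    return edges


pts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (5, 5, 5)]   # made-up 3-dimensional points
tree = minimal_spanning_tree(pts)
total_length = sum(d for _, _, d in tree)
```

No iterative optimization is involved in building the tree, which is consistent with the speed advantage over MDS described above.
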
The following 2 plots are the result of MST planing of 5431 points
from a 14-dimensional space:
Acknowledgement: The data used for the above 2 plots are from Spellman et al. (1998) and are available at the Yeast Cell Cycle Analysis Project website at Stanford University.

Parallel Coordinate Plot (Profile Plot)

Parallel coordinate plots take a unique approach to drawing
high-dimensional data.
Since plotting more than 3 mutually perpendicular axes is impossible,
parallel coordinate plots draw all the axes parallel to each other
and equally spaced in a 2D plane.
For example, the following 2 parallel coordinate plots are based on
the same data set:
First plot: this parallel coordinate plot
uses a common vertical range that encompasses all the
values of all the variables being visualized.
Second plot: each axis in this parallel coordinate plot uses a different scale so
that the minimum and the maximum values of the variable at
an axis are mapped to the bottom and the top of the available
drawing room.

Overstriking can frequently be a problem for parallel coordinate plots.
Panmo provides several novel ways to cope with it.
The following plot demonstrates one of them:

Notice the internal structure and the x-ray like appearance of
this plot.

Panmo uses parallel coordinate plots as the launch point for several
utilities that help users find similar or dissimilar observations.
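
The two axis-scaling conventions described earlier in this section, one range shared by all variables versus a separate range per axis, can be sketched as follows. The function names `scale_common` and `scale_per_axis` and the sample data are hypothetical, and vertical positions are expressed as fractions of the drawing height.

```python
def scale_common(rows):
    """Scale every value with one range shared by all variables."""
    lo = min(min(row) for row in rows)
    hi = max(max(row) for row in rows)
    span = (hi - lo) or 1.0          # avoid dividing by zero for constant data
    return [[(v - lo) / span for v in row] for row in rows]


def scale_per_axis(rows):
    """Scale each variable (column) so its minimum maps to 0 and its maximum to 1."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        [(v - lo) / span for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


# Made-up observations: one row per observation, one column per variable.
data = [
    [1.0, 100.0],
    [2.0, 300.0],
    [3.0, 200.0],
]
common = scale_common(data)
per_axis = scale_per_axis(data)
```

With the common range, the first variable is squeezed into the bottom of the plot because the second variable dominates the scale. With per-axis scaling, both variables use the full drawing height, at the cost of no longer being comparable across axes.
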
Acknowledgement: The data used for the above 2 plots are from Spellman et al. (1998) and are available at the Yeast Cell Cycle Analysis Project website at Stanford University.

Visualization tool: browse

This is a smart tool.
It knows what plots to draw no matter how many and what types of
variables are passed to it.
It dramatically cuts down the cognitive effort
required during graphical query formulation
because users don't have to worry about what types of plots
to look at.
All they have to do is know what variables to look at.

Tukey sum-difference plot