Philosophy

This page is especially drafty, and its text may ultimately go elsewhere, or be revised.

LIVE's Power: Linking, Re-use, and Interactive Elements

Linkability

The glue approach makes it easy to specify arbitrary relationships between high-dimensional datasets without relying on metadata (which is often missing in real-world datasets). Rather than doing extensive preprocessing to “merge” datasets into a single file or format, researchers can bring their individual datasets into glue and specify the relationships that exist between them. As a simple example, if two columns in a tabular dataset specify positions in an image, the researcher can tell glue that those columns correspond to pixel positions in the image; it doesn’t matter what the columns are called, and no metadata is required in either the image or the tabular data. A more complicated example is connecting two gene sets through an orthology table; glue can handle arbitrary dictionary lookups that map IDs in one dataset to IDs in another. Linking datasets in this way allows multiple datasets to be overplotted (e.g. a table of information on an image) and allows subsets defined over one dataset to translate automatically to another (e.g. a gene subset in one species translating to the orthologous subset in another species). This ability to specify conceptual links is critical in the modern era, where genomics datasets are large enough to be unwieldy to move around and where merged datasets that duplicate data, and so multiply the disk space required, are undesirable.
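The orthology example can be sketched in plain Python. This only illustrates the idea of carrying a subset across datasets through a lookup table; it is not glue's linking API, and the gene IDs, the table contents, and the `translate_subset` helper are all hypothetical.

```python
# Hypothetical orthology table: gene ID in species A -> ortholog ID in species B.
ORTHOLOGS = {
    "geneA1": "geneB7",
    "geneA2": "geneB3",
    "geneA3": "geneB9",
}


def translate_subset(subset_ids, lookup):
    """Map a subset of IDs from one dataset to another via a lookup table.

    IDs with no entry in the table are skipped, so the two datasets never
    need to be merged or to share any metadata.
    """
    return {lookup[i] for i in subset_ids if i in lookup}


# A selection made on the species-A dataset...
subset_a = {"geneA1", "geneA3", "geneA4"}  # geneA4 has no known ortholog
# ...is carried over to the species-B dataset.
subset_b = translate_subset(subset_a, ORTHOLOGS)
```

The two datasets stay in their own files; only the small lookup table connects them, which is why no merged copy is needed.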

Reusability

The core glue viewers in Jupyter are ipywidgets: reactive viewers that can be displayed within a Jupyter notebook or on standalone web pages. Users can link datasets, create viewers, and explore their data with interactive selections and subsets through a full GUI built into glue jupyter lab. Python-savvy researchers can also have these interactive viewers live alongside their analysis code and scripts, and can script all aspects of the GUI for repetitive tasks (such as data cleaning or quality control). Researchers can use the same tools to author a standalone webpage; arranging figures using the GUI or the Python API arranges them on the webpage, and the resulting site is backed by the full power of glue. In this way, the same tools that are used for the messy, complicated process of discovery can be repurposed to share results with colleagues and the public.

Interactivity

Viewers in glue are inherently interactive; the power of glue lies in being able to make a selection in one viewer and see it automatically in every other viewer where that selection makes sense. Viewers in glue can also talk to all other ipywidgets, allowing users to integrate viewers with simple widgets (such as sliders) as well as arbitrarily complex widgets. A second aspect of interactivity is an interface for choosing which views of linked data should zoom and pan together, and whether zooming in one viewer should create zoomed-in detail views or retain miniature overviews of the full dataset for context. The semantic zoom capabilities of tools such as Gosling, together with the multi-resolution, tiled formats served by HiGlass and made available through next-generation image formats (such as OME-Zarr), mean that well-written viewers can visualize the very large datasets of modern genomics with high performance by requesting only the information needed to display a portion of the data at the necessary resolution.
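The idea of requesting only what the current view needs can be sketched as follows. This is a minimal illustration under stated assumptions, not the tiling scheme of HiGlass or OME-Zarr: it assumes a hypothetical one-dimensional pyramid in which level 0 is a single tile covering the whole dataset, each deeper level doubles the number of tiles, and every tile holds `tile_size` bins. The function name and parameters are made up for this example.

```python
import math


def tiles_for_viewport(start, end, width_px, data_length, tile_size=256):
    """Return (zoom_level, tile_indices) needed to draw the range [start, end).

    Assumes a hypothetical 1D tile pyramid: level z has 2**z tiles, each
    covering data_length / 2**z data units with tile_size bins.
    """
    span = end - start
    if span <= 0 or width_px <= 0:
        raise ValueError("viewport must have positive extent and width")

    # Pick the coarsest level whose bins are no wider than one screen pixel:
    # data_length / (tile_size * 2**z) <= span / width_px
    ratio = data_length * width_px / (tile_size * span)
    zoom = max(0, math.ceil(math.log2(ratio)))

    # Only the tiles that overlap the viewport are requested.
    tile_width = data_length / 2**zoom
    first = int(start // tile_width)
    last = math.ceil(end / tile_width) - 1
    return zoom, list(range(first, last + 1))


# Whole dataset on a 512-pixel-wide viewer: a coarse level, two tiles.
overview = tiles_for_viewport(0, 2**20, 512, 2**20)
# Zoomed in on 1/16 of the data: a finer level, but still only two tiles.
detail = tiles_for_viewport(2**18, 2**18 + 2**16, 512, 2**20)
```

In both cases the viewer fetches the same small number of tiles, so the cost of drawing depends on the size of the screen rather than the size of the dataset.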