Introduction

Voxel arrays can come in different forms, either as NIfTI files, NRRD files, RAW files or directly imported in R as such.

These arrays host a value of 0 for “empty space”, and another numerical value corresponding to a structural ID in any other position.

Once the user has added the relevant information, such as orientation, units of measurement, voxel size, etc (all optional), the package “slices” the array along the 3 anatomical axes. For the time being, we will always assume that the orientation of the array is RAS, i.e. from Left to Right (x axis), from Posterior to Anterior (y axis) and from Inferior to Superior (z axis). If you are importing an array that was created in a different orientation, such as PIR or LPS, you can use an argument in the functions to reorient the array to RAS. RAS being the default orientation means that the anatomical planes will always be in this order: sagittal, coronal and axial.

Once the array has been sliced, the package computes the polygon(s) for each structure, i.e. for every group of pixels in each slice bearing the same numerical value. Polygons are computed through a marching squares algorithm (as implemented in the isoband package) which assigns non-consecutive polygons to different groups.

Since some of the polygons can have holes in a specific slice - as is the case, for instance, of the forebrain white matter in the human segmentation - the package functions automatically detect which polygons within the same structure are contained by bigger polygons, and treat them as holes in draw time.

Once the drawing procedure is finished, a segmentation object is created which can be plotted and accessed in different ways.

Drawing a segmentation

In this package we refer to 2 different types of operations: drawing is the procedure which effectively constructs polygons, whereas plotting consists in creating plots where polygons are selected and coloured according to user-defined rules.

Therefore the creation of a segmentation object is achieved by calling seg_draw().

To use this function we need to have access to 2 files: the voxel array and the ontology table. We then have to instruct the function on whether array rotations are required, and add other types of metadata on the fly. In this example, we use the a NiFTi (ABA_Human_half.nii.gz) file from the Allen Institute for Brain Sciences which we can find here together with its ontology, containing a 500μ-spaced voxel array with 141 annotated structures. We add the ontology file ABA_Human_ontology.csv as well. We add the MNI152 reference space to the metadata. Since this NIfTI file has already some relevant information in its header, the function will add it for us.

seg <- seg_draw(nifti_file = "ABA_Human_half.nii.gz", 
                        ontology_file = "ABA_Human_ontology.csv",
                        reference_space = "MNI152",
                        verbose = FALSE)
                        
seg

Segmentation from file:  ABA_Human_half.nii.gz 
                    Plane(s):  sagittal, coronal, axial 
                  Directions:  RAS 
       Dimensions (original):  394 x 466 x 378 
      Dimensions (effective):  148 x 362 x 310 
            Pixel dimensions:  0.5 x 0.5 x 0.5 
             Dimension units:  mm 
             Reference space:  MNI152 
              Structures (n):  141 
             Structures (ID):  10368, 12805, 10557, 12369, 10460, 10602 and 135 more. 
       Structures (acronyms):  BL, 4V, FWM, Aq, Pin, 3V and 135 more. 
Type citations("name_of_segmentation") to display the citation(s) for this segmentation.

Inspecting a segmentation ontology

We can access the ontology slot using the ontology() function:

head(ontology(seg))

         id atlas_id                       name acronym st_level ontology_id hemisphere_id weight parent_structure_id depth graph_id graph_order
10153 10153       NA               neural plate      NP       NA          11             3   8660                  NA     0       16           0
10154 10154       NA                neural tube      NT       NA          11             3   8660               10153     1       16           1
10155 10155       NA                      brain      Br       NA          11             3   8660               10154     2       16           2
10156 10156       NA forebrain (prosencephalon)       F       NA          11             3   8660               10155     3       16           3
10157 10157       NA   gray matter of forebrain     FGM       NA          11             3   8660               10156     4       16           4
10158 10158       NA              telencephalon     Tel       NA          11             3   8660               10157     5       16         657
                          structure_id_path color_hex_triplet neuro_name_structure_id neuro_name_structure_id_path failed sphinx_id structure_name_facet
10153                               /10153/            D7D8D8                      NA                           NA      f      6600           3041346888
10154                         /10153/10154/            D0D0D1                      NA                           NA      f      6601           1172862311
10155                   /10153/10154/10155/            E0E0E0                      NA                           NA      f      6602           3016132225
10156             /10153/10154/10155/10156/            EBD6D0                      NA                           NA      f      6603           2015023488
10157       /10153/10154/10155/10156/10157/            EBD6D0                      NA                           NA      f      6604            545890574
10158 /10153/10154/10155/10156/10157/10158/            EBD6D0                      NA                           NA      f      7257           2480649783
      failed_facet                  safe_name     col
10153    734881840               neural plate #D7D8D8
10154    734881840                neural tube #D0D0D1
10155    734881840                      brain #E0E0E0
10156    734881840 forebrain (prosencephalon) #EBD6D0
10157    734881840   gray matter of forebrain #EBD6D0
10158    734881840              telencephalon #EBD6D0

This ontology was downloaded from the Allen Brain Atlas API, and it contains several pieces of information that we will not need. However, we do need the id, name, acronym, parent_structure_id, structure_id_path and col fields.

Ontologies may contain additional IDs, which are part of how the authors of the segmentation have classified structures into higher order groupings (e.g. “hipothalamus” as a higher order grouping for several hipothalamic nuclei). This ontology can be visualized as a tree using ontology_plot():

ABA_Human_ontology_small

Acronyms can be compared to their names looking at the ontology.

Plotting a segmentation

The basic functionality with no external datasets allows to plot all structures within specific slices. We pick 3 random slices with the s_slice, c_slice and a_slice arguments for sagittal, coronal and axial respectively.

seg_plot(seg, s_slice = 50, c_slice = 60, a_slice = 30)

ABA_Human_half_allplanes_structures

We can also choose only one axis, and we can show structural labels (as acronyms from the ontology table):

seg_plot(seg, s_slice = 50, show_labels = TRUE)

ABA_Human_half_sagittal_structures

Additionally, we can subset a segmentation object by structure(s). Here is an example in which we isolate the Forebrain White Matter (FWM):

seg_sub <- seg_sub_str(segmentation = seg, planes_chosen = "sagittal", structures = "FWM")

seg_plot(seg_sub, s_slice = 50, show_labels = TRUE)

ABA_Human_half_FWM_structuresubset

Creating a segmentationAssay

The segmentationAssay S4 class was created to host an external dataset, adding numerical tables and other information to a segmentation. For instance, in the case of GTEx brain data, a segmentationAssay can host the median TPM count table (in the values slot), additional information on the samples (in the sampledata slot) and, most importantly, the mapping from samples to structures (mapping slot).

Depending on the segmentation, the assay may require a one-to-many mapping of samples to structures. For instance, the GTEx brain dataset contains “caudate basal ganglia”, which has to be mapped to 3 different structural IDs in the Allen Human Brain Atlas segmentation.

This mapping has to be manually curated by the user, as there is no way to detect automatically this sort of correspondences.

Here we will show how to create this mapping for the GTEx bulk RNA-seq data, using median gene-level TPM values.

These values can be found here, in particular the GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_median_tpm.gct.gz file, which we will need to unzip.

# read the table
gtex <- read.table("gtex_tpm.gct", header = TRUE, sep = "\t", skip = 2)

# subsetting and wrangling names
gtex <- gtex[,c(1, 2, grep("Brain", colnames(gtex)))]
colnames(gtex) <- gsub(colnames(gtex), pattern = "Brain...", replacement = "")
colnames(gtex) <- gsub(colnames(gtex), pattern = "[.]", replacement = "_")
colnames(gtex) <- gsub(colnames(gtex), pattern = "__", replacement = "_")
colnames(gtex) <- gsub(colnames(gtex), pattern = "_$", replacement = "")

#removing duplicates (mostly 0 TPM genes)
gtex <- gtex[!duplicated(gtex$Description),]
rownames(gtex) <- gtex$Description

Once the table is subsetted and readable, we map samples to structural IDs creating a list:

# create mapping
brain_regions <- list()

brain_regions[["Amygdala"]] <- ontology(seg)[grep(pattern  ="amyg",
                                                  x = ontology(seg)$name), "acronym"]

brain_regions[["Cortex"]] <- ontology(seg)[grep(pattern  = "frontal",
                                                x = ontology(seg)$name), "acronym"]

brain_regions[["Frontal_Cortex_BA9"]] <- ontology(seg)[grep(pattern  = "cingulate gyrus, rostral",
                                                            x = ontology(seg)$name), "acronym"]

brain_regions[["Caudate_basal_ganglia"]] <- ontology(seg)[grep(pattern  = "caudate",
                                                               x = ontology(seg)$name), "acronym"]

brain_regions[["Putamen_basal_ganglia"]] <- ontology(seg)[grep(pattern  = "putamen",
                                                               x = ontology(seg)$name), "acronym"]

hth_region = ontology(seg)[which(ontology(seg)$acronym == "HTH"), "id"]
  
brain_regions[["Hypothalamus"]] <- ontology(seg)[which(ontology(seg)$parent_structure_id == hth_region), "acronym"]

brain_regions[["Substantia_nigra"]] <- ontology(seg)[grep(pattern  = "SN", x = ontology(seg)$acronym), "acronym"]

brain_regions[["Hippocampus"]] <- ontology(seg)[grep(pattern  = "hipp",
                                                     x = ontology(seg)$name), "acronym"]

brain_regions[["Nucleus_accumbens_basal_ganglia"]] <- ontology(seg)[grep(pattern  = "accumbens",
                                                                         x = ontology(seg)$name), "acronym"]

brain_regions[["Cerebellum"]] <- ontology(seg)[grep(pattern  = "cerebell",
                                                    x = ontology(seg)$name), "acronym"]

brain_regions[["Hippocampus"]] <- brain_regions[["Hippocampus"]][2:length(brain_regions[["Hippocampus"]])]

Then we make a very simple sampledata table since we don’t have many variables:


coldata_gtex = data.frame(sample_id = colnames(gtex)[3:ncol(gtex)], structure_acronym = colnames(gtex)[3:ncol(gtex)])
rownames(coldata_gtex) = coldata_gtex$sample_id

We now have everything we need to create a segmentationAssay object and add it to our segmentation:

gtexAssay <- new("segmentationAssay",
                 values = as.matrix(gtex[,3:ncol(gtex)]),
                 mapping = brain_regions,
                 sampledata = coldata_gtex)

seg <- seg_assay_add(segmentation = seg, 
                        assay = gtexAssay, 
                        name = "gtex")

Creating a maximum projection and plotting external data

When plotting data such as expression values in the segmentation, not all structures will be visible within the same slice, making the choice of a specific slice hard. For this reason we can create a maximum projection of structure slices on every plane, in both directions: slice-level polygons for every structure are joined together into single polygons and displayed in the order that they appear from both points of view, e.g. in the sagittal plane, from left to right (LR) and from right to left (RL).

Structures can be subset for maximum projections, since using all structures may just result in producing a side view of the segmentation (which may be a desired behaviour in some cases). Since maximum projections are specifically important for plotting datasets, we create a projection that has the same name as the dataset using the seg_projection_add() function, only for the sagittal plane:


structures_chosen <- unlist(assays(seg)$gtex@mapping)
structures_chosen <- intersect(structures_chosen, ontology(seg)$acronym)

  
seg <- seg_projection_add(name = "gtex", 
                        segmentation = seg, 
                        structures = structures_chosen, 
                        planes_chosen = "sagittal")

At this point we can plot data from one of our assays (GTEx) in the sagittal projection, using seg_feature_plot(), specifying the assay name and the gene (feature argument):

seg_feature_plot(segmentation = seg,
             assay = "gtex",
             feature = "APP")

ABA_Human_half_APP_featureplot

Building and rendering 3D meshes

coldcuts can reuse polygons to rebuild the original volume array and, using a marching cubes algorithm together with quadric edge decimation (both as implemented in Rvcg) it can quickly render simplified triangular meshes that can be visualized in an interactive rgl session.

In order to create and visualize 3D meshes for all our the structures in our segmentation, we only need 2 lines of code:


seg <- seg_meshlist_add(seg, verbose = FALSE)

seg_meshlist_render(seg)

ABA_Human_half_render

We can visualize only a subset of structures, e.g. all the structures of the frontal lobe (FroL in the ontology):

seg_meshlist_render(seg, subset_str = "FroL")

ABA_Human_half_FroL_render