Voxel arrays can come in different forms, either as NIfTI files, NRRD files, RAW files or directly imported in R as such.
These arrays host a value of 0 for “empty space”, and another numerical value corresponding to a structural ID in any other position.
Once the user has added the relevant information, such as orientation, units of measurement, voxel size, etc (all optional), the package “slices” the array along the 3 anatomical axes. For the time being, we will always assume that the orientation of the array is RAS, i.e. from Left to Right (x axis), from Posterior to Anterior (y axis) and from Inferior to Superior (z axis). If you are importing an array that was created in a different orientation, such as PIR or LPS, you can use an argument in the functions to reorient the array to RAS. RAS being the default orientation means that the anatomical planes will always be in this order: sagittal, coronal and axial.
Once the array has been sliced, the package computes the polygon(s)
for each structure, i.e. for every group of pixels in each slice bearing
the same numerical value. Polygons are computed through a marching
squares algorithm (as implemented in the isoband
package)
which assigns non-consecutive polygons to different groups.
Since some of the polygons can have holes in a specific slice - as is the case, for instance, of the forebrain white matter in the human segmentation - the package functions automatically detect which polygons within the same structure are contained by bigger polygons, and treat them as holes in draw time.
Once the drawing procedure is finished, a segmentation
object is created which can be plotted and accessed in different
ways.
In this package we refer to 2 different types of operations: drawing is the procedure which effectively constructs polygons, whereas plotting consists in creating plots where polygons are selected and coloured according to user-defined rules.
Therefore the creation of a segmentation
object is
achieved by calling seg_draw()
.
To use this function we need to have access to 2 files: the voxel
array and the ontology table. We then have to instruct the function on
whether array rotations are required, and add other types of metadata on
the fly. In this example, we use the a NiFTi
(ABA_Human_half.nii.gz
) file from the Allen Institute for
Brain Sciences which we can find here
together with its ontology, containing a 500μ-spaced voxel array with
141 annotated structures. We add the ontology file
ABA_Human_ontology.csv
as well. We add the MNI152 reference
space to the metadata. Since this NIfTI file has already some relevant
information in its header, the function will add it for us.
<- seg_draw(nifti_file = "ABA_Human_half.nii.gz",
seg ontology_file = "ABA_Human_ontology.csv",
reference_space = "MNI152",
verbose = FALSE)
seg
: ABA_Human_half.nii.gz
Segmentation from filePlane(s): sagittal, coronal, axial
: RAS
DirectionsDimensions (original): 394 x 466 x 378
Dimensions (effective): 148 x 362 x 310
: 0.5 x 0.5 x 0.5
Pixel dimensions: mm
Dimension units: MNI152
Reference spaceStructures (n): 141
Structures (ID): 10368, 12805, 10557, 12369, 10460, 10602 and 135 more.
Structures (acronyms): BL, 4V, FWM, Aq, Pin, 3V and 135 more.
citations("name_of_segmentation") to display the citation(s) for this segmentation. Type
We can access the ontology
slot using the
ontology()
function:
head(ontology(seg))
id atlas_id name acronym st_level ontology_id hemisphere_id weight parent_structure_id depth graph_id graph_order10153 10153 NA neural plate NP NA 11 3 8660 NA 0 16 0
10154 10154 NA neural tube NT NA 11 3 8660 10153 1 16 1
10155 10155 NA brain Br NA 11 3 8660 10154 2 16 2
10156 10156 NA forebrain (prosencephalon) F NA 11 3 8660 10155 3 16 3
10157 10157 NA gray matter of forebrain FGM NA 11 3 8660 10156 4 16 4
10158 10158 NA telencephalon Tel NA 11 3 8660 10157 5 16 657
structure_id_path color_hex_triplet neuro_name_structure_id neuro_name_structure_id_path failed sphinx_id structure_name_facet10153 /10153/ D7D8D8 NA NA f 6600 3041346888
10154 /10153/10154/ D0D0D1 NA NA f 6601 1172862311
10155 /10153/10154/10155/ E0E0E0 NA NA f 6602 3016132225
10156 /10153/10154/10155/10156/ EBD6D0 NA NA f 6603 2015023488
10157 /10153/10154/10155/10156/10157/ EBD6D0 NA NA f 6604 545890574
10158 /10153/10154/10155/10156/10157/10158/ EBD6D0 NA NA f 7257 2480649783
failed_facet safe_name col10153 734881840 neural plate #D7D8D8
10154 734881840 neural tube #D0D0D1
10155 734881840 brain #E0E0E0
10156 734881840 forebrain (prosencephalon) #EBD6D0
10157 734881840 gray matter of forebrain #EBD6D0
10158 734881840 telencephalon #EBD6D0
This ontology was downloaded from the Allen Brain Atlas API, and it
contains several pieces of information that we will not need. However,
we do need the id
, name
, acronym
,
parent_structure_id
, structure_id_path
and
col
fields.
Ontologies may contain additional IDs, which are part of how the
authors of the segmentation have classified structures into higher order
groupings (e.g. “hipothalamus” as a higher order grouping for several
hipothalamic nuclei). This ontology can be visualized as a tree using
ontology_plot()
:
ontology_plot(seg)
Acronyms can be compared to their names looking at the ontology.
The basic functionality with no external datasets allows to plot all
structures within specific slices. We pick 3 random slices with the
s_slice
, c_slice
and a_slice
arguments for sagittal, coronal and axial respectively.
seg_plot(seg, s_slice = 50, c_slice = 60, a_slice = 30)
We can also choose only one axis, and we can show structural labels (as acronyms from the ontology table):
seg_plot(seg, s_slice = 50, show_labels = TRUE)
Additionally, we can subset a segmentation
object by
structure(s). Here is an example in which we isolate the Forebrain White
Matter (FWM):
seg_sub <- seg_sub_str(segmentation = seg, planes_chosen = "sagittal", structures = "FWM")
seg_plot(seg_sub, s_slice = 50, show_labels = TRUE)
segmentationAssay
The segmentationAssay
S4 class was created to host an
external dataset, adding numerical tables and other information to a
segmentation. For instance, in the case of GTEx brain data, a
segmentationAssay
can host the median TPM count table (in
the values
slot), additional information on the samples (in
the sampledata
slot) and, most importantly, the mapping
from samples to structures (mapping
slot).
Depending on the segmentation, the assay may require a one-to-many mapping of samples to structures. For instance, the GTEx brain dataset contains “caudate basal ganglia”, which has to be mapped to 3 different structural IDs in the Allen Human Brain Atlas segmentation.
This mapping has to be manually curated by the user, as there is no way to detect automatically this sort of correspondences.
Here we will show how to create this mapping for the GTEx bulk RNA-seq data, using median gene-level TPM values.
These values can be found here, in particular the GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_median_tpm.gct.gz file, which we will need to unzip.
# read the table
gtex <- read.table("gtex_tpm.gct", header = TRUE, sep = "\t", skip = 2)
# subsetting and wrangling names
gtex <- gtex[,c(1, 2, grep("Brain", colnames(gtex)))]
colnames(gtex) <- gsub(colnames(gtex), pattern = "Brain...", replacement = "")
colnames(gtex) <- gsub(colnames(gtex), pattern = "[.]", replacement = "_")
colnames(gtex) <- gsub(colnames(gtex), pattern = "__", replacement = "_")
colnames(gtex) <- gsub(colnames(gtex), pattern = "_$", replacement = "")
#removing duplicates (mostly 0 TPM genes)
gtex <- gtex[!duplicated(gtex$Description),]
rownames(gtex) <- gtex$Description
Once the table is subsetted and readable, we map samples to structural IDs creating a list:
# create mapping
brain_regions <- list()
brain_regions[["Amygdala"]] <- ontology(seg)[grep(pattern ="amyg",
x = ontology(seg)$name), "acronym"]
brain_regions[["Cortex"]] <- ontology(seg)[grep(pattern = "frontal",
x = ontology(seg)$name), "acronym"]
brain_regions[["Frontal_Cortex_BA9"]] <- ontology(seg)[grep(pattern = "cingulate gyrus, rostral",
x = ontology(seg)$name), "acronym"]
brain_regions[["Caudate_basal_ganglia"]] <- ontology(seg)[grep(pattern = "caudate",
x = ontology(seg)$name), "acronym"]
brain_regions[["Putamen_basal_ganglia"]] <- ontology(seg)[grep(pattern = "putamen",
x = ontology(seg)$name), "acronym"]
hth_region = ontology(seg)[which(ontology(seg)$acronym == "HTH"), "id"]
brain_regions[["Hypothalamus"]] <- ontology(seg)[which(ontology(seg)$parent_structure_id == hth_region), "acronym"]
brain_regions[["Substantia_nigra"]] <- ontology(seg)[grep(pattern = "SN", x = ontology(seg)$acronym), "acronym"]
brain_regions[["Hippocampus"]] <- ontology(seg)[grep(pattern = "hipp",
x = ontology(seg)$name), "acronym"]
brain_regions[["Nucleus_accumbens_basal_ganglia"]] <- ontology(seg)[grep(pattern = "accumbens",
x = ontology(seg)$name), "acronym"]
brain_regions[["Cerebellum"]] <- ontology(seg)[grep(pattern = "cerebell",
x = ontology(seg)$name), "acronym"]
brain_regions[["Hippocampus"]] <- brain_regions[["Hippocampus"]][2:length(brain_regions[["Hippocampus"]])]
Then we make a very simple sampledata
table since we
don’t have many variables:
coldata_gtex = data.frame(sample_id = colnames(gtex)[3:ncol(gtex)], structure_acronym = colnames(gtex)[3:ncol(gtex)])
rownames(coldata_gtex) = coldata_gtex$sample_id
We now have everything we need to create a
segmentationAssay
object and add it to our
segmentation:
gtexAssay <- new("segmentationAssay",
values = as.matrix(gtex[,3:ncol(gtex)]),
mapping = brain_regions,
sampledata = coldata_gtex)
seg <- seg_assay_add(segmentation = seg,
assay = gtexAssay,
name = "gtex")
When plotting data such as expression values in the segmentation, not all structures will be visible within the same slice, making the choice of a specific slice hard. For this reason we can create a maximum projection of structure slices on every plane, in both directions: slice-level polygons for every structure are joined together into single polygons and displayed in the order that they appear from both points of view, e.g. in the sagittal plane, from left to right (LR) and from right to left (RL).
Structures can be subset for maximum projections, since using all
structures may just result in producing a side view of the segmentation
(which may be a desired behaviour in some cases). Since maximum
projections are specifically important for plotting datasets, we create
a projection that has the same name as the dataset using the
seg_projection_add()
function, only for the sagittal
plane:
structures_chosen <- unlist(assays(seg)$gtex@mapping)
structures_chosen <- intersect(structures_chosen, ontology(seg)$acronym)
seg <- seg_projection_add(name = "gtex",
segmentation = seg,
structures = structures_chosen,
planes_chosen = "sagittal")
At this point we can plot data from one of our assays (GTEx) in the
sagittal projection, using seg_feature_plot()
, specifying
the assay
name and the gene (feature
argument):
seg_feature_plot(segmentation = seg,
assay = "gtex",
feature = "APP")
coldcuts
can reuse polygons to rebuild the original
volume array and, using a marching cubes algorithm together with quadric
edge decimation (both as implemented in Rvcg
) it can
quickly render simplified triangular meshes that can be visualized in an
interactive rgl
session.
In order to create and visualize 3D meshes for all our the structures in our segmentation, we only need 2 lines of code:
seg <- seg_meshlist_add(seg, verbose = FALSE)
seg_meshlist_render(seg)
We can visualize only a subset of structures, e.g. all the structures of the frontal lobe (FroL in the ontology):
seg_meshlist_render(seg, subset_str = "FroL")