In this document, we show how to read brain imaging data from FreeSurfer binary files. These files are created and used by the FreeSurfer neuroimaging software suite to store volume data and surface morphometry data computed from MRI brain images.
Brain imaging data come in different formats. Typically, the data are acquired on a scanner that outputs a set of two-dimensional (2D) images in DICOM format. The 2D images are often combined into a single file that holds a 3D or 4D stack of images for further processing. Common formats include ANALYZE, NIfTI, and the MGH format used by FreeSurfer.
The freesurferformats R package implements functions to parse data in MGH format, as well as some related FreeSurfer formats. MGH stands for Massachusetts General Hospital; the format is binary. The MGZ format is a gzip-compressed version of the MGH format.
Note: To learn how to write neuroimaging data with this package, read the vignette "Writing FreeSurfer neuroimaging data with freesurferformats", which comes with this package.
Here is a first example for reading MGH data.
library("freesurferformats")
mgh_file = system.file("extdata", "brain.mgz", package = "freesurferformats", mustWork = TRUE)
brain = read.fs.mgh(mgh_file)
cat(sprintf("Read voxel data with dimensions %s. Values: min=%d, mean=%f, max=%d.\n", paste(dim(brain), collapse = 'x'), min(brain), mean(brain), max(brain)));
## Read voxel data with dimensions 256x256x256x1. Values: min=0, mean=7.214277, max=156.
Now, brain is an n-dimensional array, where n depends on the data in the MGZ file. A conformed FreeSurfer volume like brain.mgz typically has 4 dimensions and 256 x 256 x 256 x 1 = 16777216 voxels. The final dimension, which is 1 here, means the volume has only a single time point or frame. In this case the file was compressed (MGZ format), but the function handles both MGH and MGZ files.
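To make the dimension bookkeeping concrete, here is a minimal sketch on a small synthetic array rather than a real volume. The array `vol_demo` and its 4 x 4 x 4 x 1 shape are made up for illustration; only base R is used. It shows how to count voxels and how to pull out a single frame as a 3D array.

```r
# Synthetic stand-in for a volume as returned by read.fs.mgh():
# 4 x 4 x 4 voxels and a single frame (assumption, for illustration only).
vol_demo <- array(seq_len(4 * 4 * 4), dim = c(4, 4, 4, 1))

num_voxels <- prod(dim(vol_demo))   # total number of values in the array
num_frames <- dim(vol_demo)[4]      # size of the last (frame/time) dimension

# Extract the first frame; the singleton 4th dimension is dropped.
frame1 <- vol_demo[, , , 1, drop = TRUE]

# The same arithmetic for a conformed 256 x 256 x 256 x 1 volume.
conformed_voxels <- prod(c(256, 256, 256, 1))

cat(sprintf("Demo volume: %d voxels, %d frame(s), frame dims %s\n",
            num_voxels, num_frames, paste(dim(frame1), collapse = "x")))
```

For a multi-frame file, the same indexing with a different value in the fourth position would select another frame.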
If you need the header data, read on.
To access not only the volume data, but also the header, call read.fs.mgh like this:
brain_with_hdr = read.fs.mgh(mgh_file, with_header = TRUE);
brain = brain_with_hdr$data; # as seen before, this is what we got in the last example (the data).
header = brain_with_hdr$header; # the header
Now you have access to the following header data:
header$dtype # int, one of: 0=MRI_UCHAR; 1=MRI_INT; 3=MRI_FLOAT; 4=MRI_SHORT
header$ras_good_flag # int, 0 or 1. Whether the file contains a valid vox2ras matrix and ras_xform (see header$vox2ras_matrix below)
header$has_mr_params # int, 0 or 1. Whether the file contains mr_params (see header$mr_params below)
header$voldim # integer vector of length 4. The volume (=data) dimensions. E.g., c(256, 256, 256, 1) for 3D data.
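The numeric dtype codes listed above are easier to work with when translated to their names. The following is a small sketch of such a lookup; the helper `dtype_name` is hypothetical and not part of freesurferformats, and it only knows the four codes mentioned in the text.

```r
# Codes and names exactly as listed in the header description above.
mri_dtype_names <- c("0" = "MRI_UCHAR", "1" = "MRI_INT",
                     "3" = "MRI_FLOAT", "4" = "MRI_SHORT")

# Hypothetical helper (not a package function): translate a dtype code to its name.
dtype_name <- function(dtype_code) {
  name <- mri_dtype_names[as.character(dtype_code)]
  if (is.na(name)) {
    stop(sprintf("Unknown or unsupported dtype code: %s", dtype_code))
  }
  unname(name)
}

cat(sprintf("dtype 0 is %s, dtype 3 is %s\n", dtype_name(0), dtype_name(3)))
```

With a header loaded as shown earlier, one could call `dtype_name(header$dtype)`.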
If your MGH/MGZ file contains valid information on the vox2ras matrix and/or acquisition parameters (mr_params), you can access them. For the MR parameters, do it like this:
if(header$has_mr_params) {
    mr_params = header$mr_params;
    cat(sprintf("MR acquisition parameters: TR [ms]=%f, flip angle [radians]=%f, TE [ms]=%f, TI [ms]=%f\n", mr_params[1], mr_params[2], mr_params[3], mr_params[4]));
}
## MR acquisition parameters: TR [ms]=2300.000000, flip angle [radians]=0.157080, TE [ms]=2.010000, TI [ms]=900.000000
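Since the flip angle is stored in radians, it may be handy to convert it to degrees. The sketch below hard-codes the values from the output above instead of reading a file; the element names given to the vector are illustrative and not something the package provides.

```r
# Values copied from the example output above, in the order TR, flip angle, TE, TI.
mr_params_demo <- c(2300.0, 0.157080, 2.01, 900.0)
names(mr_params_demo) <- c("tr_ms", "flip_angle_rad", "te_ms", "ti_ms")  # illustrative names

# Convert radians to degrees.
flip_angle_deg <- unname(mr_params_demo["flip_angle_rad"]) * 180 / pi

cat(sprintf("Flip angle: %.1f degrees\n", flip_angle_deg))
```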
And like this for the vox2ras_matrix:
##      [,1] [,2] [,3]      [,4]
## [1,]   -1    0    0 127.50005
## [2,]    0    0    1 -98.62726
## [3,]    0   -1    0  79.09527
## [4,]    0    0    0   1.00000
And finally the ras_xform:
##      [,1] [,2] [,3]        [,4]
## [1,]   -1    0    0  -0.4999542
## [2,]    0    0    1  29.3727417
## [3,]    0   -1    0 -48.9047318
## [4,]    0    0    0   1.0000000
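To illustrate what the vox2ras matrix is for, here is a sketch that maps a voxel index to RAS coordinates using homogeneous coordinates. The matrix values are copied from the printed vox2ras_matrix above. The helper `voxel_to_ras` is hypothetical, and the sketch assumes that the matrix expects zero-based voxel indices, as is the FreeSurfer convention.

```r
# vox2ras matrix, values copied from the output above.
vox2ras <- matrix(c(-1,  0, 0, 127.50005,
                     0,  0, 1, -98.62726,
                     0, -1, 0,  79.09527,
                     0,  0, 0,   1.00000), nrow = 4, byrow = TRUE)

# Hypothetical helper: map zero-based voxel indices (i, j, k) to RAS coordinates.
voxel_to_ras <- function(vox2ras_matrix, ijk) {
  homogeneous <- c(ijk, 1)                        # append 1 for the translation part
  as.vector(vox2ras_matrix %*% homogeneous)[1:3]  # drop the homogeneous component
}

# The central voxel of a 256 x 256 x 256 volume (zero-based index 128 on each axis).
center_ras <- voxel_to_ras(vox2ras, c(128, 128, 128))
print(center_ras)
```

For this file, the result agrees with the last column of the printed ras_xform to within rounding, which suggests that this column holds the RAS coordinates of the volume center.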
The MGH/MGZ format is also used to store morphometry data mapped to standard space (fsaverage). In the following example, we read cortical thickness data in standard space, smoothed with a kernel with a full width at half maximum (FWHM) of 25 mm:
mgh_file = file.path("mystudy", "subject1", "surf", "lh.thickness.fwhm25.fsaverage.mgh") # path to your own data; system.file() only finds files inside installed packages
cortical_thickness_standard = read.fs.mgh(mgh_file)
Now, cortical_thickness_standard is a vector of n float values, where n is the number of vertices of the fsaverage subject’s left hemisphere surface (i.e., 163842 in FreeSurfer 6).
If all you need is to perform statistical analysis of the data in the MGH file, you are ready to do that after loading. If you need access to more image operations, consider converting the data to a NIfTI object. For example, if you have oro.nifti installed, you could convert the brain data we loaded earlier and visualize it with that package.
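A minimal sketch of such a visualization follows. It uses a small random array in place of the real `brain` data so that it runs without any input file, and it only attempts the conversion and the plot in an interactive session where oro.nifti is installed. The names `brain_demo` and `vol3d` are made up for this example.

```r
# Synthetic stand-in for the `brain` array (the real one is 256 x 256 x 256 x 1).
set.seed(1)
brain_demo <- array(runif(16 * 16 * 16), dim = c(16, 16, 16, 1))

# Drop the singleton frame dimension to get a plain 3D volume.
vol3d <- brain_demo[, , , 1]

# Only convert and plot interactively, and only if oro.nifti is available.
if (interactive() && requireNamespace("oro.nifti", quietly = TRUE)) {
  nii <- oro.nifti::nifti(vol3d)   # wrap the array in a NIfTI object
  oro.nifti::orthographic(nii)     # show orthogonal slices through the volume
}
```

With real data, `brain` would take the place of `brain_demo`.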
Let’s read an example morphometry data file that comes with this package. It contains vertex-wise measures of cortical thickness for the left hemisphere of a single subject in native space.
library("freesurferformats")
curvfile = system.file("extdata", "lh.thickness", package = "freesurferformats", mustWork = TRUE)
ct = read.fs.curv(curvfile)
Now, ct is a vector of n float values, where n is the number of vertices of the surface mesh the data belongs to (usually surf/lh.white). The number of vertices differs between subjects, as this is native space data.
We can now have a closer look at the data and maybe plot a histogram of cortical thickness for this subject:
## Read data for 149244 vertices. Values: min=0.000000, mean=2.437466, max=5.000000.
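A summary line like the one above, and the histogram mentioned in the text, could be produced along the following lines. This is a sketch on a tiny synthetic vector (`ct_demo` is made up), not the original code of this document; with real data, `ct` would take its place.

```r
# Synthetic stand-in for per-vertex cortical thickness values in mm.
ct_demo <- c(0, 1.5, 2.5, 3.5, 5)

# Build the summary line in the same format as the output shown above.
summary_line <- sprintf("Read data for %d vertices. Values: min=%f, mean=%f, max=%f.",
                        length(ct_demo), min(ct_demo), mean(ct_demo), max(ct_demo))
cat(summary_line, "\n")

# Compute the histogram without plotting, then draw it only in interactive sessions.
thickness_hist <- hist(ct_demo, breaks = seq(0, 5, by = 1), plot = FALSE)
if (interactive()) {
  plot(thickness_hist, main = "lh cortical thickness",
       xlab = "Cortical thickness [mm]", ylab = "Vertex count")
}
```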
The package provides a wrapper function to read morphometry data, no matter the format. It always returns data as a vector and automatically determines the format from the file name. Here we use the function to read the file from the last example:
morphfile1 = system.file("extdata", "lh.thickness", package = "freesurferformats", mustWork = TRUE)
thickness_native = read.fs.morph(morphfile1)
And here is an example for an MGZ file:
morphfile2 = system.file("extdata", "lh.curv.fwhm10.fsaverage.mgz", package = "freesurferformats", mustWork = TRUE)
curv_standard = read.fs.morph(morphfile2)
curv_standard[curv_standard < -1] = 0; # set extreme outliers to 0
curv_standard[curv_standard > 1] = 0;
hist(curv_standard, main="lh std curvature", xlab="Mean Curvature [mm^-1], fwhm10", ylab="Vertex count")
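The idea of choosing a reader based on the file name can be sketched as follows. This is an illustration only: `guess_morph_format` is a hypothetical helper, and the rule it applies (MGH or MGZ by file extension, curv format otherwise) is an assumption that may differ from what read.fs.morph actually does internally.

```r
# Hypothetical helper: guess the morphometry file format from the file name.
guess_morph_format <- function(filepath) {
  fname <- tolower(basename(filepath))
  if (grepl("\\.mgh$", fname)) {
    "mgh"
  } else if (grepl("\\.mgz$", fname)) {
    "mgz"
  } else {
    "curv"   # assumption: anything else is treated as curv format
  }
}

for (f in c("lh.thickness", "lh.curv.fwhm10.fsaverage.mgz", "lh.thickness.fwhm25.fsaverage.mgh")) {
  cat(sprintf("%s -> %s\n", f, guess_morph_format(f)))
}
```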
An annotation file contains a cortical parcellation for a subject, based on a brain atlas. It contains a label for each vertex of a surface, and that label assigns the vertex to one of a set of atlas regions. The file format also contains a colortable, which assigns a color code to each atlas region. An example file would be label/lh.aparc.annot for the aparc (Desikan) atlas.
Let’s read an example annotation file that comes with this package:
annotfile = system.file("extdata", "lh.aparc.annot.gz", package = "freesurferformats", mustWork = TRUE);
annot = read.fs.annot(annotfile);
Note: The example file that comes with this package was gzipped to save space. While this is not typical for annot files, the read.fs.annot function handles it automatically if the filename ends with .gz.
As mentioned earlier, such a file contains various pieces of information. Let us investigate the labels and the atlas region names for some vertices first:
num_vertices_total = length(annot$vertices);
for (vert_idx in c(1, 5000, 123456)) {
    cat(sprintf("Vertex #%d with zero-based index %d has label code '%d' which stands for atlas region '%s'\n", vert_idx, annot$vertices[vert_idx], annot$label_codes[vert_idx], annot$label_names[vert_idx]));
}
## Vertex #1 with zero-based index 0 has label code '9182740' which stands for atlas region 'lateraloccipital'
## Vertex #5000 with zero-based index 4999 has label code '3957880' which stands for atlas region 'pericalcarine'
## Vertex #123456 with zero-based index 123455 has label code '14474380' which stands for atlas region 'superiortemporal'
Now, we will focus on the colortable. We will list the available regions and their color codes.
ctable = annot$colortable$table;
regions = annot$colortable$struct_names;
for (region_idx in seq_len(annot$colortable$num_entries)) {
    cat(sprintf("Region #%d called '%s' has RGBA color (%d %d %d %d) and code '%d'.\n", region_idx, regions[region_idx], ctable[region_idx,1], ctable[region_idx,2], ctable[region_idx,3], ctable[region_idx,4], ctable[region_idx,5]));
}
## Region #1 called 'unknown' has RGBA color (25 5 25 0) and code '1639705'.
## Region #2 called 'bankssts' has RGBA color (25 100 40 0) and code '2647065'.
## Region #3 called 'caudalanteriorcingulate' has RGBA color (125 100 160 0) and code '10511485'.
## Region #4 called 'caudalmiddlefrontal' has RGBA color (100 25 0 0) and code '6500'.
## Region #5 called 'corpuscallosum' has RGBA color (120 70 50 0) and code '3294840'.
## Region #6 called 'cuneus' has RGBA color (220 20 100 0) and code '6558940'.
## Region #7 called 'entorhinal' has RGBA color (220 20 10 0) and code '660700'.
## Region #8 called 'fusiform' has RGBA color (180 220 140 0) and code '9231540'.
## Region #9 called 'inferiorparietal' has RGBA color (220 60 220 0) and code '14433500'.
## Region #10 called 'inferiortemporal' has RGBA color (180 40 120 0) and code '7874740'.
## Region #11 called 'isthmuscingulate' has RGBA color (140 20 140 0) and code '9180300'.
## Region #12 called 'lateraloccipital' has RGBA color (20 30 140 0) and code '9182740'.
## Region #13 called 'lateralorbitofrontal' has RGBA color (35 75 50 0) and code '3296035'.
## Region #14 called 'lingual' has RGBA color (225 140 140 0) and code '9211105'.
## Region #15 called 'medialorbitofrontal' has RGBA color (200 35 75 0) and code '4924360'.
## Region #16 called 'middletemporal' has RGBA color (160 100 50 0) and code '3302560'.
## Region #17 called 'parahippocampal' has RGBA color (20 220 60 0) and code '3988500'.
## Region #18 called 'paracentral' has RGBA color (60 220 60 0) and code '3988540'.
## Region #19 called 'parsopercularis' has RGBA color (220 180 140 0) and code '9221340'.
## Region #20 called 'parsorbitalis' has RGBA color (20 100 50 0) and code '3302420'.
## Region #21 called 'parstriangularis' has RGBA color (220 60 20 0) and code '1326300'.
## Region #22 called 'pericalcarine' has RGBA color (120 100 60 0) and code '3957880'.
## Region #23 called 'postcentral' has RGBA color (220 20 20 0) and code '1316060'.
## Region #24 called 'posteriorcingulate' has RGBA color (220 180 220 0) and code '14464220'.
## Region #25 called 'precentral' has RGBA color (60 20 220 0) and code '14423100'.
## Region #26 called 'precuneus' has RGBA color (160 140 180 0) and code '11832480'.
## Region #27 called 'rostralanteriorcingulate' has RGBA color (80 20 140 0) and code '9180240'.
## Region #28 called 'rostralmiddlefrontal' has RGBA color (75 50 125 0) and code '8204875'.
## Region #29 called 'superiorfrontal' has RGBA color (20 220 160 0) and code '10542100'.
## Region #30 called 'superiorparietal' has RGBA color (20 180 140 0) and code '9221140'.
## Region #31 called 'superiortemporal' has RGBA color (140 220 220 0) and code '14474380'.
## Region #32 called 'supramarginal' has RGBA color (80 160 20 0) and code '1351760'.
## Region #33 called 'frontalpole' has RGBA color (100 0 100 0) and code '6553700'.
## Region #34 called 'temporalpole' has RGBA color (70 20 170 0) and code '11146310'.
## Region #35 called 'transversetemporal' has RGBA color (150 150 200 0) and code '13145750'.
## Region #36 called 'insula' has RGBA color (255 192 32 0) and code '2146559'.
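The codes in the listing above are consistent with packing the red, green and blue values into a single integer as R + G * 2^8 + B * 2^16. The sketch below checks this relation on a few rows copied from the output; the helper names `rgb_to_code` and `code_to_rgb` are hypothetical. The alpha value is 0 in every row shown, so this example cannot tell how a non-zero alpha would enter the code.

```r
# Hypothetical helpers illustrating the relation between RGB values and the code.
rgb_to_code <- function(r, g, b) {
  r + g * 2^8 + b * 2^16
}

code_to_rgb <- function(code) {
  c(r = code %% 256, g = (code %/% 256) %% 256, b = (code %/% 65536) %% 256)
}

# Rows copied from the colortable output above.
cat(sprintf("unknown:  %d\n", rgb_to_code(25, 5, 25)))
cat(sprintf("bankssts: %d\n", rgb_to_code(25, 100, 40)))
print(code_to_rgb(9182740))   # label code of 'lateraloccipital'
```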
Keep indexing in mind when comparing results with those from other software: in R, indices start at 1, but FreeSurfer vertex indices are zero-based:
r_index = 50; # one-based index as used by R and Matlab
fs_index = annot$vertices[r_index]; # zero-based index as used in C, Java, Python and many modern languages
cat(sprintf("Vertex at R index %d has FreeSurfer index %d and lies in region '%s'.\n", r_index, fs_index, annot$label_names[r_index]));
## Vertex at R index 50 has FreeSurfer index 49 and lies in region 'lateraloccipital'.
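When vertex indices come from another tool that reports them zero-based, the conversion is a simple offset. The following sketch uses made-up data (`fs_indices_demo`, `values_demo`) to show the lookup in both directions.

```r
# Zero-based vertex indices as another tool might report them (made-up values).
fs_indices_demo <- c(0, 49, 4999)

# Convert to one-based indices before indexing into an R vector.
r_indices_demo <- fs_indices_demo + 1

# Synthetic per-vertex data: the value at each position equals its R index.
values_demo <- seq_len(5000)
selected <- values_demo[r_indices_demo]

# And back: R indices to FreeSurfer indices.
back_to_fs <- r_indices_demo - 1

print(selected)
```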
Let us retrieve some information on a specific region. We will reuse the thickness_native data loaded above:
region = "bankssts"
thickness_in_region = thickness_native[annot$label_names == region]
cat(sprintf("Region '%s' has %d vertices and a mean cortical thickness of %f mm.\n", region, length(thickness_in_region), mean(thickness_in_region)));
## Region 'bankssts' has 1722 vertices and a mean cortical thickness of 2.485596 mm.
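The same per-region summary can be computed for all regions at once by grouping the per-vertex values by region name. The sketch below uses short synthetic vectors (`label_names_demo` and `thickness_demo` are made up) in place of `annot$label_names` and `thickness_native`, and relies only on base R.

```r
# Synthetic stand-ins: one region name and one thickness value per vertex.
label_names_demo <- c("bankssts", "cuneus", "bankssts", "insula", "cuneus", "bankssts")
thickness_demo   <- c(2.4, 1.9, 2.6, 3.1, 2.1, 2.5)

# Mean thickness and vertex count per region.
region_means  <- tapply(thickness_demo, label_names_demo, mean)
region_counts <- table(label_names_demo)

for (region_name in names(region_means)) {
  cat(sprintf("Region '%s' has %d vertices and a mean cortical thickness of %f mm.\n",
              region_name, region_counts[[region_name]], region_means[[region_name]]))
}
```

Both vectors must have one entry per vertex and be in the same vertex order for the grouping to be meaningful.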
That’s all the information you can get from an annotation file.