Commands & Actions
Before we implement a CuBIDS workflow, let’s define the terminology and take a look at some of the commands available in the software.
More definitions
Key Group
A Key Group is a unique set of BIDS key-value pairs, excluding identifiers such as subject and session. For example, the files:
bids-root/sub-1/ses-1/func/sub-1_ses-1_acq-mb_dir-PA_task-rest_bold.nii.gz
bids-root/sub-1/ses-2/func/sub-1_ses-2_acq-mb_dir-PA_task-rest_bold.nii.gz
bids-root/sub-2/ses-1/func/sub-2_ses-1_acq-mb_dir-PA_task-rest_bold.nii.gz
would all share the same Key Group. If these scans were all acquired as part of the same study on the same scanner with exactly the same acquisition parameters, this naming convention would suffice.
However, in large multi-scanner, multi-site, or longitudinal studies where acquisition parameters change over time, the same Key Group may comprise scans that differ in important ways.
CuBIDS examines all acquisitions within a Key Group to see if there are any images that differ in a set of important acquisition parameters. Each subset of acquisitions with consistent parameters within a Key Group is called a Parameter Group.
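As a rough illustration of the Key Group idea, the sketch below derives a Key Group label from a BIDS path by dropping the subject and session identifiers. The function name `key_group`, the short-to-long entity mapping, and the alphabetical ordering are assumptions chosen to resemble the labels shown in the summary table later on this page; this is not the CuBIDS implementation.

```python
from pathlib import PurePosixPath

# Assumed mapping from short BIDS entity keys to longer names (illustrative only).
LONG_NAMES = {"acq": "acquisition", "dir": "direction"}
# Entities that identify *who/when* rather than *what* was acquired.
IDENTIFIERS = {"sub", "ses"}


def key_group(path: str) -> str:
    """Build a Key Group label from a BIDS file path (hypothetical helper)."""
    p = PurePosixPath(path)
    stem = p.name.split(".", 1)[0]          # drop ".nii.gz" / ".json"
    *pairs, suffix = stem.split("_")        # last chunk is the BIDS suffix
    entities = {"datatype": p.parent.name, "suffix": suffix}
    for pair in pairs:
        key, value = pair.split("-", 1)
        if key in IDENTIFIERS:
            continue                        # subject/session are excluded
        entities[LONG_NAMES.get(key, key)] = value
    return "_".join(f"{k}-{entities[k]}" for k in sorted(entities))


files = [
    "bids-root/sub-1/ses-1/func/sub-1_ses-1_acq-mb_dir-PA_task-rest_bold.nii.gz",
    "bids-root/sub-1/ses-2/func/sub-1_ses-2_acq-mb_dir-PA_task-rest_bold.nii.gz",
    "bids-root/sub-2/ses-1/func/sub-2_ses-1_acq-mb_dir-PA_task-rest_bold.nii.gz",
]
labels = {key_group(f) for f in files}
print(labels)  # a single label: all three files share one Key Group
```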
Parameter Group
Even though two images may belong to the same Key Group and be valid BIDS, they may have different acquisition parameters. There is nothing fundamentally wrong with this: the bids-validator will often simply flag these differences with a Warning, but not necessarily suggest changes. That being said, there can be detrimental consequences downstream if the different parameters cause the same preprocessing pipeline to configure itself differently for images of the same Key Group.
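To make the distinction concrete, here is a minimal sketch of how scans within one Key Group could be split into Parameter Groups by comparing a chosen set of metadata fields. The function name, the choice of fields, and the convention that group 1 is the most common combination are assumptions for illustration, not a description of CuBIDS internals.

```python
from collections import Counter

# Metadata fields compared in this sketch (an assumed, shortened list).
PARAMS = ("EchoTime", "RepetitionTime", "PhaseEncodingDirection")


def parameter_groups(scans, params=PARAMS):
    """Return (labels, counts): a group number per scan and scans per signature.

    Group numbers start at 1 and are assigned by descending frequency,
    so group 1 is the most common parameter combination.
    """
    signatures = [tuple(scan.get(p) for p in params) for scan in scans]
    counts = Counter(signatures)
    # most_common() orders by count (descending); ties keep first-seen order.
    numbering = {sig: n for n, (sig, _) in enumerate(counts.most_common(), start=1)}
    return [numbering[sig] for sig in signatures], counts


scans = [
    {"EchoTime": 0.082, "RepetitionTime": 8.1, "PhaseEncodingDirection": "j-"},
    {"EchoTime": 0.082, "RepetitionTime": 8.1, "PhaseEncodingDirection": "j-"},
    {"EchoTime": 0.082, "RepetitionTime": 9.0, "PhaseEncodingDirection": "j-"},
    {"EchoTime": 0.082, "RepetitionTime": 8.1, "PhaseEncodingDirection": "j-"},
]
labels, counts = parameter_groups(scans)
print(labels)
```

All four scans are valid members of the same Key Group, yet the third one lands in its own Parameter Group because its repetition time differs.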
Acquisition Group
Acquisition Groups are sets of subjects whose images belong to all the same Key and Parameter Groups. The Acquisition Groups that subjects belong to are listed in _AcqGrouping.tsv, while the Key Groups and Parameter Groups that define each Acquisition Group are noted in _AcqGroupInfo.txt.
We define an “Acquisition Group” as a collection of sessions across participants that contain the exact same set of Key and Parameter Groups. Since Key Groups are based on the BIDS filenames (and are therefore specific to both MRI image type and acquisition), each BIDS session directory contains images that belong to a set of Parameter Groups. CuBIDS assigns each session (that is, each set of Parameter Groups) to an Acquisition Group such that all sessions in an Acquisition Group possess an identical set of scan acquisitions and metadata parameters across all image modalities present in the dataset. We find Acquisition Groups to be a particularly useful categorization of BIDS data, as they identify homogeneous sets of sessions (not individual scans) in a large dataset. They are also useful for expediting the testing of pipelines: if a BIDS App runs successfully on a single subject from each Acquisition Group, one can be confident that it will handle all combinations of scanning parameters in the entire dataset.
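The definition above amounts to grouping sessions by the exact set of Key/Parameter Group labels they contain. The following sketch shows that idea under stated assumptions: the input dictionary, the session identifiers, and the numbering by descending group size are all hypothetical and only meant to illustrate the concept.

```python
from collections import defaultdict


def acquisition_groups(session_contents):
    """Group sessions that contain an identical set of Key/Parameter Groups.

    `session_contents` maps a session id to the labels found in that session.
    Returns {group_number: [session ids]}, numbered from 1 by descending size.
    """
    by_signature = defaultdict(list)
    for session, key_param_groups in session_contents.items():
        # frozenset makes the label set hashable and order-independent
        by_signature[frozenset(key_param_groups)].append(session)
    ordered = sorted(by_signature.values(), key=len, reverse=True)
    return {n: sorted(members) for n, members in enumerate(ordered, start=1)}


sessions = {
    "sub-1/ses-1": ["datatype-dwi_run-1_suffix-dwi__1", "datatype-anat_suffix-T1w__1"],
    "sub-2/ses-1": ["datatype-anat_suffix-T1w__1", "datatype-dwi_run-1_suffix-dwi__1"],
    "sub-3/ses-1": ["datatype-dwi_run-1_suffix-dwi__3", "datatype-anat_suffix-T1w__1"],
}
groups = acquisition_groups(sessions)
print(groups)
```

Running a pipeline on one session from each resulting group would exercise every combination of parameters present in this toy dataset.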
The _summary.tsv file
This file contains all the detected Key Groups and Parameter Groups. It provides an opportunity to evaluate your data and decide how to handle heterogeneity.
Below is an example _summary.tsv of the run-1 DWI Key Group in the PNC [1]. This reflects the original data that has been converted to BIDS using a heuristic. It is similar to what you will see when you first use this functionality:
| Notes | ManualCheck | MergeInto | RenameKeyGroup | KeyParamGroup | KeyGroup | ParamGroup | Counts | Dim1Size | Dim2Size | Dim3Size | EchoTime | EffectiveEchoSpacing | FlipAngle | HasFieldmap | KeyGroupCount | Modality | NSliceTimes | NumVolumes | Obliquity | ParallelReductionFactorInPlane | PartialFourier | PhaseEncodingDirection | RepetitionTime | TotalReadoutTime | UsedAsFieldmap | VoxelSizeDim1 | VoxelSizeDim2 | VoxelSizeDim3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | acquisition-VARIANTNoFmap_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__2 | datatype-dwi_run-1_suffix-dwi | 2 | 25 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | FALSE | 1426 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | acquisition-VARIANTRepetitionTime_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__3 | datatype-dwi_run-1_suffix-dwi | 3 | 6 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 1426 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 9.0 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | acquisition-VARIANTRepetitionTime_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__4 | datatype-dwi_run-1_suffix-dwi | 4 | 3 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 1426 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 9.8 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | acquisition-VARIANTDim3SizeVoxelSizeDim3_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__5 | datatype-dwi_run-1_suffix-dwi | 5 | 2 | 128 | 128 | 46 | 0.082 | 0.000267 | 90 | TRUE | 1426 | dwi | 46 | 35.0 | FALSE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 3.0 |
| | | | acquisition-VARIANTEchoTimeEffectiveEchoSpacingRepetitionTimeTotalReadoutTime_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__6 | datatype-dwi_run-1_suffix-dwi | 6 | 1 | 128 | 128 | 70 | 0.102 | 0.0008 | 90 | TRUE | 1426 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 12.3 | 0.102 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | acquisition-VARIANTObliquity_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__7 | datatype-dwi_run-1_suffix-dwi | 7 | 1 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 1426 | dwi | 70 | 35.0 | TRUE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
The _files.tsv file
This file contains one row per imaging file in the BIDS directory. You won’t need to edit this file directly, but it keeps track of every file’s assignment to Key and Parameter Groups.
Modifying Key and Parameter Group Assignments
Sometimes we see that there are important differences in acquisition parameters within a Key Group. If these differences impact how a pipeline will process the data, it makes sense to assign the scans in that Parameter Group to a different Key Group (i.e., assign them a different BIDS name). This can be accomplished by editing the empty columns in the _summary.csv file produced by cubids-group.
Once the columns have been edited, you can apply the changes to BIDS data using:
$ cubids-apply /bids/dir keyparam_edited new_keyparam_prefix
The changes in keyparam_edited_summary.csv will be applied to the BIDS data in /bids/dir and the new Key and Parameter Groups will be saved to csv files starting with new_keyparam_prefix. Note: fieldmap Key Groups with variant parameters will be identified but not renamed.
The _AcqGrouping.tsv file
The _AcqGrouping.tsv file organizes the dataset by session and tags each one with its Acquisition Group number.
The _AcqGroupInfo.txt file
The _AcqGroupInfo.txt file lists all Key Groups that belong to a given Acquisition Group along with the number of sessions each group possesses.
Visualizing and summarizing metadata heterogeneity
Use cubids-group to generate your dataset’s Key Groups and Parameter Groups:
$ cubids-group FULL/PATH/TO/BIDS/DIR FULL/PATH/TO/v0
This will output four files, including the summary and files tsvs described above, prefixed by the second argument v0.
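As a small convenience, the sketch below lists the four file names one would expect for a given output prefix, using the file suffixes named in the sections above. The helper `expected_outputs` and the example prefix path are hypothetical, and whether a particular CuBIDS version writes exactly these names is an assumption worth checking against your own output directory.

```python
from pathlib import Path

# Suffixes taken from the file descriptions on this page.
SUFFIXES = ("_summary.tsv", "_files.tsv", "_AcqGrouping.tsv", "_AcqGroupInfo.txt")


def expected_outputs(prefix):
    """Return the paths expected for an output prefix (hypothetical helper)."""
    prefix = Path(prefix)
    return [prefix.with_name(prefix.name + suffix) for suffix in SUFFIXES]


outputs = expected_outputs("FULL/PATH/TO/v0")
for path in outputs:
    print(path.name)
```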
Applying changes
The cubids-apply program provides an easy way for users to manipulate their datasets. Specifically, cubids-apply can rename files according to the users’ specification in a tracked and organized way. Here, the summary.tsv functions as an interface for modifications; users can mark Parameter Groups they want to rename (or delete) in a dedicated column of the summary.tsv and pass that edited tsv as an argument to cubids-apply.
Detecting Variant Groups
Additionally, cubids-apply can automatically rename files in Variant Groups based on the scanning parameters that vary from those in their Key Groups’ Dominant Parameter Groups. Renaming is automatically suggested when the summary.tsv is generated from a cubids-group run, with the suggested new name listed in the tsv’s RenameKeyGroup column. CuBIDS populates this column for all Variant Groups (i.e., every Parameter Group except the Dominant one). Specifically, CuBIDS will suggest renaming all non-dominant Parameter Groups to include VARIANT* in their acquisition field, where * is the reason the Parameter Group varies from the Dominant Group. For example, when CuBIDS encounters a Parameter Group with a repetition time that varies from the one present in the Dominant Group, it will automatically suggest renaming all scans in that Variant Group to include acquisition-VARIANTRepetitionTime in their filenames. When the user runs cubids-apply, files will be renamed according to the auto-generated names in the RenameKeyGroup column of the summary.tsv.
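The naming rule described here can be sketched as a comparison between a Variant Group's parameters and the Dominant Group's. The function below is a hedged illustration only: the alphabetical concatenation of differing parameter names and the special `NoFmap` label are inferred from the example table on this page, and the function name and parameter subset are hypothetical rather than taken from the CuBIDS source.

```python
def variant_acquisition(dominant, variant):
    """Build a VARIANT* acquisition label from the parameters that differ.

    Returns an empty string when the two groups agree on every parameter.
    """
    reasons = []
    for name in sorted(dominant):
        if variant.get(name) == dominant[name]:
            continue
        # Assumed special case: a missing fieldmap is reported as "NoFmap".
        if name == "HasFieldmap" and variant.get(name) is False:
            reasons.append("NoFmap")
        else:
            reasons.append(name)
    if not reasons:
        return ""
    return "VARIANT" + "".join(reasons)


dominant = {
    "Dim3Size": 70,
    "EchoTime": 0.082,
    "HasFieldmap": True,
    "RepetitionTime": 8.1,
    "VoxelSizeDim3": 2.0,
}
longer_tr = dict(dominant, RepetitionTime=9.0)
fewer_slices = dict(dominant, Dim3Size=46, VoxelSizeDim3=3.0)
no_fieldmap = dict(dominant, HasFieldmap=False)

for group in (longer_tr, fewer_slices, no_fieldmap):
    print("acquisition-" + variant_acquisition(dominant, group))
```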
Deleting a mistake
To remove files in a Parameter Group from your BIDS data, simply set the MergeInto value to 0. We see in our data that there is a strange scan that has a RepetitionTime of 12.3 seconds and is also variant with respect to EffectiveEchoSpacing and EchoTime. We elect to remove this scan from our dataset because we do not want these parameters to affect our analyses. To remove these files from your BIDS data, add a 0 to MergeInto and save the new tsv as v0_edited_summary.tsv:
| Notes | ManualCheck | MergeInto | RenameKeyGroup | KeyParamGroup | KeyGroup | ParamGroup | Counts | Dim1Size | Dim2Size | Dim3Size | EchoTime | EffectiveEchoSpacing | FlipAngle | HasFieldmap | KeyGroupCount | Modality | NSliceTimes | NumVolumes | Obliquity | ParallelReductionFactorInPlane | PartialFourier | PhaseEncodingDirection | RepetitionTime | TotalReadoutTime | UsedAsFieldmap | VoxelSizeDim1 | VoxelSizeDim2 | VoxelSizeDim3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | acquisition-VARIANTNoFmap_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__2 | datatype-dwi_run-1_suffix-dwi | 2 | 25 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | FALSE | 1426 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | acquisition-VARIANTRepetitionTime_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__3 | datatype-dwi_run-1_suffix-dwi | 3 | 6 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 1426 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 9.0 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | acquisition-VARIANTRepetitionTime_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__4 | datatype-dwi_run-1_suffix-dwi | 4 | 3 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 1426 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 9.8 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | acquisition-VARIANTDim3SizeVoxelSizeDim3_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__5 | datatype-dwi_run-1_suffix-dwi | 5 | 2 | 128 | 128 | 46 | 0.082 | 0.000267 | 90 | TRUE | 1426 | dwi | 46 | 35.0 | FALSE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 3.0 |
| | | 0 | acquisition-VARIANTEchoTimeEffectiveEchoSpacingRepetitionTimeTotalReadoutTime_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__6 | datatype-dwi_run-1_suffix-dwi | 6 | 1 | 128 | 128 | 70 | 0.102 | 0.0008 | 90 | TRUE | 1426 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 12.3 | 0.102 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | acquisition-VARIANTObliquity_datatype-dwi_run-1_suffix-dwi | datatype-dwi_run-1_suffix-dwi__7 | datatype-dwi_run-1_suffix-dwi | 7 | 1 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 1426 | dwi | 70 | 35.0 | TRUE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
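The edit shown in the table can also be made programmatically rather than in a spreadsheet. The sketch below uses only the Python standard library and operates on tab-separated text with a shortened set of columns; the helper name `mark_for_deletion` is hypothetical, and in practice one would read the real summary file from disk and write the result to v0_edited_summary.tsv.

```python
import csv
import io


def mark_for_deletion(tsv_text, key_param_group):
    """Set MergeInto to 0 for one KeyParamGroup and return the edited TSV text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    for row in rows:
        if row["KeyParamGroup"] == key_param_group:
            row["MergeInto"] = "0"   # 0 marks the Parameter Group for removal
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Shortened example with three of the summary columns.
summary = (
    "MergeInto\tKeyParamGroup\tRepetitionTime\n"
    "\tdatatype-dwi_run-1_suffix-dwi__5\t8.1\n"
    "\tdatatype-dwi_run-1_suffix-dwi__6\t12.3\n"
)
edited = mark_for_deletion(summary, "datatype-dwi_run-1_suffix-dwi__6")
print(edited)
```

Matching on KeyParamGroup rather than on a parameter value keeps the edit tied to exactly one row of the summary.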
In this example, users can apply the changes to BIDS data using the following command:
$ cubids-apply FULL/PATH/TO/BIDS/DIR FULL/PATH/TO/v0_edited_summary.tsv FULL/PATH/TO/v0_files.tsv FULL/PATH/TO/v1
The changes in v0_edited_summary.tsv will be applied to the BIDS data and the new Key and Parameter Groups will be saved to tsv files starting with v1.
Applying these changes, we would see:
| Notes | ManualCheck | MergeInto | RenameKeyGroup | KeyParamGroup | KeyGroup | ParamGroup | Counts | Dim1Size | Dim2Size | Dim3Size | EchoTime | EffectiveEchoSpacing | FlipAngle | HasFieldmap | KeyGroupCount | Modality | NSliceTimes | NumVolumes | Obliquity | ParallelReductionFactorInPlane | PartialFourier | PhaseEncodingDirection | RepetitionTime | TotalReadoutTime | UsedAsFieldmap | VoxelSizeDim1 | VoxelSizeDim2 | VoxelSizeDim3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | datatype-dwi_run-1_suffix-dwi__1 | datatype-dwi_run-1_suffix-dwi | 1 | 1388 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 1388 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | | acquisition-VARIANTNoFmap_datatype-dwi_run-1_suffix-dwi__1 | acquisition-VARIANTNoFmap_datatype-dwi_run-1_suffix-dwi | 1 | 25 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | FALSE | 25 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | | acquisition-VARIANTRepetitionTime_datatype-dwi_run-1_suffix-dwi__1 | acquisition-VARIANTRepetitionTime_datatype-dwi_run-1_suffix-dwi | 1 | 6 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 9 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 9.0 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | | acquisition-VARIANTRepetitionTime_datatype-dwi_run-1_suffix-dwi__2 | acquisition-VARIANTRepetitionTime_datatype-dwi_run-1_suffix-dwi | 2 | 3 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 9 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 9.8 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | | acquisition-VARIANTDim3SizeVoxelSizeDim3_datatype-dwi_run-1_suffix-dwi__1 | acquisition-VARIANTDim3SizeVoxelSizeDim3_datatype-dwi_run-1_suffix-dwi | 1 | 2 | 128 | 128 | 46 | 0.082 | 0.000267 | 90 | TRUE | 2 | dwi | 46 | 35.0 | FALSE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 3.0 |
| | | | | acquisition-VARIANTEchoTimeEffectiveEchoSpacingRepetitionTimeTotalReadoutTime_datatype-dwi_run-1_suffix-dwi__1 | acquisition-VARIANTEchoTimeEffectiveEchoSpacingRepetitionTimeTotalReadoutTime_datatype-dwi_run-1_suffix-dwi | 1 | 1 | 128 | 128 | 70 | 0.102 | 0.0008 | 90 | TRUE | 1 | dwi | 70 | 35.0 | FALSE | 3.0 | 0.75 | j- | 12.3 | 0.102 | FALSE | 1.875 | 1.875 | 2.0 |
| | | | | acquisition-VARIANTObliquity_datatype-dwi_run-1_suffix-dwi__1 | acquisition-VARIANTObliquity_datatype-dwi_run-1_suffix-dwi | 1 | 1 | 128 | 128 | 70 | 0.082 | 0.000267 | 90 | TRUE | 1 | dwi | 70 | 35.0 | TRUE | 3.0 | 0.75 | j- | 8.1 | 0.034 | FALSE | 1.875 | 1.875 | 2.0 |
Command line tools
With that brief introduction done, we can introduce the full gamut of CuBIDS command line tools:
Customizable configuration
CuBIDS also features an optional, customizable, MRI image type-specific configuration file. This file can be passed as an argument to cubids-group and cubids-apply using the --config flag and allows users to customize grouping settings based on MRI image type and parameter. Each Key Group is associated with one (and only one) MRI image type, as BIDS filenames include MRI image type-specific values as their suffixes. This easy-to-modify configuration file provides several benefits to curation. First, it allows users to add and remove metadata parameters from the set that determines groupings. This can be very useful if a user deems a specific metadata parameter irrelevant and wishes to collapse variation based on that parameter into a single Parameter Group. Second, the configuration file allows users to apply tolerances for parameters with numerical values. This functionality prevents very small differences in scanning parameters (e.g., a TR of 3.0s vs 3.0001s) from being split into different Parameter Groups. Third, the configuration file allows users to determine which scanning parameters are listed in the acquisition field when auto-renaming is applied to Variant Groups.
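The effect of a numerical tolerance can be illustrated without reference to the configuration file's actual syntax, which is not shown here. The sketch below assigns values to the same cluster when they fall within a tolerance of that cluster's first value; the function name and the clustering rule are assumptions for illustration and may differ from how CuBIDS applies tolerances.

```python
def group_with_tolerance(values, tolerance):
    """Return a 0-based cluster id per value.

    A value joins the first existing cluster whose anchor (its first member)
    is within `tolerance`; otherwise it starts a new cluster.
    """
    anchors = []
    labels = []
    for value in values:
        for index, anchor in enumerate(anchors):
            if abs(value - anchor) <= tolerance:
                labels.append(index)
                break
        else:
            # no existing cluster is close enough
            anchors.append(value)
            labels.append(len(anchors) - 1)
    return labels


repetition_times = [3.0, 3.0001, 2.5, 3.0]
print(group_with_tolerance(repetition_times, tolerance=0.001))
print(group_with_tolerance(repetition_times, tolerance=0.0))
```

With a tolerance of 0.001 the 3.0 s and 3.0001 s scans fall into one group, while a tolerance of zero splits them apart.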
Exemplar testing
In addition to facilitating curation of large, heterogeneous BIDS datasets, CuBIDS also prepares datasets for testing BIDS Apps. This portion of the CuBIDS workflow relies on the concept of the Acquisition Group: a set of sessions that have identical scan types and metadata across all imaging modalities present in the session set. Specifically, cubids-copy-exemplars copies one subject from each Acquisition Group into a separate directory, which we call an Exemplar Dataset. Since the Exemplar Dataset contains one randomly selected subject from each unique Acquisition Group in the dataset, it will be a valid BIDS dataset that spans the entire metadata parameter space of the full study. If users run cubids-copy-exemplars with the --use-datalad flag, the program will ensure that the Exemplar Dataset is tracked and saved in DataLad. If the user chooses to forgo this flag, the Exemplar Dataset will be a standard directory located on the filesystem. Once the Exemplar Dataset has been created, a user can test it with a BIDS App (e.g., fMRIPrep or QSIPrep) to ensure that each unique set of scanning parameters will pass through the pipelines successfully. Because BIDS Apps auto-configure workflows based on the metadata encountered, they will process all scans in each Acquisition Group in the same way. By first verifying that BIDS Apps perform as intended on the small sub-sample of participants present in the Exemplar Dataset (which spans the full variation of the metadata), users can confidently move forward with processing the complete BIDS dataset.
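The selection step behind an Exemplar Dataset can be sketched as drawing one subject at random from each Acquisition Group. The mapping of subjects to group numbers, the function name, and the `seed` argument below are hypothetical; the sketch only picks names and does not copy any files, which is what cubids-copy-exemplars would handle.

```python
import random


def pick_exemplars(acq_group_of_subject, seed=None):
    """Choose one subject per Acquisition Group (illustrative helper).

    `acq_group_of_subject` maps a subject id to its Acquisition Group number.
    Passing a seed makes the random choice repeatable.
    """
    rng = random.Random(seed)
    members = {}
    for subject, group in acq_group_of_subject.items():
        members.setdefault(group, []).append(subject)
    # sort so the result does not depend on input ordering
    return {group: rng.choice(sorted(subjects))
            for group, subjects in sorted(members.items())}


grouping = {"sub-1": 1, "sub-2": 1, "sub-3": 2, "sub-4": 1, "sub-5": 3}
exemplars = pick_exemplars(grouping, seed=0)
print(exemplars)
```

The result always contains exactly one subject per group, so its size equals the number of Acquisition Groups regardless of how many subjects the full dataset holds.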
In the next section, we’ll introduce DataLad and walk through a real example.
Footnotes