Step 2: signal-level QC
Having completed Step 1, the harm1.lst
sample list should now describe a dataset that has
consistent channel and annotation labels, consistent annotation
formats (standard .annot
) and nominally consistent units, sample rates and
referencing. We can now move on to look at the contents of the EDFs:
-
to describe and handle recordings with gaps
-
to identify likely flat or duplicate signals
-
to generate some basic signal summary statistics
-
to spot and fix issues with EEG polarity
Before embarking on these steps, let's check that the harmonization
steps have run as expected, by re-running the HEADERS
and ANNOTS
commands (note: we put these in single-quotes after -s
to stop the
shell from interpreting &
as a flag to run a job in the background):
luna harm1.lst -o out.db -s ' HEADERS & ANNOTS '
To view the output from destrat
, we'll combine awk
(to skip the
first row, and print only the second column) with the usual sort |
uniq -c
step to count the combinations of channels, units and sample
rates:
destrat out.db +HEADERS -r CH -v SR PDIM | awk ' NR != 1 { print $2 } ' | sort | uniq -c
20 AF3 uV 128
20 AF4 uV 128
20 AFZ uV 128
20 C1 uV 128
20 C2 uV 128
20 C3 uV 128
20 C4 uV 128
20 C5 uV 128
20 C6 uV 128
20 CP1 uV 128
20 CP2 uV 128
20 CP3 uV 128
20 CP4 uV 128
20 CP5 uV 128
20 CP6 uV 128
20 CPZ uV 128
19 CZ uV 128
20 F1 uV 128
20 F2 uV 128
20 F3 uV 128
20 F4 uV 128
20 F5 uV 128
20 F6 uV 128
20 F7 uV 128
20 F8 uV 128
20 FC1 uV 128
20 FC2 uV 128
20 FC3 uV 128
20 FC4 uV 128
20 FC5 uV 128
20 FC6 uV 128
20 FCZ uV 128
20 FPZ uV 128
20 FT7 uV 128
20 FT8 uV 128
20 FZ uV 128
20 Fp1 uV 128
20 Fp2 uV 128
20 O1 uV 128
20 O2 uV 128
20 OZ uV 128
20 P1 uV 128
20 P2 uV 128
20 P3 uV 128
20 P4 uV 128
20 P5 uV 128
20 P6 uV 128
20 P7 uV 128
20 P8 uV 128
20 PO3 uV 128
20 PO4 uV 128
20 POz uV 128
20 PZ uV 128
20 T7 uV 128
20 T8 uV 128
20 TP7 uV 128
20 TP8 uV 128
We see that with one exception (CZ), all channels are seen in all individuals, with the same set of labels applied; CZ appears only in 19 of 20.
For annotations, we can confirm that all labels have been mapped:
destrat out.db +ANNOTS -r ANNOT | awk ' NR != 1 { print $2 } ' | sort | uniq -c
10 Arousal
19 N1
20 N2
19 N3
19 R
19 W
That is, we see six distinct annotation labels; only half the sample
has Arousal
annotations marked, but that reflects the original v2
data, rather than an issue with the steps we've run.
We can now move to the next step, looking at the gapped EDF+ files.