Luna walkthrough (v1.3 Mar-2025)

Goals

The goal of this walkthrough is to demonstrate a more real-world, soup-to-nuts usage of Luna than the existing documentation provides: namely to validate, manipulate, clean and then analyze a specific dataset. The main website is the canonical point of reference for Luna, including an introductory tutorial and a series of vignettes on specific topics. This demonstration is intended to more closely mirror the experience of a user wishing to apply Luna to his or her own data. In brief:

This demonstration is based on high-density EEG (hd-EEG) recordings on 20 control individuals selected from the larger GRINS study; these data are available via the National Sleep Research Resource and should be acquired prior to embarking on this walkthrough.
We've explicitly introduced a number of manipulations in these data that make them harder to work with, e.g. corrupting signals, truncating or misaligning staging, altering labels and other formats/standards.
We then show how Luna can detect and potentially correct some of these issues, before stepping through ten areas of analysis for sleep EEG recordings:
- macro-architecture (hypnogram-based) analysis
- time-frequency analysis including various approaches to spectral analysis
- ultradian dynamics
- functional connectivity metrics
- dimension reduction via principal components
- NREM transients, including NREM spindle and slow oscillation detection
- model-based biological age prediction
- efficient linear association models, with nonparametric methods for multiple testing correction
- peri-event statistics
- interval-based analysis, focusing on spindle propagation via empirical surrogate-distribution approaches

Some key points to make right off the bat:

This is not intended to be a primer on study design or the statistical analysis of sleep data. The primary goal of this walkthrough is to introduce a tool in a semi-realistic context: in no sense is it intended to be a model of optimal data analysis. We use a toy dataset and apply a broad range of operations to be able to cover a range of Luna functionality in a practical amount of time, not because we think this represents a sane analysis plan for a real study.
Although the data used here are high-density sleep EEG, many of the points are generally applicable to standard, limited-montage PSG studies. Only a few sections (e.g. interpolation) truly require hd-EEG. In the future, we aim to extend this walkthrough to include other signal modalities. But even if your interests do not lie with the sleep EEG per se, this walkthrough may still be of value, e.g. with regard to basic manipulations of EDF files, handling sleep staging and annotations, etc.
This is not an R tutorial as there is plenty of material out there for using R. The snippets of R code presented do not necessarily reflect best practice for R programming either: they get the job done but feel free to improve on what is presented here.
There is a relatively large amount of material here, spanning a range of topics. Although the key steps of the walkthrough are designed to be performed sequentially, the material is still somewhat modular: some of the components can be skipped or studied in isolation. Although not advised, if you wish to skip steps 1 through 4 (data preparation and QC) and head straight to analysis (step 5), there are notes on how to catch up (i.e. cheat...). In any case, if you want to follow things systematically, you can expect this walkthrough to take more than one or two sittings.
A faster pace: For reference, we also include a streamlined, partial run-through of this walkthrough: as the name suggests, this has a faster pace that skips many of the practical details and follows only the key steps to go from the original data to the primary association analyses (section 5.8).

Two platforms

The core walkthrough will be available via two platforms (note: currently only the command-line version is released):

command-line (shell script) usage of Luna, combined with R to summarize and visualize key results, as shown on this website
a JupyterLab notebook (currently under development) that uses the Python-based lunapi interface to Luna to achieve the same results (n.b. the notebooks are designed to be viewed in conjunction with these pages, i.e. to give context, etc., as not all material is duplicated)

We advise following the command-line version first. In fact, it is probably a good idea to do a quick first-pass skim of the walkthrough, to get a sense of the overall content and structure, before diving into running commands. The Python interface is better for interactive work and visualization. The command-line interface is better for project-based, reproducible science, although we acknowledge this may be a personal opinion.

Steps of the walkthrough

The walkthrough is designed to be traversed sequentially:

a description of the data
step 0: preparatory steps required to set up the walkthrough
step 1: file-level quality control (QC) to validate and consistently reformat input files
step 2: signal-level QC
step 3: staging/annotation QC
step 4: cleaning and interpolation to make an analysis-ready EEG dataset
step 5: analyses to quantify individual differences in the sleep EEG as described above

Over time, we'll be adding a series of more advanced/in-depth sections under the step 6: next steps page.

Now let's get started, by looking at the data.