NREM Periods
An electroencephalography (EEG) is typically staged in 30 second epochs. A NREM period is defined as any sequence of epochs satisfying the following conditions:
- the sequence contains only NREM stages (stages $2$, $3$ or $4$);
- the total amount of NREM sleep in the sequence is $15$ minutes or more;
- the sequence is ended by either $a.$ 5 minutes or more of REM or $b.$ $5$ minutes or more of wakefulness.
Importantly, when parsing for a NREM period, the sequence may contain stages that aren't NREM stages. But these should be ignored unless they last for five minutes or more---i.e. unless they qualify as potential ending sequences for a NREM period. (If curious about where this definition comes from, see Feinberg & Floyed or study Dijk's papers.)
For example, assume we find 5 minutes of stage 2, followed by 3 minutes of REM, followed by 10 minutes of stage 3, followed by 10 minutes of REM or wake. Then the epochs in the 5 minutes of stage 2 and the 10 minutes of stage 3 must be marked as comprising a NREM period, ignoring the three minutes of REM in between.
Algorithm
I here provide an algorithm to automatically detect NREM periods. The algorithm must take a sequence of stages $s_1, \ldots, s_n$, where $s_i$ the sleep stage of the $i$th epoch, and output a sequence of $k$ vectors $\varrho_1, \ldots, \varrho_k$. Each $\varrho_i = (\varrho_i^1, \ldots, \varrho_i^z)$ must be s.t. $\varrho_{i}^{j}$ is the $j$th epoch in the $i$th NREM period.
From the specification above, it follows that the algorithm finds the number of NREM periods (i.e. $k$) and the epochs which comprise them. Thus, it provides a complete characterization of NREM periods during a night of sleep.
Formalization
Let $n$ be the number of epochs corresponding to $15$ minutes and $m$ the number of epochs corresponding to $5$ minutes. (In 30 second epochs, $n = 30, m = 10$).
Let $\Sigma =\{1, \ldots, 6, ?\}$, with $5$ marking REM, $6$ wakefulness, $?$ unknown/unstaged. The algorithm works by mapping $\vec{s}$ to $\alpha = s_1 \ldots s_q$ a word over $\Sigma^*$.
Now, $[(5+6)^(2+3+4)^]^* \subseteq \Sigma^{*}$ is partitioned into $U$ and $U’$, where $U$ is the set of words containing at least $n$ symbols $2, 3, 4$ where neither $5$ nor $6$ occur consecutively $m$ times. Then $\alpha$ can be decomposed into
$$\alpha = \psi_1 \phi_1 \psi_2 \phi_2 \ldots \psi_k \phi_k \psi_{k+1}$$
where $\phi_i = \varphi_i (5^m5^* + 6^m6^*)$ and $\varphi_i \in U$. Such a decomposition readily provides the number of NREM periods in the EEG (i.e. $k$). Furthermore, the epochs which comprise these periods are easily inferable from the decomposition.
Implementation
In order to obtain the decomposition
$$\alpha = \psi_1 \phi_1 \psi_2 \phi_2 \ldots \psi_k \phi_k \psi_{k+1}$$
some intermediate steps are necessary. In particular, $s_1 \ldots s_q \in \Sigma^*$ is mapped into a simpler alphabet $\Sigma' = \{X, I, 5, 6\}$, where $I$ are symbols that can be safely ingnored during parsing. The parts of the word that must not be ignored are deduced from the definition of NREM period. Then, the regular expression (in Julia)
ζ = Regex("(?:[X](I*)){$n,}(?:[5](I*){$m,}|6{$m,}|\$)")
provides the $\alpha$ decomposition given above by matching every $\phi_j$.
"""
This functions separates repeating characters in a string into different words.
Example: '1122144434444556' ↦ '11 22 1 444 3 44444 55 6'
This is necessary because repeating REM or wake of certain length mark the
end of NREM periods. Thus, from this separation we can infer which parts of
the word are functionally releant and which can be ignored.
"""
function insert_spaces_on_change(str)
# Initialize an empty string to build the modified version
modified_str = ""
# Iterate through the characters of the input string
for (index, char) in enumerate(str)
# Append the current character to the modified string
modified_str *= char
# Check if we're at the last character or the next character is different
if index < length(str) && str[index] != str[index+1]
# Append a space to separate different character sequences
modified_str *= " "
end
end
return modified_str
end
"""
Given a stage vector, separates it using `insert_spaces_on_change` and maps
it to a simpler word into the alphabet Σ = {I, X, 5, 6}.
"""
function stage_to_word(stages)
# Given a stages s₁, ..., sₙ, produces a sentence s.t. each word
# in a sentence corresponds to segments of the staging with repeating
# stages.
stages = replace(stages, "1" => "I", "?" => "I", "2" => "X", "3" => "X", "4" => "X")
α = insert_spaces_on_change(join(stages, ""))
# A function to transform the words of α, effectively mapping α
# onto a language more suited for regexes.
function transform(w)
if occursin('I', w) || occursin("X", w)
return w
end
# If previous condition was not met, we have either `5` or `6`.
if length(w) >= 10
return (w)
end
repeat("I", length(w))
end
trans = transform.(split(α))
return join(trans, "")
end
"""
NREM period detection.
"""
function nrem(staging::Vector, n::Integer=30, m::Integer=10)
if Set(unique(staging)) ⊈ Set(["1", "2", "3", "4", "5", "6", "?"])
@error "Invalid stage score: Sleep stages should be 1, 2, 3, 4, 5, 6, ?,
where 5 marks REM, 6 wakefulness, and ? unknown/unscored."
end
α = stage_to_word(staging)
ζ = Regex("(?:[X](I*)){$n,}(?:[5](I*){$m,}|6{$m,}|\$)")
locs = []
matches = eachmatch(ζ, α)
for match in matches
begins = s + match.offset - 1
loc = [begins, begins + length(match.match)]
insert!(locs, length(locs), loc)
end
# Collect all matches into vectors
locs = [collect(x[1]:x[2]) for x in locs]
# Remove ending REM/Wake stages as well as in-between REM, Wake, and Stage
# 1 epochs.
for nrem in locs
indx = findall(x -> staging[x] != "2" && staging[x] != "3" && staging[x] != "4", nrem)
deleteat!(nrem, indx)
end
return locs
end
A slightly more sophisticated version of this algorithm was implemented in my EEGToolkit Julia package.