NREM Periods

An electroencephalography (EEG) is typically staged in 30 second epochs. A NREM period is defined as any sequence of epochs satisfying the following conditions:

  • the sequence contains only NREM stages (stages $2$, $3$ or $4$);
  • the total amount of NREM sleep in the sequence is $15$ minutes or more;
  • the sequence is ended by either $a.$ 5 minutes or more of REM or $b.$ $5$ minutes or more of wakefulness.

Importantly, when parsing for a NREM period, the sequence may contain stages that aren't NREM stages. But these should be ignored unless they last for five minutes or more---i.e. unless they qualify as potential ending sequences for a NREM period. (If curious about where this definition comes from, see Feinberg & Floyed or study Dijk's papers.)

For example, assume we find 5 minutes of stage 2, followed by 3 minutes of REM, followed by 10 minutes of stage 3, followed by 10 minutes of REM or wake. Then the epochs in the 5 minutes of stage 2 and the 10 minutes of stage 3 must be marked as comprising a NREM period, ignoring the three minutes of REM in between.


Algorithm

I here provide an algorithm to automatically detect NREM periods. The algorithm must take a sequence of stages $s_1, \ldots, s_n$, where $s_i$ the sleep stage of the $i$th epoch, and output a sequence of $k$ vectors $\varrho_1, \ldots, \varrho_k$. Each $\varrho_i = (\varrho_i^1, \ldots, \varrho_i^z)$ must be s.t. $\varrho_{i}^{j}$ is the $j$th epoch in the $i$th NREM period.

From the specification above, it follows that the algorithm finds the number of NREM periods (i.e. $k$) and the epochs which comprise them. Thus, it provides a complete characterization of NREM periods during a night of sleep.

Formalization

Let $n$ be the number of epochs corresponding to $15$ minutes and $m$ the number of epochs corresponding to $5$ minutes. (In 30 second epochs, $n = 30, m = 10$).

Let $\Sigma =\{1, \ldots, 6, ?\}$, with $5$ marking REM, $6$ wakefulness, $?$ unknown/unstaged. The algorithm works by mapping $\vec{s}$ to $\alpha = s_1 \ldots s_q$ a word over $\Sigma^*$.

Now, $[(5+6)^(2+3+4)^]^* \subseteq \Sigma^{*}$ is partitioned into $U$ and $U’$, where $U$ is the set of words containing at least $n$ symbols $2, 3, 4$ where neither $5$ nor $6$ occur consecutively $m$ times. Then $\alpha$ can be decomposed into

$$\alpha = \psi_1 \phi_1 \psi_2 \phi_2 \ldots \psi_k \phi_k \psi_{k+1}$$

where $\phi_i = \varphi_i (5^m5^* + 6^m6^*)$ and $\varphi_i \in U$. Such a decomposition readily provides the number of NREM periods in the EEG (i.e. $k$). Furthermore, the epochs which comprise these periods are easily inferable from the decomposition.

Implementation

In order to obtain the decomposition

$$\alpha = \psi_1 \phi_1 \psi_2 \phi_2 \ldots \psi_k \phi_k \psi_{k+1}$$

some intermediate steps are necessary. In particular, $s_1 \ldots s_q \in \Sigma^*$ is mapped into a simpler alphabet $\Sigma' = \{X, I, 5, 6\}$, where $I$ are symbols that can be safely ingnored during parsing. The parts of the word that must not be ignored are deduced from the definition of NREM period. Then, the regular expression (in Julia)

ζ = Regex("(?:[X](I*)){$n,}(?:[5](I*){$m,}|6{$m,}|\$)")

provides the $\alpha$ decomposition given above by matching every $\phi_j$.

""" 
This functions separates repeating characters in a string into different words.
Example: '1122144434444556' ↦ '11 22 1 444 3 44444 55 6'

This is necessary because repeating REM or wake of certain length mark the 
end of NREM periods. Thus, from this separation we can infer which parts of 
the word are functionally releant and which can be ignored.
"""
function insert_spaces_on_change(str)
    # Initialize an empty string to build the modified version
    modified_str = ""

    # Iterate through the characters of the input string
    for (index, char) in enumerate(str)
        # Append the current character to the modified string
        modified_str *= char

        # Check if we're at the last character or the next character is different
        if index < length(str) && str[index] != str[index+1]
            # Append a space to separate different character sequences
            modified_str *= " "
        end
    end

    return modified_str
end

""" 
Given a stage vector, separates it using `insert_spaces_on_change` and maps 
it to a simpler word into the alphabet Σ = {I, X, 5, 6}.
"""
function stage_to_word(stages)

    # Given a stages s₁, ..., sₙ, produces a sentence s.t. each word 
    # in a sentence corresponds to segments of the staging with repeating 
    # stages.
    stages = replace(stages, "1" => "I", "?" => "I", "2" => "X", "3" => "X", "4" => "X")
    α = insert_spaces_on_change(join(stages, ""))

    # A function to transform the words of α, effectively mapping α 
    # onto a language more suited for regexes.
    function transform(w)
        if occursin('I', w) || occursin("X", w)
            return w
        end
        # If previous condition was not met, we have either `5` or `6`.
        if length(w) >= 10
            return (w)
        end
        repeat("I", length(w))
    end

    trans = transform.(split(α))
    return join(trans, "")
end

"""
NREM period detection.
"""
function nrem(staging::Vector, n::Integer=30, m::Integer=10)

    if Set(unique(staging))  Set(["1", "2", "3", "4", "5", "6", "?"])
        @error "Invalid stage score: Sleep stages should be 1, 2, 3, 4, 5, 6, ?, 
        where 5 marks REM, 6 wakefulness, and ? unknown/unscored."
    end

    α = stage_to_word(staging)
    ζ = Regex("(?:[X](I*)){$n,}(?:[5](I*){$m,}|6{$m,}|\$)")
 
    locs = []

    matches = eachmatch(ζ, α)

    for match in matches
        begins = s + match.offset - 1
        loc = [begins, begins + length(match.match)]
        insert!(locs, length(locs), loc)
    end

    # Collect all matches into vectors
    locs = [collect(x[1]:x[2]) for x in locs]

    # Remove ending REM/Wake stages as well as in-between REM, Wake, and Stage
    # 1 epochs.
    for nrem in locs
        indx = findall(x -> staging[x] != "2" && staging[x] != "3" && staging[x] != "4", nrem)
        deleteat!(nrem, indx)
    end

    return locs
end

A slightly more sophisticated version of this algorithm was implemented in my EEGToolkit Julia package.