---
title: "Getting Started with PhysioExperiment"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with PhysioExperiment}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

## 1. Overview

**PhysioExperiment** is a Bioconductor-compatible R package for analyzing
multi-modal physiological signal data. It provides a unified data model built on
top of `SummarizedExperiment` that can store EEG, EMG, ECG, IMU, MoCap, and
other time-series signals alongside rich channel and event metadata.

Key features include:

- **Unified data model** -- the `PhysioExperiment` class wraps signal data,
  channel metadata, event markers, and sampling rate into a single object.
- **File I/O** -- read and write EDF/BDF, BrainVision, GDF, HDF5, BIDS, CSV,
  and MATLAB formats.
- **Signal processing** -- Butterworth/FIR/notch filtering, re-referencing,
  resampling, artifact detection, and epoching.
- **Time-frequency analysis** -- STFT spectrograms, Morlet wavelet transforms,
  and band power extraction.
- **Visualization** -- publication-quality plots for signals, multi-channel
  displays, PSD, ERPs, topographic maps, and spectrograms.
- **Statistical testing** -- pointwise t-tests, ANOVA, cluster-based
  permutation tests, effect sizes, and multiple comparison correction.
- **Database integration** -- DuckDB backend for efficient querying of large
  datasets.
- **GUI** -- a modern React-based graphical interface accessible from R.

## 2. Installation

Install PhysioExperiment from GitHub:

```{r install}
# Install from GitHub
if (!requireNamespace("remotes", quietly = TRUE))
  install.packages("remotes")
remotes::install_github("matsui-lab/PhysioExperiment")
```

Since PhysioExperiment depends on Bioconductor packages
(`SummarizedExperiment`, `S4Vectors`, `HDF5Array`), you may also need to
install the Bioconductor infrastructure:

```{r bioc-install}
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("SummarizedExperiment", "S4Vectors", "HDF5Array"))
```

## 3. Creating a PhysioExperiment

### 3.1 From scratch

The constructor takes an assay matrix (rows = time points, columns = channels),
channel metadata, and a sampling rate:

```{r create-basic}
library(PhysioExperiment)

# Simulate 4 seconds of 4-channel EEG at 250 Hz
set.seed(42)
n_time <- 1000
n_channels <- 4
sr <- 250

eeg_data <- matrix(rnorm(n_time * n_channels), nrow = n_time, ncol = n_channels)
colnames(eeg_data) <- c("Fz", "Cz", "Pz", "Oz")

pe <- PhysioExperiment(
  assays = list(raw = eeg_data),
  colData = S4Vectors::DataFrame(
    label = c("Fz", "Cz", "Pz", "Oz"),
    type  = rep("EEG", 4),
    unit  = rep("uV", 4)
  ),
  samplingRate = sr
)

pe
```

You can also supply `rowData` (per-time-point metadata) and a `metadata` list
for experiment-level information such as reference electrode or recording date.

### 3.2 From a CSV file

PhysioExperiment ships with a small sample EEG dataset. Load it with `readCSV()`:

```{r create-csv}
# Locate the bundled sample file
csv_path <- system.file("extdata", "sample_eeg.csv", package = "PhysioExperiment")

# Read the wide-format CSV (time column + channel columns)
pe <- readCSV(csv_path, time_col = "time", sampling_rate = 250)
pe
```

Other I/O functions include `readEDF()`, `readBrainVision()`, `readGDF()`,
`readHDF5()`, `readBIDS()`, and `readMATLAB()`.

### 3.3 Adding events

Events (triggers, stimuli, responses) are stored as `PhysioEvents` objects:

```{r add-events}
# Create stimulus events at 0.5 s intervals
events <- PhysioEvents(
  onset    = c(0.5, 1.0, 1.5),
  duration = c(0, 0, 0),
  type     = c("stimulus", "stimulus", "stimulus"),
  value    = c("target", "distractor", "target")
)

pe <- setEvents(pe, events)
getEvents(pe)
```

## 4. Exploring the Object

### 4.1 Printing and dimensions

Printing a `PhysioExperiment` gives a concise overview:

```{r explore-print}
pe

# Dimensions: time points x channels
dim(pe)

# Signal duration in seconds
duration(pe)
```

### 4.2 Accessing the sampling rate

```{r explore-sr}
samplingRate(pe)

# Change the sampling rate (metadata only -- does not resample the data)
samplingRate(pe) <- 500
samplingRate(pe)
samplingRate(pe) <- 250  # reset
```

### 4.3 Channel information

```{r explore-channels}
# Channel metadata table
channelInfo(pe)

# Channel labels
channelNames(pe)

# Number of channels
nChannels(pe)

# Per-channel summary statistics
summary(pe)
```

### 4.4 Assay access

Signal data is stored as *assays*. Each processing step typically adds a new
assay so you can always go back to the original data:

```{r explore-assays}
# List available assays
SummarizedExperiment::assayNames(pe)

# Access the raw data matrix (time x channels)
raw_data <- SummarizedExperiment::assay(pe, "raw")
dim(raw_data)
head(raw_data[, 1:2])
```

### 4.5 Subsetting

PhysioExperiment supports `[i, j]` subsetting where `i` selects time points and
`j` selects channels:

```{r explore-subset}
# First 500 time points, channels 1-2
pe_sub <- pe[1:500, 1:2]
dim(pe_sub)

# Extract a time window in seconds
pe_window <- extractWindow(pe, tmin = 0.5, tmax = 1.5)
dim(pe_window)

# Pick channels by name
pe_frontal <- pickChannels(pe, c("Fz", "Cz"))
channelNames(pe_frontal)

# Drop channels
pe_no_oz <- dropChannels(pe, "Oz")
channelNames(pe_no_oz)
```

## 5. Signal Processing

### 5.1 Butterworth filtering

`butterworthFilter()` applies a zero-phase IIR filter (via `signal::filtfilt`)
and stores the result in a new assay:

```{r filter-butter}
# Bandpass filter: keep 1-40 Hz (common for EEG)
pe <- butterworthFilter(pe, low = 1, high = 40, type = "pass")

# Lowpass filter at 30 Hz
pe <- butterworthFilter(pe, high = 30, type = "low", output_assay = "lowpass")

# Highpass filter at 0.5 Hz (remove DC drift)
pe <- butterworthFilter(pe, low = 0.5, type = "high", output_assay = "highpass")

# Check the new assays
SummarizedExperiment::assayNames(pe)
```

### 5.2 Notch filter

Remove power-line interference at 50 Hz (or 60 Hz) and its harmonics:

```{r filter-notch}
pe <- notchFilter(pe, freq = 50, harmonics = 2)
```

### 5.3 FIR filter

For linear-phase filtering, use `firFilter()`:

```{r filter-fir}
pe <- firFilter(pe, low = 1, high = 40, order = 100, type = "pass",
                output_assay = "fir_filtered")
```

### 5.4 Moving-average filter

A simple smoothing filter:

```{r filter-ma}
pe <- filterSignals(pe, window = 5, output_assay = "smoothed")
```

### 5.5 Detrending

Remove linear trends or the mean from each channel:

```{r detrend}
pe <- detrendSignal(pe, type = "linear", output_assay = "detrended")
```

### 5.6 Re-referencing

Re-referencing changes the reference electrode for EEG data. Three modes are
supported:

```{r rereference}
# Average reference (common for high-density EEG)
pe_avg <- rereference(pe, ref_type = "average")
isAverageReferenced(pe_avg)  # TRUE

# Single-channel reference (e.g., Cz)
pe_cz <- rereference(pe, ref_type = "channel", ref_channels = "Cz")
getCurrentReference(pe_cz)  # "Cz"

# Linked-mastoids reference (average of two channels)
# pe_linked <- rereference(pe, ref_type = "channels",
#                          ref_channels = c("M1", "M2"))
```

## 6. Epoching and Averaging

Epoching segments continuous data into time-locked trials around events.

### 6.1 Creating epochs

```{r epoch-basic}
# Ensure events are present
pe <- setEvents(pe, PhysioEvents(
  onset = c(0.5, 1.0, 1.5),
  type  = "stimulus",
  value = c("target", "distractor", "target")
))

# Epoch: 200 ms before to 800 ms after each stimulus
pe_epochs <- epochData(pe, tmin = -0.2, tmax = 0.8, event_type = "stimulus")
pe_epochs

# The epoched data is now 4D: time x channel x epoch x sample
dim(SummarizedExperiment::assay(pe_epochs, "epoched"))
```

### 6.2 Baseline correction and artifact rejection

```{r epoch-advanced}
# Epoch with baseline correction (-200 to 0 ms)
pe_epochs_bl <- epochData(pe, tmin = -0.2, tmax = 0.8,
                           baseline = c(-0.2, 0))

# Epoch with artifact rejection (reject epochs with amplitude > 100 uV)
pe_epochs_clean <- epochData(pe, tmin = -0.2, tmax = 0.8,
                              baseline = c(-0.2, 0),
                              reject = 100)
```

### 6.3 Averaging epochs

Compute the event-related potential (ERP) by averaging across epochs:

```{r epoch-average}
# Average all epochs
pe_erp <- averageEpochs(pe_epochs)

# Average by condition (group by event_type column in epoch_info)
pe_erp_cond <- averageEpochs(pe_epochs, by = "event_type")

# Grand average across multiple subjects
# pe_grand <- grandAverage(pe_erp_subj1, pe_erp_subj2, pe_erp_subj3)
```

## 7. Time-Frequency Analysis

### 7.1 Spectrogram (STFT)

```{r timefreq-stft}
# Compute spectrogram for channel 1
spec <- spectrogram(pe, channel = 1, window_size = 128, overlap = 0.75)

# The result is a list with power, frequencies, and times
names(spec)
dim(spec$power)  # frequency x time
range(spec$frequencies)

# Plot the spectrogram
plotSpectrogram(spec, freq_range = c(1, 50))
```

### 7.2 Wavelet transform

Morlet wavelet decomposition gives better frequency resolution at low
frequencies and better time resolution at high frequencies:

```{r timefreq-wavelet}
# Wavelet transform from 1 to 40 Hz
wt <- waveletTransform(pe, frequencies = seq(1, 40), n_cycles = 7, channel = 1)

# Access power and phase matrices (frequency x time)
dim(wt$power)
dim(wt$phase)
```

### 7.3 Band power

Extract power in standard EEG frequency bands:

```{r timefreq-bandpower}
# Default bands: delta, theta, alpha, beta, gamma
bp <- bandPower(pe)
bp

# Relative band power (proportions summing to 1)
bp_rel <- bandPower(pe, relative = TRUE)
bp_rel

# Custom frequency bands
custom_bands <- list(
  low_alpha  = c(8, 10),
  high_alpha = c(10, 13),
  low_beta   = c(13, 20),
  high_beta  = c(20, 30)
)
bp_custom <- bandPower(pe, bands = custom_bands)
bp_custom
```

### 7.4 Hilbert transform

Extract instantaneous amplitude (envelope) and phase:

```{r timefreq-hilbert}
# Compute the analytic signal
pe <- hilbertTransform(pe, output_assay = "analytic")

# Extract amplitude envelope
pe <- instantaneousAmplitude(pe)

# Extract instantaneous phase
pe <- instantaneousPhase(pe)

SummarizedExperiment::assayNames(pe)
```

### 7.5 FFT

Compute the magnitude spectrum for all channels:

```{r timefreq-fft}
pe <- fftSignals(pe)
SummarizedExperiment::assayNames(pe)  # now includes "fft"
```

## 8. Visualization

PhysioExperiment provides several `ggplot2`-based plotting functions.

### 8.1 Single-channel trace

```{r vis-signal}
# Plot channel 1 from the default (first) assay
plotSignal(pe, channel = 1)

# Plot a specific assay
plotSignal(pe, channel = 2, assay_name = "filtered")
```

### 8.2 Multi-channel display

```{r vis-multi}
# Butterfly plot (all channels overlaid, colour-coded)
plotMultiChannel(pe, style = "butterfly")

# Stacked plot (channels offset vertically for readability)
plotMultiChannel(pe, style = "stacked")

# Select specific channels
plotMultiChannel(pe, channels = c(1, 3), style = "butterfly")
```

### 8.3 Power spectral density

```{r vis-psd}
# PSD of all channels (up to 5 by default)
plotPSD(pe)

# Restrict frequency range and select channels
plotPSD(pe, channels = c(1, 4), freq_range = c(1, 50))

# Linear scale (default is log)
plotPSD(pe, log_scale = FALSE, freq_range = c(0, 60))
```

### 8.4 Event-related potential (ERP)

Plot the average waveform across epochs with a confidence interval:

```{r vis-erp}
# Basic ERP with 95% CI shading
plotERP(pe_epochs, channel = 1)

# Show individual epoch traces behind the average
plotERP(pe_epochs, channel = 1, show_epochs = TRUE, epoch_alpha = 0.15)

# No confidence interval
plotERP(pe_epochs, channel = 1, ci = NULL)
```

### 8.5 Topographic map

Scalp topography requires electrode positions. Use `applyMontage()` to assign
standard 10-20 positions:

```{r vis-topo}
# Assign electrode positions from the 10-20 system
pe <- applyMontage(pe, "10-20")

# Plot topographic map at a specific time point
plotTopomap(pe, time = 0.5)

# Plot with custom values (e.g., band power or t-statistics)
plotTopomap(pe, values = c(1.2, 0.5, -0.8, -1.5))

# Series of topomaps across time
plots <- plotTopomapSeries(pe, times = c(0.1, 0.2, 0.3, 0.4, 0.5))
```

### 8.6 Spectrogram

```{r vis-spectrogram}
spec <- spectrogram(pe, channel = 1)
plotSpectrogram(spec, freq_range = c(1, 40))
```

## 9. Statistical Testing

### 9.1 Pointwise t-test

Test whether the ERP is significantly different from zero (one-sample) or
between two conditions (two-sample) at every time point and channel:

```{r stats-ttest}
# One-sample t-test (all epochs against zero)
res_t <- tTestEpochs(pe_epochs)
names(res_t)
dim(res_t$t_values)   # time x channel
dim(res_t$p_values)   # time x channel

# Two-sample t-test between condition groups
res_t2 <- tTestEpochs(pe_epochs, condition1 = 1:5, condition2 = 6:10)

# Paired t-test
res_tp <- tTestEpochs(pe_epochs, condition1 = 1:5, condition2 = 6:10,
                       paired = TRUE)
```

### 9.2 ANOVA

One-way ANOVA across three or more conditions:

```{r stats-anova}
# Requires epoch_info metadata with a grouping column
# Example: epochs labeled by condition "A", "B", "C"
res_anova <- anovaEpochs(pe_epochs, groups = "event_type")
names(res_anova)
dim(res_anova$f_values)  # time x channel
```

### 9.3 Cluster-based permutation test

Cluster permutation testing controls for multiple comparisons by identifying
contiguous clusters of significant effects and computing cluster-level p-values
through permutation:

```{r stats-cluster}
# Compare two conditions with 1000 permutations
res_clust <- clusterPermutationTest(
  pe_epochs,
  condition1 = 1:5,
  condition2 = 6:10,
  n_permutations = 1000,
  cluster_threshold = 0.05,
  seed = 123
)

# Significant clusters
res_clust$cluster_p

# Logical mask of significant time-channel locations
sum(res_clust$cluster_mask)
```

### 9.4 Multiple comparison correction

Apply correction to any p-value matrix:

```{r stats-mcc}
# FDR (Benjamini-Hochberg) correction
p_fdr <- correctPValues(res_t$p_values, method = "fdr")

# Bonferroni correction
p_bonf <- correctPValues(res_t$p_values, method = "bonferroni")

# Holm correction
p_holm <- correctPValues(res_t$p_values, method = "holm")
```

### 9.5 Effect size and confidence intervals

```{r stats-effect}
# Cohen's d between conditions
d <- effectSize(pe_epochs, condition1 = 1:5, condition2 = 6:10)
range(d$d, na.rm = TRUE)

# Bootstrap confidence interval for the ERP
boot <- bootstrapCI(pe_epochs, n_bootstrap = 1000, ci_level = 0.95, seed = 42)
dim(boot$ci_lower)   # time x channel
dim(boot$ci_upper)
```

### 9.6 Finding significant time windows

Given a vector of p-values over time, identify contiguous intervals that reach
significance:

```{r stats-windows}
# Extract p-values for channel 1
p_ch1 <- res_t$p_values[, 1]

# Find windows where p < 0.05
windows <- findSignificantWindows(p_ch1, times = res_t$times, alpha = 0.05)
windows
```

## 10. Database Integration

PhysioExperiment uses DuckDB for efficient storage and querying of large
datasets.

```{r db}
# Connect to an in-memory database
con <- connectDatabase()

# Or connect to a file-based database for persistent storage
# con <- connectDatabase("my_experiments.duckdb")

# (Future: store and query PhysioExperiment objects via the database)

# Always disconnect when done
disconnectDatabase(con)
```

## 11. Graphical User Interface

PhysioExperiment includes a browser-based GUI built with React, providing
interactive signal exploration without writing R code.

```{r gui}
# Launch the GUI in your default browser
launchGUI()

# Start on a custom port
launchGUI(port = 3000)

# Start the API server in the background (non-blocking)
server <- startAPIServer()
# ... do other work ...
server$kill()

# Check GUI dependencies
checkGUIDependencies()
```

The GUI supports data import, interactive visualization, preprocessing
pipelines, time-frequency analysis, statistical testing, and a visual workflow
builder.

## Session Information

```{r session-info}
sessionInfo()
```