SageDIA Parameters

Full parameter reference for `sage-dia`. Run `sage-dia --help` for the latest options.

Input / Output

Positional: `<mzparquet>`

Input spectrum files in mzParquet format. Accepts multiple files and glob patterns.

`--mzbinary <FILES...>`

Pre-computed quant files from a previous --quant-only run. Used for two-phase workflows.

`--library <FILE>` (default: `""`)

Spectral library file. Supported formats: .tsv, .predicted.tsv, .parquet, .sagelib (fast binary cache).

A .sagelib cache is automatically created next to the library on first use, speeding up subsequent loads.

`--output <FILE>` (default: `r.tsv`)

Output file path for the main results TSV.

Search Parameters

`--scan-radius <INT>` (default: `6`)

Number of cycles around each candidate peak to extract for feature computation.
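As a rough illustration of what a scan radius means, the sketch below selects the cycle indices around a candidate apex. It assumes the radius is applied symmetrically on both sides and clipped at the run boundaries; `cycle_window` and its arguments are hypothetical names, not part of sage-dia.

```python
def cycle_window(apex_cycle: int, n_cycles: int, radius: int = 6) -> range:
    """Cycle indices within `radius` cycles of the apex, clipped to the run."""
    start = max(0, apex_cycle - radius)
    stop = min(n_cycles, apex_cycle + radius + 1)  # +1 because range() excludes stop
    return range(start, stop)


# A peak in the middle of the run gets the full 2 * radius + 1 cycles.
middle = list(cycle_window(apex_cycle=10, n_cycles=100))
# A peak near the start of the run is truncated on the left.
edge = list(cycle_window(apex_cycle=2, n_cycles=100))
print(len(middle), len(edge))
```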

`--min-product-len <INT>` (default: `3`)

Minimum fragment ion length (number of amino acid residues).

`--min-charge <INT>` (default: `1`)

Minimum fragment ion charge state.

`--max-charge <INT>` (default: `255`)

Maximum fragment ion charge state. Set to 255 for no limit.

`--top-n-frags <INT>` (default: `12`)

Maximum number of fragment ions per precursor to use for scoring.

`--precursor-isotope <INT>` (default: `2`)

Number of precursor isotope channels (M0, M1, ...) to extract.

`--mass-ppm-tol <FLOAT>` (default: `0.0`)

Fixed mass tolerance in ppm. When 0.0, tolerance is determined automatically from calibration.
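To make the unit concrete, here is a minimal sketch of how a ppm tolerance translates into an absolute m/z window. The helper `ppm_window` is hypothetical and only illustrates the arithmetic; it says nothing about how sage-dia applies the tolerance internally.

```python
def ppm_window(mz: float, ppm: float) -> tuple[float, float]:
    """Return the (low, high) m/z bounds for a symmetric ppm tolerance."""
    delta = mz * ppm / 1e6  # 1 ppm is one millionth of the m/z value
    return mz - delta, mz + delta


# A 10 ppm tolerance at m/z 500 spans +/- 0.005 m/z.
low, high = ppm_window(500.0, 10.0)
print(f"{low:.4f} .. {high:.4f}")
```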

`--label <STRING>` (default: `""`)

SILAC/label definition. Format: <AminoAcid>:<MassDelta>.

`--custom-mod <STRING>` (default: `""`)

Custom modification substitution. Format: <Mod1>:<Mass1>,<Mod2>:<Mass2>.
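The two formats above can be read as simple `name:mass` pairs. The sketch below parses them under that assumption; the function names, the modification names, and the example masses (Lys8 label, carbamidomethyl, oxidation) are illustrative and not taken from sage-dia.

```python
def parse_label(spec: str) -> tuple[str, float]:
    """Parse '<AminoAcid>:<MassDelta>' into (residue, mass delta)."""
    residue, mass = spec.split(":", 1)
    return residue, float(mass)


def parse_custom_mods(spec: str) -> dict[str, float]:
    """Parse '<Mod1>:<Mass1>,<Mod2>:<Mass2>' into a name -> mass mapping."""
    mods: dict[str, float] = {}
    if not spec:  # the default "" means no custom modifications
        return mods
    for item in spec.split(","):
        # Split on the last colon so a name that itself contains ':' survives.
        name, mass = item.rsplit(":", 1)
        mods[name.strip()] = float(mass)
    return mods


label = parse_label("K:8.014199")
mods = parse_custom_mods("Carbamidomethyl:57.021464,Oxidation:15.994915")
print(label, mods)
```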

Predictor / Scoring

`--predictor <MODE>` (default: `auto`)

Scoring model for discriminant analysis.

| Value | Description |
| --- | --- |
| `auto` | Try both LDA and XGBoost and keep whichever gives more IDs (default) |
| `lda` | Linear Discriminant Analysis; faster |
| `xgboost` | Gradient boosting; more accurate on complex datasets |

`--xgboost-iterations <INT>` (default: `5`)

Number of XGBoost training iterations (only used when predictor is xgboost or auto).

`--single-lda` (default: `false`)

Use a single LDA optimization iteration instead of the default 10. Faster but may reduce sensitivity.

`--peak-detection <MODE>` (default: `corr`)

Peak candidate detection method.

| Value | Description |
| --- | --- |
| `corr` | Inter-fragment correlation local maxima (default, DIA-NN-like) |
| `sa` | NNLS × spectral angle local maxima only |
| `combined` | Both SA and fragment-correlation maxima |

`--disable-r2-feature` (default: `false`)

Remove Gaussian R² from discriminant scoring features. May help on very noisy data where R² penalizes real peaks.

Quality Filters

`--min-points-per-peak <INT>` (default: `3`)

Minimum number of non-zero XIC data points for a peak to be reported.

`--min-points-per-peak-calib <INT>` (default: `1`)

Minimum XIC data points during calibration (more permissive to retain calibrants).

`--light-heavy-min-correlation <FLOAT>` (default: `0.5`)

Minimum Pearson correlation between light and heavy channel XICs for labeled searches.

Post-Search Boosting

These options adjust the identification list after the initial FDR calculation by applying extra quality criteria. Two of them rescue borderline identifications, and one removes weakly supported proteins.

`--gaussian-r2-boost` (default: `true`)

Rescue precursors with excellent chromatographic peak shape (high Gaussian R²) even if initial q-value > 0.01.

| Sub-parameter | Default | Description |
| --- | --- | --- |
| `--gaussian-r2-boost-threshold` | `0.7` | Minimum R² to qualify |
| `--gaussian-r2-boost-qvalue` | `0.05` | Maximum q-value to consider for rescue |
| `--gaussian-r2-boost-min-xic` | `3` | Minimum XIC data points required |
| `--gaussian-r2-boost-min-frags` | `3` | Minimum fragment ions at apex |

Disable with: --gaussian-r2-boost false
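Read together, the four sub-parameters describe a simple rescue rule. The sketch below encodes one plausible reading of it; the `Precursor` record, the `is_rescued` function, and the choice of inclusive comparisons are assumptions made for illustration, not sage-dia's actual code.

```python
from dataclasses import dataclass


@dataclass
class Precursor:
    q_value: float
    gaussian_r2: float
    xic_points: int
    frags_at_apex: int


def is_rescued(
    p: Precursor,
    r2_threshold: float = 0.7,
    max_qvalue: float = 0.05,
    min_xic: int = 3,
    min_frags: int = 3,
    base_qvalue: float = 0.01,
) -> bool:
    """True when a precursor that missed the base q-value cut qualifies for rescue."""
    if p.q_value <= base_qvalue:
        return False  # already accepted, nothing to rescue
    return (
        p.q_value <= max_qvalue
        and p.gaussian_r2 >= r2_threshold
        and p.xic_points >= min_xic
        and p.frags_at_apex >= min_frags
    )


borderline = Precursor(q_value=0.03, gaussian_r2=0.85, xic_points=6, frags_at_apex=4)
poor_shape = Precursor(q_value=0.03, gaussian_r2=0.40, xic_points=6, frags_at_apex=4)
print(is_rescued(borderline), is_rescued(poor_shape))
```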

`--protein-context-boost` (default: `true`)

Relax FDR threshold for peptides belonging to proteins already confidently identified.

| Sub-parameter | Default | Description |
| --- | --- | --- |
| `--protein-context-qvalue` | `0.05` | Relaxed q-value threshold for boosted peptides |
| `--protein-context-conf-qvalue` | `0.01` | Q-value threshold defining "confident" proteins |
| `--protein-context-r2-threshold` | `0.3` | R² threshold for initial confident protein identification |

Disable with: --protein-context-boost false

`--filter-one-hit-wonders` (default: `true`)

Remove proteins supported by only a single low-quality peptide.

| Sub-parameter | Default | Description |
| --- | --- | --- |
| `--one-hit-r2-threshold` | `0.5` | Minimum R² for a one-hit wonder to survive |
| `--one-hit-min-frags` | `3` | Minimum fragments at apex for one-hit wonders |

Disable with: --filter-one-hit-wonders false
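The filter can be pictured as grouping peptides by protein and dropping any protein whose only peptide fails the quality thresholds. The following is a minimal sketch under that reading; the dictionary keys and the function name are hypothetical.

```python
from collections import defaultdict


def filter_one_hit_wonders(peptides: list[dict], r2_threshold: float = 0.5, min_frags: int = 3) -> list[dict]:
    """Drop proteins supported by a single peptide that fails either threshold."""
    by_protein: dict[str, list[dict]] = defaultdict(list)
    for pep in peptides:
        by_protein[pep["protein"]].append(pep)

    kept: list[dict] = []
    for peps in by_protein.values():
        if len(peps) == 1:
            only = peps[0]
            if only["r2"] < r2_threshold or only["frags"] < min_frags:
                continue  # single low-quality peptide: remove the protein
        kept.extend(peps)
    return kept


peptides = [
    {"protein": "P1", "peptide": "AAAK", "r2": 0.9, "frags": 5},
    {"protein": "P1", "peptide": "CCCK", "r2": 0.2, "frags": 2},  # kept: P1 has two peptides
    {"protein": "P2", "peptide": "DDDK", "r2": 0.3, "frags": 4},  # removed: one hit, low R²
    {"protein": "P3", "peptide": "EEEK", "r2": 0.8, "frags": 3},  # kept: one hit, good quality
]
kept = filter_one_hit_wonders(peptides)
print(sorted({p["protein"] for p in kept}))
```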

Match-Between-Runs (MBR)

MBR transfers confident identifications across runs to fill missing values, improving data completeness in multi-file experiments. MBR items are marked in the output and do not participate in FDR calculation.

`--mbr` (default: `true`)

Enable match-between-runs. Only effective when processing multiple files.

Disable with: --mbr false

| Sub-parameter | Default | Description |
| --- | --- | --- |
| `--mbr-rt-window` | `1.0` | RT window in minutes for MBR peak matching |
| `--mbr-score-threshold` | `0.3` | Score percentile cutoff (0.3 = keep top 70%) |
| `--mbr-min-r2` | `0.7` | Minimum Gaussian R² for MBR transfers |
| `--mbr-min-frags` | `2` | Minimum fragments at apex |
| `--mbr-min-xic-points` | `3` | Minimum XIC data points |
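The percentile cutoff is easy to misread, so here is a small sketch of the "0.3 = keep top 70%" arithmetic. It assumes the percentile is taken over the scores of the transfer candidates themselves; `apply_percentile_cutoff` is a hypothetical helper.

```python
def apply_percentile_cutoff(scores: list[float], percentile: float = 0.3) -> list[float]:
    """Drop the lowest-scoring fraction of candidates and keep the rest."""
    ranked = sorted(scores)
    n_drop = round(len(ranked) * percentile)  # number of candidates below the cutoff
    return ranked[n_drop:]


scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
kept_scores = apply_percentile_cutoff(scores)
print(len(kept_scores), min(kept_scores))
```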

Interference Removal

`--remove-interference` (default: `false`)

Remove precursors whose fragment ions overlap with a higher-scoring precursor in the same DIA window.

| Sub-parameter | Default | Description |
| --- | --- | --- |
| `--interference-ppm` | `20.0` | Mass tolerance (ppm) for fragment overlap detection |
| `--interference-min-overlap` | `0.5` | Minimum fraction of overlapping fragments |
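To show how the two sub-parameters interact, the sketch below computes the fraction of a precursor's fragments that fall within a ppm tolerance of any fragment from a higher-scoring precursor. The function name and the m/z values are invented for the example, and the real overlap test may differ.

```python
def overlap_fraction(frags: list[float], other_frags: list[float], ppm: float = 20.0) -> float:
    """Fraction of `frags` matching any m/z in `other_frags` within `ppm`."""
    if not frags:
        return 0.0
    hits = 0
    for mz in frags:
        tol = mz * ppm / 1e6  # absolute tolerance at this m/z
        if any(abs(mz - other) <= tol for other in other_frags):
            hits += 1
    return hits / len(frags)


lower_scoring = [300.15, 450.25, 600.35, 750.45]
higher_scoring = [300.153, 450.251, 900.0]

fraction = overlap_fraction(lower_scoring, higher_scoring)
# With the default minimum overlap of 0.5 this precursor would be flagged.
print(fraction, fraction >= 0.5)
```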

RT Calibration

`--rt-calibration-method <METHOD>` (default: `lowess`)

| Value | Description |
| --- | --- |
| `lowess` | Non-parametric LOWESS regression (default). May flatten at RT edges. |
| `linear` | Simple linear regression (DIA-NN style). More robust at extremes. |

`--calibration-correlation <MODE>` (default: `L`)

Which channels to use for RT calibration in labeled experiments.

| Value | Description |
| --- | --- |
| `L` | Light fragments only (default) |
| `H` | Heavy fragments only |
| `LH` | Both light and heavy fragments |

Gaussian R² Settings

`--gaussian-r2-smooth` (default: `true`)

Smooth XICs before computing Gaussian R² fit.

`--gaussian-r2-centroid` (default: `true`)

Use intensity-weighted centroid (instead of apex) as the center for Gaussian fitting.
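The difference between centering on the centroid and on the apex can be sketched with a moment-based Gaussian model. This is a simplified stand-in for a real curve fit: the width comes from the intensity-weighted variance and the height from the maximum intensity, which is an assumption and not necessarily how sage-dia computes its Gaussian R².

```python
import math


def gaussian_r2(rt: list[float], intensity: list[float], use_centroid: bool = True) -> float:
    """R² between an XIC and a Gaussian built from its own moments."""
    total = sum(intensity)
    if total <= 0:
        return 0.0
    centroid = sum(t * i for t, i in zip(rt, intensity)) / total
    apex = rt[max(range(len(intensity)), key=intensity.__getitem__)]
    center = centroid if use_centroid else apex

    # Intensity-weighted variance around the chosen center gives the peak width.
    var = sum(i * (t - center) ** 2 for t, i in zip(rt, intensity)) / total
    if var <= 0:
        return 0.0
    height = max(intensity)
    model = [height * math.exp(-((t - center) ** 2) / (2 * var)) for t in rt]

    mean_i = total / len(intensity)
    ss_res = sum((y - m) ** 2 for y, m in zip(intensity, model))
    ss_tot = sum((y - mean_i) ** 2 for y in intensity)
    return 1 - ss_res / ss_tot if ss_tot > 0 else 0.0


rt = [i * 0.1 for i in range(-30, 31)]
clean = [math.exp(-(t ** 2) / (2 * 0.5 ** 2)) for t in rt]  # ideal Gaussian peak
spiky = [0, 100, 0, 90, 0, 95, 0]                           # jagged, non-Gaussian trace

r2_clean = gaussian_r2(rt, clean)
r2_spiky = gaussian_r2([float(i) for i in range(7)], spiky)
print(r2_clean > r2_spiky)
```

A clean peak scores close to 1 with either center, while a jagged trace scores much lower, which is the property the R²-based options in this reference rely on.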

XIC Output

`--gen-xic` (default: `false`)

Generate XIC trace files alongside main results. Produces both .xic.tsv and .xic.db (SQLite).

`--xic-rt-tolerance <FLOAT>` (default: `1.0`)

Extra RT padding (in minutes) around each peak for XIC extraction.

`--xic-max-qvalue <FLOAT>` (default: `0.01`)

Maximum q-value for precursors to include in XIC output.

`--xic-only` (default: `false`)

Extract XICs for all library precursors without running the search/scoring pipeline. Useful for benchmarking and debugging.

Library Conversion

Standalone utilities to convert spectral libraries between formats. The process exits after conversion.

`--convert-lib <OUTPUT>`

Convert the input --library to fast binary .sagelib format.

`--convert-lib-parquet <OUTPUT>`

Convert a TSV library to Parquet format.

`--convert-sagelib-tsv <OUTPUT>`

Convert a .sagelib binary library back to TSV.

Workflow Options

`--quant-only` (default: `false`)

Run search only (no aggregation/output). Used with --keep-quant-files for two-phase workflows.

`--keep-quant-files` (default: `false`)

Retain intermediate .quant files after search. Required for two-phase merge workflows.

`--save-partial-results` (default: `false`)

Write per-file partial result TSVs during multi-file search.

`--no-search` (default: `false`)

Skip the search phase entirely. Used when re-processing existing quant files.
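One way the two-phase options might be combined is sketched below by assembling the command lines in Python without running them. The file names are hypothetical, and two points are assumptions rather than documented behavior: that default-false flags are passed bare, and that the second phase pairs `--no-search` with `--mzbinary`. Check `sage-dia --help` before relying on this.

```python
import shlex

library = "library.tsv"                      # hypothetical library path
runs = ["run1.mzparquet", "run2.mzparquet"]  # hypothetical input files

# Phase 1: search each run and keep the intermediate quant files.
phase1 = [
    ["sage-dia", "--quant-only", "--keep-quant-files", "--library", library, run]
    for run in runs
]

# Phase 2: merge the retained quant files into one result table.
quant_files = ["run1.quant", "run2.quant"]   # hypothetical names of the kept files
phase2 = [
    "sage-dia", "--no-search",
    "--mzbinary", *quant_files,
    "--library", library,
    "--output", "merged.tsv",
]

for cmd in phase1:
    print(shlex.join(cmd))
print(shlex.join(phase2))
```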

`--no-bounds` (default: `false`)

Disable RT bounds restriction during search (search full RT range for every precursor).

`--smoothing` (default: `false`)

Apply XIC smoothing during extraction.

`--common-frags` (default: `true`)

Use common fragment ions across runs for consistent quantification.

Peak Quantification

`--quant-peak-find` (default: `true`)

Use peak detection to define quantification boundaries, excluding chromatographic shoulders.

`--best-frag-xic-consecutive` (default: `true`)

Count consecutive non-zero data points (rather than total) for the best fragment XIC metric.
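The distinction between the two counting modes is small but matters for gappy traces. A minimal sketch with hypothetical helper names:

```python
def total_nonzero(xic: list[float]) -> int:
    """Number of non-zero points anywhere in the trace."""
    return sum(1 for v in xic if v > 0)


def longest_nonzero_run(xic: list[float]) -> int:
    """Length of the longest unbroken stretch of non-zero points."""
    best = run = 0
    for v in xic:
        run = run + 1 if v > 0 else 0  # a zero resets the current run
        best = max(best, run)
    return best


xic = [0, 5, 8, 0, 3, 6, 9, 4, 0, 2]
print(total_nonzero(xic), longest_nonzero_run(xic))
```

A trace with scattered signal can have many non-zero points in total but only a short consecutive run, so the consecutive count is the stricter of the two.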

`--median-rt` (default: `false`)

Use median RT from multiple candidates instead of best-scoring peak.

| Sub-parameter | Default | Description |
| --- | --- | --- |
| `--median-rt-candidates` | `3` | Number of top candidates to consider |
| `--median-rt-tolerance` | `0.3` | RT tolerance for candidate clustering |
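One possible reading of these options is sketched below: take the top-scoring candidates, find their median RT, and keep only those within the tolerance of that median before taking the final median. The clustering rule, the function name, and the fallback to the best-scoring candidate are all assumptions made for illustration.

```python
from statistics import median


def median_rt(candidates: list[tuple[float, float]], n_candidates: int = 3, tolerance: float = 0.3) -> float:
    """Pick an RT from (rt, score) candidates using a median over the top scorers."""
    top = sorted(candidates, key=lambda c: c[1], reverse=True)[:n_candidates]
    center = median(rt for rt, _ in top)
    cluster = [rt for rt, _ in top if abs(rt - center) <= tolerance]
    if not cluster:
        return top[0][0]  # fall back to the best-scoring candidate
    return median(cluster)


# (rt in minutes, score); the 15.10 candidate scores well but sits far from the others.
candidates = [(12.40, 0.90), (12.55, 0.80), (15.10, 0.85), (9.00, 0.20)]
chosen = median_rt(candidates)
print(round(chosen, 3))
```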

Reporting

`--generate-fragment-info` (default: `false`)

Include per-fragment scoring details in the output.

`--report-all-fragments` (default: `false`)

Report all fragment ions (not just top-N used for scoring).

`--report-all-decoys` (default: `false`)

Include decoy hits in the output.

`--no-scan` (default: scan output on)

Omit MS2 scan numbers from the output. Scan information is written by default.

Resource Management

`-t, --threads <INT>` (default: all CPUs)

Maximum number of threads to use.

`--max-memory-gb <FLOAT>` (default: available memory)

Maximum memory usage in GB.

General

`-v, --verbose` (default: `false`)

Enable detailed debug logging.

`--tracked-precursors <FILE>` (default: `""`)

File with a list of precursor sequences to track with detailed logging. One precursor per line.
