anndata

Read AnnData (.h5ad) files for single-cell genomics data analysis, with support for local and remote (HTTP/HTTPS/S3) files

Maintainer(s): honicky

Installing and Loading

INSTALL anndata FROM community;
LOAD anndata;

Example

-- Attach an AnnData file
ATTACH 'data.h5ad' AS scdata (TYPE ANNDATA);

-- Query cell metadata
SELECT * FROM scdata.obs LIMIT 10;

-- Query gene metadata
SELECT * FROM scdata.var LIMIT 10;

-- Query expression matrix
SELECT * FROM scdata.X LIMIT 10;

-- Detach when done
DETACH scdata;
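
A quick sanity check after attaching is to compare table sizes with the dimensions you expect. The sketch below uses only the obs and var tables and a standard count(*), so it assumes nothing about the columns in a particular file. The path data.h5ad and the alias scdata are the placeholders from the example above.

```sql
-- Attach the file (placeholder path and alias from the example above)
ATTACH 'data.h5ad' AS scdata (TYPE ANNDATA);

-- obs holds one row per observation (cell), var one row per variable (gene)
SELECT
    (SELECT count(*) FROM scdata.obs) AS n_cells,
    (SELECT count(*) FROM scdata.var) AS n_genes;

DETACH scdata;
```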

About anndata

The AnnData extension provides read-only access to AnnData (.h5ad) files, the standard format for single-cell genomics data.

ATTACH Syntax

-- Local file
ATTACH 'file.h5ad' AS name (TYPE ANNDATA);

-- Remote file via HTTPS
ATTACH 'https://example.com/data.h5ad' AS name (TYPE ANNDATA);

-- S3 file (requires httpfs extension and credentials)
INSTALL httpfs;
LOAD httpfs;
CREATE SECRET s3_secret (TYPE S3, KEY_ID 'xxx', SECRET 'xxx', REGION 'us-east-1');
ATTACH 's3://bucket/data.h5ad' AS name (TYPE ANNDATA);

-- With custom gene name/ID columns
ATTACH 'file.h5ad' AS name (TYPE ANNDATA, VAR_NAME_COLUMN 'gene_symbols', VAR_ID_COLUMN 'ensembl_id');

Available Tables

  • obs - Observation (cell) metadata
  • var - Variable (gene) metadata
  • X - Expression matrix (genes as columns)
  • obsm_* - Dimensional reductions (PCA, UMAP, etc.)
  • varm_* - Variable embeddings
  • layers_* - Alternative expression matrices
  • obsp_* - Cell-cell pairwise matrices
  • varp_* - Gene-gene pairwise matrices
  • uns - Unstructured metadata
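
Which obsm_*, varm_*, layers_*, obsp_*, and varp_* tables exist depends on the contents of the file, so it can help to list what was attached before querying. The following is a minimal sketch. It assumes the attached database is visible through DuckDB's standard information_schema views, which this page does not confirm for the extension, and it reuses the placeholder alias scdata.

```sql
ATTACH 'data.h5ad' AS scdata (TYPE ANNDATA);

-- List the tables (or views) exposed by the attached file
SELECT table_name, table_type
FROM information_schema.tables
WHERE table_catalog = 'scdata'
ORDER BY table_name;

-- Inspect the columns of one table before writing a query against it
DESCRIBE scdata.obs;
```

If the catalog query returns no rows, plain SHOW ALL TABLES; is another standard DuckDB statement worth trying.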

Table Functions

-- Core data
SELECT * FROM anndata_scan_obs('file.h5ad');
SELECT * FROM anndata_scan_var('file.h5ad');
SELECT * FROM anndata_scan_x('file.h5ad');

-- Dimensional reductions
SELECT * FROM anndata_scan_obsm('file.h5ad', 'X_pca');
SELECT * FROM anndata_scan_obsm('file.h5ad', 'X_umap');

-- Layers
SELECT * FROM anndata_scan_layers('file.h5ad', 'raw');

-- File info
SELECT * FROM anndata_info('file.h5ad');
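
Table function results can be used like any other DuckDB relation, for example as the source of a CREATE TABLE AS or COPY statement. As a sketch, the following copies cell metadata into a local table and exports it to Parquet. The names obs_local and obs.parquet are hypothetical, and file.h5ad is the placeholder used above.

```sql
-- Materialize cell metadata as a regular DuckDB table
-- (obs_local is a hypothetical table name)
CREATE TABLE obs_local AS
SELECT * FROM anndata_scan_obs('file.h5ad');

-- Export the same metadata to a Parquet file
-- (obs.parquet is a hypothetical output path)
COPY (SELECT * FROM anndata_scan_obs('file.h5ad'))
TO 'obs.parquet' (FORMAT PARQUET);
```

Materializing once avoids re-reading the .h5ad file on every query, which may matter most for remote files.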

Added Functions

| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| anndata_info | table | Returns metadata and structure information about an AnnData (.h5ad) file, including the number of observations, variables, available layers, embeddings, and other components. | NULL | [SELECT * FROM anndata_info('data.h5ad');] |
| anndata_scan_obs | table | Scans the observation (cell) metadata from an AnnData file. Returns all columns from the obs DataFrame including cell barcodes, cell types, and other annotations. | NULL | [SELECT * FROM anndata_scan_obs('data.h5ad') LIMIT 5;] |
| anndata_scan_var | table | Scans the variable (gene) metadata from an AnnData file. Returns all columns from the var DataFrame including gene IDs, gene names, and other annotations. | NULL | [SELECT * FROM anndata_scan_var('data.h5ad') LIMIT 5;] |
| anndata_scan_x | table | Scans the main expression matrix (X) from an AnnData file. Returns the matrix with observation indices as rows and gene names as columns. Automatically handles both dense and sparse matrix formats. | NULL | [SELECT obs_idx, CD3D, CD19, CD14 FROM anndata_scan_x('data.h5ad') LIMIT 5;] |
| anndata_scan_obsm | table | Scans observation embeddings (obsm) from an AnnData file. Common embeddings include PCA (X_pca), UMAP (X_umap), and t-SNE (X_tsne). | NULL | [SELECT * FROM anndata_scan_obsm('data.h5ad', 'X_umap') LIMIT 5;] |
| anndata_scan_varm | table | Scans variable embeddings (varm) from an AnnData file. These are gene-level embeddings such as PCA loadings. | NULL | [SELECT * FROM anndata_scan_varm('data.h5ad', 'PCs') LIMIT 5;] |
| anndata_scan_layers | table | Scans alternative expression matrices (layers) from an AnnData file. Common layers include raw counts, normalized data, or scaled data. | NULL | [SELECT * FROM anndata_scan_layers('data.h5ad', 'raw') LIMIT 5;] |
| anndata_scan_obsp | table | Scans observation pairwise matrices (obsp) from an AnnData file. These typically contain cell-cell distance or connectivity matrices. | NULL | [SELECT * FROM anndata_scan_obsp('data.h5ad', 'distances') LIMIT 5;] |
| anndata_scan_varp | table | Scans variable pairwise matrices (varp) from an AnnData file. These contain gene-gene relationship matrices. | NULL | [SELECT * FROM anndata_scan_varp('data.h5ad', 'correlations') LIMIT 5;] |
| anndata_hello | scalar | NULL | NULL | NULL |
| anndata_version | scalar | NULL | NULL | NULL |
| anndata_scan_uns | table | NULL | NULL | NULL |
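
Several of the functions listed above have no description or example. One way to find out how to call them is to query DuckDB's built-in duckdb_functions() catalog function after loading the extension. This sketch assumes nothing about the anndata functions beyond their names starting with anndata.

```sql
LOAD anndata;

-- Show the parameters and return type registered for each anndata function
SELECT function_name, function_type, parameters, parameter_types, return_type
FROM duckdb_functions()
WHERE function_name LIKE 'anndata%'
ORDER BY function_name;
```

The parameter_types column gives the argument types each function expects, which is a safer starting point than guessing a call signature for the undocumented entries.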