Read AnnData (.h5ad) files for single-cell genomics data analysis, with support for local and remote (HTTP/HTTPS/S3) files
Maintainer(s):
honicky
Installing and Loading
INSTALL anndata FROM community;
LOAD anndata;
Example
-- Attach an AnnData file
ATTACH 'data.h5ad' AS scdata (TYPE ANNDATA);
-- Query cell metadata
SELECT * FROM scdata.obs LIMIT 10;
-- Query gene metadata
SELECT * FROM scdata.var LIMIT 10;
-- Query expression matrix
SELECT * FROM scdata.X LIMIT 10;
-- Detach when done
DETACH scdata;
About anndata
The AnnData extension provides read-only access to AnnData (.h5ad) files, the standard format for single-cell genomics data.
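Because access is read-only, data that needs to be modified or reused outside the .h5ad file has to be copied elsewhere first. The following is a minimal sketch using standard DuckDB statements, not a workflow prescribed by the extension. It assumes the session's default database is writable; the names obs_local and obs.parquet are hypothetical.

```sql
-- Attach the AnnData file (read-only).
ATTACH 'data.h5ad' AS scdata (TYPE ANNDATA);

-- Copy the cell metadata into a regular table in the default database.
-- obs_local is a hypothetical name chosen for this example.
CREATE TABLE obs_local AS SELECT * FROM scdata.obs;

-- Alternatively, export the same data to a Parquet file.
-- obs.parquet is a hypothetical output path.
COPY (SELECT * FROM scdata.obs) TO 'obs.parquet' (FORMAT PARQUET);

DETACH scdata;
```

The copied table lives in the default database and remains available after the AnnData file is detached.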
ATTACH Syntax
-- Local file
ATTACH 'file.h5ad' AS name (TYPE ANNDATA);
-- Remote file via HTTPS
ATTACH 'https://example.com/data.h5ad' AS name (TYPE ANNDATA);
-- S3 file (requires httpfs extension and credentials)
INSTALL httpfs;
LOAD httpfs;
CREATE SECRET s3_secret (TYPE S3, KEY_ID 'xxx', SECRET 'xxx', REGION 'us-east-1');
ATTACH 's3://bucket/data.h5ad' AS name (TYPE ANNDATA);
-- With custom gene name/ID columns
ATTACH 'file.h5ad' AS name (TYPE ANNDATA, VAR_NAME_COLUMN 'gene_symbols', VAR_ID_COLUMN 'ensembl_id');
Available Tables
obs - Observation (cell) metadata
var - Variable (gene) metadata
X - Expression matrix (genes as columns)
obsm_* - Dimensional reductions (PCA, UMAP, etc.)
varm_* - Variable embeddings
layers_* - Alternative expression matrices
obsp_* - Cell-cell pairwise matrices
varp_* - Gene-gene pairwise matrices
uns - Unstructured metadata
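The tables behind the wildcard names (obsm_*, layers_*, and so on) depend on what a given file contains. One way to discover them is DuckDB's standard introspection statements. This is a sketch that assumes the extension exposes the attached tables to those statements and that the file is attached as scdata, as in the example above.

```sql
-- Attach the file, then list tables across all attached databases,
-- including those under scdata.
ATTACH 'data.h5ad' AS scdata (TYPE ANNDATA);
SHOW ALL TABLES;

-- Inspect the column names and types of the cell metadata table.
DESCRIBE scdata.obs;
```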
Table Functions
-- Core data
SELECT * FROM anndata_scan_obs('file.h5ad');
SELECT * FROM anndata_scan_var('file.h5ad');
SELECT * FROM anndata_scan_x('file.h5ad');
-- Dimensional reductions
SELECT * FROM anndata_scan_obsm('file.h5ad', 'X_pca');
SELECT * FROM anndata_scan_obsm('file.h5ad', 'X_umap');
-- Layers
SELECT * FROM anndata_scan_layers('file.h5ad', 'raw');
-- File info
SELECT * FROM anndata_info('file.h5ad');
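The table functions behave like any other DuckDB table source, so they can be aggregated or described without attaching the file. The sketch below uses only the functions listed above together with standard SQL. It assumes file.h5ad contains an X_umap embedding; n_cells and n_genes are illustrative aliases.

```sql
-- Count observations (cells) and variables (genes).
SELECT count(*) AS n_cells FROM anndata_scan_obs('file.h5ad');
SELECT count(*) AS n_genes FROM anndata_scan_var('file.h5ad');

-- Show the column names and types returned for an embedding
-- without reading the full result.
DESCRIBE SELECT * FROM anndata_scan_obsm('file.h5ad', 'X_umap');
```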
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| anndata_info | table | Returns metadata and structure information about an AnnData (.h5ad) file, including the number of observations, variables, available layers, embeddings, and other components. | NULL | [SELECT * FROM anndata_info('data.h5ad');] |
| anndata_scan_obs | table | Scans the observation (cell) metadata from an AnnData file. Returns all columns from the obs DataFrame including cell barcodes, cell types, and other annotations. | NULL | [SELECT * FROM anndata_scan_obs('data.h5ad') LIMIT 5;] |
| anndata_scan_var | table | Scans the variable (gene) metadata from an AnnData file. Returns all columns from the var DataFrame including gene IDs, gene names, and other annotations. | NULL | [SELECT * FROM anndata_scan_var('data.h5ad') LIMIT 5;] |
| anndata_scan_x | table | Scans the main expression matrix (X) from an AnnData file. Returns the matrix with observation indices as rows and gene names as columns. Automatically handles both dense and sparse matrix formats. | NULL | [SELECT obs_idx, CD3D, CD19, CD14 FROM anndata_scan_x('data.h5ad') LIMIT 5;] |
| anndata_scan_obsm | table | Scans observation embeddings (obsm) from an AnnData file. Common embeddings include PCA (X_pca), UMAP (X_umap), and t-SNE (X_tsne). | NULL | [SELECT * FROM anndata_scan_obsm('data.h5ad', 'X_umap') LIMIT 5;] |
| anndata_scan_varm | table | Scans variable embeddings (varm) from an AnnData file. These are gene-level embeddings such as PCA loadings. | NULL | [SELECT * FROM anndata_scan_varm('data.h5ad', 'PCs') LIMIT 5;] |
| anndata_scan_layers | table | Scans alternative expression matrices (layers) from an AnnData file. Common layers include raw counts, normalized data, or scaled data. | NULL | [SELECT * FROM anndata_scan_layers('data.h5ad', 'raw') LIMIT 5;] |
| anndata_scan_obsp | table | Scans observation pairwise matrices (obsp) from an AnnData file. These typically contain cell-cell distance or connectivity matrices. | NULL | [SELECT * FROM anndata_scan_obsp('data.h5ad', 'distances') LIMIT 5;] |
| anndata_scan_varp | table | Scans variable pairwise matrices (varp) from an AnnData file. These contain gene-gene relationship matrices. | NULL | [SELECT * FROM anndata_scan_varp('data.h5ad', 'correlations') LIMIT 5;] |
| anndata_hello | scalar | NULL | NULL | NULL |
| anndata_version | scalar | NULL | NULL | NULL |
| anndata_scan_uns | table | NULL | NULL | NULL |