Add sampleData argument to store multi sample's metadata information (bulk, single-cell, spatial) in se object by lingminhao · Pull Request #542 · GoekeLab/bambu

lingminhao · 2026-02-15T14:09:08Z

This PR allows users to provide sample-specific metadata through the sampleData argument.

Supported formats:

Bulk data: a single .csv metadata file with a mandatory sampleName column is sufficient. Each row contains the metadata for one sample. Providing a separate .csv file for each sample is possible, but not necessary.
Single-cell / spatial data: one .csv metadata file per single-cell/spatial sample, each containing a mandatory barcode column.

If a specific sample lacks metadata, users can pass an NA value at the corresponding index of the input vector (e.g., c("metadata_sample1.csv", NA, "metadata_sample3.csv")).

Users can define any additional metadata columns as needed in the metadata .csv file. The metadata is stored in the colData of the se SummarizedExperiment object.
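As a rough illustration of the bulk format described above, the following base R sketch writes a small metadata CSV with the mandatory sampleName column and joins it onto a per-sample table, similar in spirit to what ends up in colData. The `condition` and `batch` columns, the sample names, and the helper `readSampleData()` are hypothetical and not taken from the PR's code; skipping NA entries this way is an assumption about the behaviour described above, not the actual implementation.

```r
# Hypothetical bulk metadata: one row per sample, mandatory sampleName column.
metadataFile <- tempfile(fileext = ".csv")
write.csv(
    data.frame(
        sampleName = c("sample1", "sample3"),
        condition = c("control", "treated"),
        batch = c("A", "B")
    ),
    metadataFile,
    row.names = FALSE
)

# Hypothetical helper (not part of bambu): read each non-NA path, skip NA entries.
readSampleData <- function(paths) {
    tables <- lapply(paths[!is.na(paths)], read.csv, stringsAsFactors = FALSE)
    if (length(tables) == 0) return(NULL)
    unique(do.call(rbind, tables))
}

# Samples known to the run; sample2 has no metadata row.
colData <- data.frame(sampleName = c("sample1", "sample2", "sample3"))
sampleData <- readSampleData(c(metadataFile, NA))

# Left join: keep every sample, fill missing metadata with NA.
joined <- merge(colData, sampleData, by = "sampleName", all.x = TRUE, sort = FALSE)
joined <- joined[match(colData$sampleName, joined$sampleName), ]
rownames(joined) <- joined$sampleName
print(joined)
```

In this sketch, a sample without a metadata row keeps its place in the table and simply receives NA in the extra columns.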

Copilot

Pull request overview

Adds a sampleData argument to bambu() to allow users to attach per-sample (bulk) or per-sample/per-barcode (single-cell/spatial) metadata from CSV files into the output SummarizedExperiment’s colData.

Changes:

Adds sampleData to bambu() and threads it into assignReadClasstoTranscripts().
Reworks generateColData() to left-join user-provided CSV metadata by sampleName (bulk) or barcode (demultiplexed).
Changes multi-sample SE assembly to carry forward per-sample colData into the combined SE.
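The left join by barcode mentioned in the list above might look roughly like the base R sketch below for a single demultiplexed sample. The barcodes, the `cellType` column, and the variable names are made up for illustration; the PR itself may use different functions, and the `sampleName_barcode` id format is taken from the diff snippets quoted later in this thread.

```r
# Hypothetical per-barcode table for one demultiplexed sample.
sampleName <- "sampleA"
barcodes <- c("AAAC", "AAAG", "AACT")
colData <- data.frame(
    id = paste(sampleName, barcodes, sep = "_"),
    sampleName = sampleName,
    barcode = barcodes
)

# Hypothetical user metadata with the mandatory barcode column.
# "AAAG" has no metadata, and "TTTT" does not occur in the data.
barcodeMetadata <- data.frame(
    barcode = c("AAAC", "AACT", "TTTT"),
    cellType = c("T cell", "B cell", "NK cell")
)

# Left join by barcode: every row of colData is kept, unmatched metadata is dropped.
idx <- match(colData$barcode, barcodeMetadata$barcode)
extraCols <- setdiff(names(barcodeMetadata), "barcode")
colData[extraCols] <- barcodeMetadata[idx, extraCols, drop = FALSE]
print(colData)
```

Barcodes without a metadata row receive NA, and metadata rows for barcodes that are absent from the data are ignored, which is the usual left-join behaviour.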

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.


File	Description
R/bambu.R	Adds `sampleData` param and passes it through quantification path; changes how combined `colData` is built.
R/bambu-assignDist.R	Extends `assignReadClasstoTranscripts()` signature to accept `sampleData` and uses new `generateColData()`.
R/bambu_utilityFunctions.R	Updates `combineCountSes()` to accept external colData list and rewrites `generateColData()` to join CSV metadata.
R/bambu-processReads_utilityConstructReadClasses.R	Formatting/brace cleanup in read-class construction.
R/bambu-processReads.R	Removes an unused `warnings` placeholder variable.



ch99l

After the requested code changes, code runs as expected.


jonathangoeke · 2026-03-31T03:50:22Z

R/bambu-processReads.R

-    mcols(readGrgList)$CB <- as.factor(mcols(readGrgList)$CB)
+
    if(!isFALSE(demultiplexed)){ 
+        mcols(readGrgList)$CB <- as.factor(mcols(readGrgList)$CB)


Should this conversion be done where CB is first defined?

jonathangoeke · 2026-03-31T03:51:03Z

R/bambu-processReads.R

+          sampleName = names(bam.file)[1]
+        )
+    }
+


sampleName <- names(bam.file)[1] is already defined (line 176) but not used?

Simplify the code here? For example:
    metadata(se)$sampleData <- tibble(id = names(bam.file)[1], sampleName = id)
    if (demultiplexed) {
        metadata(se)$sampleData <- metadata(se)$sampleData %>%
            mutate(
                barcode = levels(mcols(readGrgList)$CB),
                id = paste(sampleName, barcode, sep = '_')
            )
    }

jonathangoeke · 2026-03-31T03:51:15Z

R/bambu-processReads.R

        j = as.numeric(names(unlist(counts.table))),
        x = unlist(counts.table),
-        dims = c(nrow(eqClasses), length(metadata(readClassFile)$samples)))
+        dims = c(nrow(eqClasses), length(metadata(readClassFile)$sampleData$id)))


nrow(metadata(readClassFile)$sampleData) ? or define sampleIds <-.... then use length(sampleIds)

jonathangoeke · 2026-03-31T03:51:47Z

R/bambu_utilityFunctions.R

+    colData <- tibble(
+        id = paste(metadata(readClassList)$sampleData$sampleName, metadata(readClassList)$sampleData$barcode, sep = '_'),
+        sampleName = metadata(readClassList)$sampleData$sampleName,
+        barcode = metadata(readClassList)$sampleData$barcode
+    )
  } else{
-    colData$sampleName <- samples
+    colData <- tibble(id = metadata(readClassList)$sampleData$sampleName, sampleName = metadata(readClassList)$sampleData$sampleName)


duplicated? same as above
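One possible reading of this comment is that both branches could share a single code path. The base R sketch below shows that idea under stated assumptions: `buildColData()` is a hypothetical helper, not a function from bambu, and it takes a plain data frame rather than reading from `metadata(readClassList)`.

```r
# Hypothetical helper: build colData from a sampleData table, with or
# without a barcode column, using one code path.
buildColData <- function(sampleData) {
    hasBarcode <- "barcode" %in% names(sampleData)
    id <- if (hasBarcode) {
        paste(sampleData$sampleName, sampleData$barcode, sep = "_")
    } else {
        sampleData$sampleName
    }
    colData <- data.frame(id = id, sampleName = sampleData$sampleName)
    if (hasBarcode) colData$barcode <- sampleData$barcode
    colData
}

# Bulk: id equals sampleName. Demultiplexed: id is sampleName_barcode.
bulk <- buildColData(data.frame(sampleName = c("s1", "s2")))
demux <- buildColData(data.frame(sampleName = "s1", barcode = c("AAAC", "AAAG")))
print(bulk)
print(demux)
```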

jonathangoeke

see comments regarding code changes

lingminhao added 5 commits February 15, 2026 21:33

tidy up code

14f5547

refactor generateColData to take sampleData as argument

d63fdae

refactor combineCountSes to inherit colData directly from quantData

f86ce9c

update colData for pseudobulk single-cell

ed26b1f

add sampleData argument

f9554b5

lingminhao changed the base branch from devel to devel_pre_v4 February 15, 2026 14:09

lingminhao requested a review from Copilot February 15, 2026 14:09

lingminhao assigned ch99l Feb 15, 2026

Copilot started reviewing on behalf of lingminhao February 15, 2026 14:10

lingminhao added the bambu-dev Feature is implemented in development branch label Feb 15, 2026

Copilot AI reviewed Feb 15, 2026


GoekeLab deleted a comment from Copilot AI Feb 17, 2026

lingminhao added 4 commits February 19, 2026 09:02

remove spatial argument from bambu

0307855

rename colData parameter combineCountSes to colDataList (avoid same n…

8e25154

…ame as function)

update bambu sampleData parameter description

c29eab4

refine sampleData input check description

bfa131e

ch99l requested changes Feb 20, 2026

R/bambu_utilityFunctions.R (resolved)

R/bambu_utilityFunctions.R (outdated, resolved)

R/bambu_utilityFunctions.R (outdated, resolved)

R/bambu.R (outdated, resolved)

lingminhao added 3 commits February 20, 2026 14:00

tidy up spatial & sampleData argument

9c2d8ba

change sampleData to sampleMetadata in assignReadClasstoTranscripts f…

fda300b

…or variable clarity

fix bug: omit the check for NA elements in sampleData

e80cb39

ch99l requested changes Mar 2, 2026

R/bambu_utilityFunctions.R (resolved)

allow . csv/.tsv/.txt file input type in sampleData

7d60435

lingminhao force-pushed the generateColData branch from 65c11ec to 7d60435 Compare March 3, 2026 07:19

ch99l approved these changes Mar 9, 2026


ch99l requested a review from SuiYue-2308 March 9, 2026 10:04

jonathangoeke reviewed Mar 23, 2026

R/bambu_utilityFunctions.R (outdated, resolved)

R/bambu.R (resolved)

ch99l requested changes Mar 23, 2026

R/bambu_utilityFunctions.R (outdated, resolved)

lingminhao added 3 commits March 27, 2026 10:02

refactor: store sampleData in readClassList for parsing

8f7f506

update comment to describe CB/UMI parsing from bam

547e034

change priority in CB & UMI name extraction from bam file

586cdb3

lingminhao added 2 commits March 27, 2026 16:19

fix: standardize list access for all sample sizes

4bd48bf

remove redundant code

807ad06

ch99l approved these changes Mar 31, 2026

jonathangoeke reviewed Mar 31, 2026

4 participants
