I am working through the "RNA-Seq Data Pathway and Gene-set Analysis Workflows" tutorial, but the "Quick start" example does not run correctly on my system. Specifically, it fails at the summarizeOverlaps step with the following error:
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
'data' must be of a vector type, was 'NULL'
Here is the code I am running (lifted verbatim from the tutorial):
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
exByGn <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, "gene")
library(GenomicAlignments)
fls <- list.files("tophat_all/", pattern="bam$", full.names =T)
bamfls <- BamFileList(fls)
flag <- scanBamFlag(isSecondaryAlignment=FALSE, isProperPair=TRUE)
param <- ScanBamParam(flag=flag)
gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union",
ignore.strand=TRUE, singleEnd=FALSE, param=param)
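A sanity check that may help narrow this down (a sketch, not part of the tutorial): confirm that list.files() actually finds BAM files before they are passed to BamFileList(). It assumes the same relative directory "tophat_all/" as the code above and uses only base R; whether an empty file list is the cause here is a guess, not something the error message states.

```r
# Re-run the file lookup from the code above, using the same relative path.
fls <- list.files("tophat_all/", pattern = "bam$", full.names = TRUE)

# list.files() returns character(0) silently when the directory is missing
# or contains no matching files, so check the length explicitly.
n_bam <- length(fls)
print(n_bam)

# Show where R is resolving the relative path from.
print(getwd())

# Confirm every listed file really exists (logical(0) if none were found).
print(file.exists(fls))
```

If the count is 0, the relative path is probably being resolved from a different working directory than the one holding the tutorial data.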
This is the result of running traceback() after the error occurs:
8: array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),
NULL) else NULL)
7: as.matrix.default(do.call(cbind, cts))
6: as.matrix(do.call(cbind, cts))
5: as.matrix(do.call(cbind, cts))
4: .dispatchBamFiles(features, reads, mode, ignore.strand, inter.feature = inter.feature,
singleEnd = singleEnd, fragments = fragments, param = param,
preprocess.reads = preprocess.reads, ...)
3: .local(features, reads, mode, ignore.strand, ...)
2: summarizeOverlaps(exByGn, bamfls, mode = "Union", ignore.strand = TRUE,
singleEnd = FALSE, param = param)
1: summarizeOverlaps(exByGn, bamfls, mode = "Union", ignore.strand = TRUE,
singleEnd = FALSE, param = param)
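Frames 5 to 8 of the traceback show as.matrix(do.call(cbind, cts)) failing inside array(). A minimal base-R sketch of one way that call can fail, assuming `cts` (an internal list whose contents are not visible from the traceback) ends up empty, for example because no BAM files were processed:

```r
# Assumption: `cts` is an empty list; this is a guess at the internal state,
# not something the traceback confirms.
cts <- list()

# cbind() called with no arguments returns NULL.
combined <- do.call(cbind, cts)
print(is.null(combined))

# as.matrix(NULL) dispatches to as.matrix.default(), which calls array()
# on NULL and raises an error; capture the message instead of stopping.
msg <- tryCatch(
  {
    as.matrix(combined)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the printed message matches the one reported above, an empty input to summarizeOverlaps is a plausible explanation worth ruling out first.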
And this is my sessionInfo in case it helps:
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicAlignments_1.28.0 Rsamtools_2.8.0
[3] Biostrings_2.60.2 XVector_0.32.0
[5] SummarizedExperiment_1.22.0 MatrixGenerics_1.4.2
[7] matrixStats_0.60.0 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[9] GenomicFeatures_1.44.0 AnnotationDbi_1.54.1
[11] Biobase_2.52.0 GenomicRanges_1.44.0
[13] GenomeInfoDb_1.28.1 IRanges_2.26.0
[15] S4Vectors_0.30.0 BiocGenerics_0.38.0
loaded via a namespace (and not attached):
[1] estimability_1.3 rappdirs_0.3.3 rtracklayer_1.52.0
[4] pbdZMQ_0.3-5 R.methodsS3_1.8.1 tidyr_1.1.3
[7] ggplot2_3.3.5 bit64_4.0.5 knitr_1.33
[10] DelayedArray_0.18.0 R.utils_2.10.1 data.table_1.14.0
[13] rpart_4.1-15 KEGGREST_1.32.0 RCurl_1.98-1.3
[16] AnnotationFilter_1.16.0 generics_0.1.0 cowplot_1.1.1
[19] RSQLite_2.2.7 shadowtext_0.0.8 chron_2.3-56
[22] bit_4.0.4 enrichplot_1.12.2 base64url_1.4
[25] xml2_1.3.2 httpuv_1.6.1 assertthat_0.2.1
[28] viridis_0.6.1 xfun_0.25 hms_1.1.0
[31] evaluate_0.14 promises_1.2.0.1 DEoptimR_1.0-9
[34] fansi_0.5.0 restfulr_0.0.13 progress_1.2.2
[37] caTools_1.18.2 dendextend_1.15.1 dbplyr_2.1.1
[40] readxl_1.3.1 igraph_1.2.6 DBI_1.1.1
[43] geneplotter_1.70.0 htmlwidgets_1.5.3 purrr_0.3.4
[46] ellipsis_0.3.2 crosstalk_1.1.1 RCytoscape_1.24.1
[49] dplyr_1.0.7 backports_1.2.1 signal_0.7-7
[52] annotate_1.70.0 biomaRt_2.48.2 vctrs_0.3.8
[55] ensembldb_2.16.4 abind_1.4-5 cachem_1.0.5
[58] ggforce_0.3.3 Gviz_1.36.2 BSgenome_1.60.0
[61] robustbase_0.93-8 checkmate_2.0.0 emmeans_1.6.2-1
[64] treeio_1.16.1 prettyunits_1.1.1 cluster_2.1.2
[67] DOSE_3.18.1 fuzzyjoin_0.1.6 pacman_0.5.1
[70] ape_5.5 IRdisplay_1.0 lazyeval_0.2.2
[73] crayon_1.4.1 SetRank_1.1.0 uchardet_1.1.0
[76] genefilter_1.74.0 pkgconfig_2.0.3 tweenr_1.0.2
[79] qpcR_1.4-1 nlme_3.1-152 ProtGenerics_1.24.0
[82] nnet_7.3-16 rlang_0.4.11 RJSONIO_1.3-1.4
[85] lifecycle_1.0.0 downloader_0.4 filelock_1.0.2
[88] BiocFileCache_2.0.0 regutools_1.4.0 AnnotationHub_3.0.1
[91] dichromat_2.0-0 polyclip_1.10-0 cellranger_1.1.0
[94] graph_1.70.0 aplot_0.0.6 Matrix_1.3-4
[97] IRkernel_1.2 carData_3.0-4 boot_1.3-28
[100] base64enc_0.1-3 png_0.1-7 viridisLite_0.4.0
[103] rjson_0.2.20 bitops_1.0-7 R.oo_1.24.0
[106] KernSmooth_2.23-20 dplR_1.7.2 blob_1.2.2
[109] rgl_0.107.10 afex_1.0-1 stringr_1.4.0
[112] qvalue_2.24.0 jpeg_0.1-9 scales_1.1.1
[115] memoise_2.0.0 magrittr_2.0.1 plyr_1.8.6
[118] gplots_3.1.1 zlibbioc_1.38.0 scatterpie_0.1.6
[121] compiler_4.1.0 BiocIO_1.2.0 RColorBrewer_1.1-2
[124] lme4_1.1-27.1 DESeq2_1.32.0 lmerTest_3.1-3
[127] patchwork_1.1.1 htmlTable_2.2.1 Formula_1.2-4
[130] MASS_7.3-54 tidyselect_1.1.1 stringi_1.7.3
[133] forcats_0.5.1 XMLRPC_0.3-0 gage_2.42.0
[136] yaml_2.2.1 GOSemSim_2.18.1 locfit_1.5-9.4
[139] ggrepel_0.9.1 latticeExtra_0.6-29 grid_4.1.0
[142] VariantAnnotation_1.38.0 fastmatch_1.1-3 tools_4.1.0
[145] rio_0.5.27 rstudioapi_0.13 uuid_0.1-4
[148] foreign_0.8-81 genbankr_1.20.0 gridExtra_2.3
[151] farver_2.1.0 ggraph_2.0.5 rvcheck_0.1.8
[154] digest_0.6.27 BiocManager_1.30.16 shiny_1.6.0
[157] Rcpp_1.0.7 car_3.0-11 BiocVersion_3.13.1
[160] later_1.2.0 httr_1.4.2 minpack.lm_1.2-1
[163] biovizBase_1.40.0 ReadqPCR_1.38.0 colorspace_2.0-2
[166] XML_3.99-0.6 splines_4.1.0 tidytree_0.3.4
[169] graphlayouts_0.7.1 xtable_1.8-4 ggtree_3.0.2
[172] jsonlite_1.7.2 nloptr_1.2.2.2 tidygraph_1.2.0
[175] R6_2.5.0 Hmisc_4.5-0 pillar_1.6.2
[178] htmltools_0.5.1.1 mime_0.11 NormqPCR_1.38.0
[181] glue_1.4.2 fastmap_1.1.0 minqa_1.2.4
[184] clusterProfiler_4.0.2 BiocParallel_1.26.1 interactiveDisplayBase_1.30.0
[187] fgsea_1.18.0 mvtnorm_1.1-2 utf8_1.2.2
[190] lattice_0.20-44 tibble_3.1.3 numDeriv_2016.8-1.1
[193] curl_4.3.2 gtools_3.9.2 RamiGO_1.22.0
[196] zip_2.2.0 GO.db_3.13.0 openxlsx_4.2.4
[199] survival_3.2-11 repr_1.1.3 munsell_0.5.0
[202] DO.db_2.9 GenomeInfoDbData_1.2.6 RCy3_2.12.3
[205] haven_2.4.3 reshape2_1.4.4 gtable_0.3.0
I have checked the versions of all packages and of the Bioconductor release, and everything seems up to date. After some browsing, I noticed that the same error was reported back in 2016 at http://seqanswers.com/forums/archive/index.php/t-64313.html, but it was not resolved at the time.
Any advice on how I should proceed? If I can't even get through the tutorial, I doubt I will be able to use the package on my own data.