
scfind - Fast searches of large collections of single cell data

Single cell technologies have made it possible to profile millions of cells, but for these resources to be useful they must be easy to query and access. To facilitate interactive and intuitive access to single cell data we have developed scfind (source available at https://github.com/hemberg-lab/scfind), a search engine for cell atlases. scfind can be used to evaluate marker genes, to perform in silico gating, and to identify both cell-type specific and housekeeping genes. An interactive web interface with 9 single cell datasets is available at https://scfind.sanger.ac.uk.

Q: What is this?

A: scfind is a search engine that makes single cell data accessible to a wide range of users by enabling sophisticated queries for large datasets through an interface which is both very fast and familiar to users from any background.

Q: How do I install and run scfind?

A: To install the latest development version of scfind, use the GitHub repository:

# Linux and Mac users, run this in your R session:
install.packages("devtools")
devtools::install_github("hemberg-lab/scfind")

library("scfind")

# For Windows users:
# Please install the latest version of Rtools at https://cran.r-project.org/bin/windows/Rtools/ prior to installation of scfind

Update: The latest version of scfind (3.5.1), released on 3rd October 2019, provides 2 datasets and 2 pre-processed scfind indexes as examples. To update to the latest version:

install.packages("devtools")
devtools::install_github("hemberg-lab/scfind", force = TRUE)

Q: Where can I find the scfind example datasets and indexes?

A: The latest version of the package provides a list of example SingleCellExperiment objects and scfind indexes created from the Tabula Muris Consortium data for your first scfind experience:

library("scfind")

# List of `Tabula Muris (FACS)` `SingleCellExperiment` objects
data(tmfacs)

# List of `Tabula Muris (10X)` `SingleCellExperiment` objects
data(tm10x)

Details on building a scfind index from a SingleCellExperiment object are described on this page.

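library("scfind")
library("SingleCellExperiment")

# Make the list of `Tabula Muris (FACS)` datasets available
data(tmfacs)

# To build the `Bladder` index from the first dataset in the list
sce.bladder <- readRDS(url(tmfacs[1]))
scfind.index <- buildCellTypeIndex(sce = sce.bladder,
                                   cell.type.label = "cell_type1",  # colData column holding the cell type labels
                                   dataset.name = "Bladder",        # name of this dataset within the index
                                   assay.name = "counts")           # assay used to build the index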

You can use the mergeDataset function to combine more than one dataset into one super index. The function saveObject allows you to save your index for future use.
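
As a minimal sketch of how these two functions might fit together, the helper below builds one index per dataset, merges them one at a time, and saves the result. The helper `build_merged_index` is hypothetical and not part of scfind, and the argument names `object`, `new.object` and `file` are assumptions about the scfind API that should be checked against the package documentation. It also assumes every dataset uses the same `cell_type1` label column and `counts` assay as the Bladder example.

# Hypothetical helper, not part of scfind: build, merge and save indexes.
# Package functions are called with `scfind::` so the helper can be defined
# before scfind is attached.
build_merged_index <- function(urls, dataset.names, out.file) {
  # One dataset name is needed for each dataset URL
  stopifnot(length(urls) == length(dataset.names), length(urls) >= 1)

  merged <- NULL
  for (i in seq_along(urls)) {
    # Download the SingleCellExperiment object and index it
    sce <- readRDS(url(urls[[i]]))
    index <- scfind::buildCellTypeIndex(sce = sce,
                                        cell.type.label = "cell_type1",
                                        dataset.name = dataset.names[[i]],
                                        assay.name = "counts")

    # The first index starts the super index; later ones are merged into it.
    # Assumed signature: mergeDataset(object, new.object)
    merged <- if (is.null(merged)) {
      index
    } else {
      scfind::mergeDataset(object = merged, new.object = index)
    }
  }

  # Assumed signature: saveObject(object, file)
  scfind::saveObject(object = merged, file = out.file)
  invisible(merged)
}

# Example call (the second dataset name is a placeholder; use the tissue
# that actually corresponds to tmfacs[2]):
# data(tmfacs)
# super.index <- build_merged_index(tmfacs[1:2],
#                                   c("Bladder", "SecondTissue"),
#                                   "scfind_super_index.rds")

A saved index can then be reloaded with loadObject, as in the Quick Start below.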

To get started quickly with the pre-computed scfind indexes:

# `scfind` index of the `Tabula Muris (FACS)` dataset
data(ExampleIndex_TabulaMurisFACS)

scfind.index.tmfacs <- loadObject(file = url(ExampleIndex_TabulaMurisFACS))

# `scfind` index of the `Tabula Muris (10X)` dataset
data(ExampleIndex_TabulaMuris10X)

scfind.index.tm10x <- loadObject(file = url(ExampleIndex_TabulaMuris10X))

Q: Where can I report bugs, comments, issues or suggestions?

A: Please use this page.

Q: Is scfind published?

A: Not yet, but a copy of the scfind manuscript is available on bioRxiv.

Q: What is the scfind licence?

A: GPL-3

