Guidelines for CRISPR screens of noncoding elements published!

The ENCODE non-coding CRISPR manuscript is now fully live, and you can read the entire paper here in Nature Methods! If you’d like a more “popsci” version, read about it here. Before we get to the science, a shout-out to the six co-first authors: David Yao, Josh Tycko, Jin Woo Oh, Lexi Bounds, Sager Gosai, and Lazaros Lataniotis. This was a collaboration across 13+ labs, multiple time zones, and 5 years, and it made the most progress during the pandemic. Truly amazing how much Slack, a (currently) 378-page Google Doc, and motivated, brilliant scientists can keep science moving!

If you’re thinking of doing a non-coding CRISPR screen, this paper is here to help. There are many types of design, perturbation, phenotyping, and analysis strategies for non-coding CRISPR screens to choose from. We tried to compare and contrast the many approaches in the ENCODE CRISPR Database. This is likely the largest set of non-coding CRISPR screens, all interoperable with a common format we proposed, representing data from >100 screens comprising 500,000+ perturbations along with >300 verified CRE-gene links. We think this will be useful for a lot of folks in the community!

First, we were able to learn the fundamentals of gene regulation and make recommendations for future non-coding CRISPR screens. Nearly all functional cis-regulatory elements (CREs) identified by CRISPR screens are in a mapped epigenetic CRE (that is, loci identified with ATAC, DHS, or H3K27ac). This was predicted, but very useful to confirm.

Importantly, the largest effects are seen by targeting accessibility peak summits. This is a critical component of our proposed sgRNA design rules. We provide experimental guidelines to accurately detect CREs with variable, often low, transcriptional effects. These power calculations are useful for others planning screens, especially larger single-cell screens where every guide counts. We prospectively designed the best guides for all ENCODE cCREs using our recommendations: they are publicly available here.
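To make the "target the summit" idea concrete, here is a minimal Python sketch that ranks candidate sgRNAs by their distance from an accessibility peak summit. It is only an illustration, not the design pipeline from the paper: the `Guide` class, the `rank_guides_by_summit_distance` function, the coordinates, and the 250 bp window are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    """Hypothetical candidate sgRNA; `position` is its genomic target coordinate."""
    name: str
    position: int


def rank_guides_by_summit_distance(guides, summit, max_distance=None):
    """Sort guides by absolute distance from the peak summit (closest first).

    If `max_distance` is given, guides farther than that from the summit
    are dropped. The window size is an illustrative assumption.
    """
    ranked = sorted(guides, key=lambda g: abs(g.position - summit))
    if max_distance is not None:
        ranked = [g for g in ranked if abs(g.position - summit) <= max_distance]
    return ranked


# Toy example: four candidate guides around a summit at position 1200.
guides = [
    Guide("g1", 1000),
    Guide("g2", 1180),
    Guide("g3", 1240),
    Guide("g4", 1600),
]
top = rank_guides_by_summit_distance(guides, summit=1200, max_distance=250)
print([g.name for g in top])
```

A real design would also weigh sgRNA specificity and predicted efficiency; distance to the summit is just the one criterion discussed here.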

Additionally, we empirically determined critical methodological guidelines, such as suggestions for cellular coverage and sequencing depth, using diverse types of non-coding CRISPR screens. We also benchmarked five screen analysis tools and found that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity sgRNAs. CASA outperformed the other four peak callers in identifying the smallest set of CREs with the highest log2FC. We also uncovered how perturbation dynamics can cause false negatives, and we make suggestions for selecting timepoints for a growth screen. (Comparing the last timepoint with the initial plasmid pool yields the most true positives.) We discovered a previously undescribed DNA strand bias for CRISPRi in transcribed gene bodies: coding strand-targeting sgRNAs have significantly greater effects in these regions! This is very important for future screen design and analysis: we don't want to be blind to intronic CREs! All of this data will be available on the ENCODE portal in a logical, easy-to-access fashion, most of it with uniform file formats and results.
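For readers new to growth screens, the comparison of the last timepoint with the initial plasmid pool boils down to a per-guide log2 fold change. Below is a minimal Python sketch of that calculation on toy read counts. It is not CASA or any of the benchmarked tools: the function name, the counts-per-million normalization, and the pseudocount of 1 are assumptions chosen for illustration.

```python
import math


def log2_fold_change(final_counts, plasmid_counts, pseudocount=1.0):
    """Per-guide log2 fold change of the final timepoint vs. the plasmid pool.

    Counts are scaled to counts-per-million so the two libraries are
    comparable, and a pseudocount avoids taking the log of zero.
    Both choices are illustrative assumptions.
    """
    final_total = sum(final_counts.values())
    plasmid_total = sum(plasmid_counts.values())
    if final_total == 0 or plasmid_total == 0:
        raise ValueError("both samples need at least one read")

    result = {}
    for guide, plasmid in plasmid_counts.items():
        final = final_counts.get(guide, 0)  # guides lost from the final sample count as 0
        final_cpm = final / final_total * 1e6
        plasmid_cpm = plasmid / plasmid_total * 1e6
        result[guide] = math.log2((final_cpm + pseudocount) / (plasmid_cpm + pseudocount))
    return result


# Toy counts: guide "a" is depleted over the screen, guide "b" is enriched.
plasmid = {"a": 100, "b": 100}
final = {"a": 50, "b": 150}
lfc = log2_fold_change(final, plasmid)
for guide, value in sorted(lfc.items()):
    print(guide, round(value, 2))
```

A negative value means the guide dropped out relative to the plasmid pool, which in a growth screen is the signal that its target matters for proliferation.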
