Optimal Promoter Size

We defined the optimal promoter as the largest region around the transcription start site (TSS) that yields sensitivity and specificity of TF-target gene predictions similar to those of the core promoter (i.e., ±500bp of the TSS). To find this optimal size, we fixed the downstream (3') promoter boundary at +500bp and varied the upstream (5') promoter boundary (-1, -2.5, -5, -10 and -20Kbp; Figure 1A). Relative to the core promoter, a significant decrease in sensitivity and specificity (p-value < 0.05) was observed when the upstream promoter boundary extended beyond -5Kbp (p-value = 2.9 × 10⁻³; Figure 1A; Supplementary Table {DefiningPromoter}). We then set the upstream promoter boundary at -5Kbp and varied the downstream promoter boundary (+1, +2.5, +5, +10, +20Kbp, and all genic sequences; Figure 1B). We observed a significant decrease in sensitivity and specificity when the downstream promoter boundary extended beyond +5Kbp (p-value = 1.5 × 10⁻²; Figure 1B; Supplementary Table {DefiningPromoter}).

Thus, we empirically defined the optimal promoter search space for potential TF binding sites to be ±5Kbp from the TSS of human genes. This was the promoter size used to pre-compute the mechanistic TF regulatory network (i.e., a rigorously tested database of TF-target gene interactions). Using glioblastoma multiforme (GBM) as an example, we demonstrated how this database can be used to infer a comprehensive causal TF regulatory network for any complex disease.
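The boundary scan described above can be illustrated with a minimal Python sketch that converts a TSS position and strand into a promoter window. The function names, the coordinate convention (positions on the same axis as the TSS, clamped at 0), and the example TSS are assumptions made for illustration; they are not taken from the database's own code.

```python
# Upstream boundaries tested in the first scan (in bp), with the
# downstream boundary fixed at +500 bp. Values come from the text above.
UPSTREAM_GRID = [1000, 2500, 5000, 10000, 20000]


def promoter_window(tss, strand, upstream, downstream):
    """Return (start, end) of the promoter region around a TSS.

    'upstream' and 'downstream' are distances in bp relative to the
    direction of transcription, so they swap sides on the minus strand.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Assumption: clamp at 0 so windows never run off the chromosome start.
    return max(start, 0), end


def upstream_scan(tss, strand, downstream=500, grid=UPSTREAM_GRID):
    """Map each candidate upstream boundary to its promoter window."""
    return {u: promoter_window(tss, strand, u, downstream) for u in grid}


if __name__ == "__main__":
    # Hypothetical TSS at position 100000 on the plus strand.
    for upstream, window in upstream_scan(100000, "+").items():
        print(upstream, window)
    # The final +/-5 Kbp search space for the same TSS.
    print(promoter_window(100000, "+", 5000, 5000))
```

The same helper covers the second scan by holding `upstream` at 5000 and varying `downstream`.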

Figure 1. Effect of varying the promoter region on ROC AUC, using ChIP-seq as the gold standard. A. ROC AUCs for increasing upstream promoter lengths were compared with the core promoter size of ±500bp. A promoter length exceeding the red line indicates a significant reduction in ROC AUC (p-value < 0.05). B. ROC AUCs for increasing downstream promoter lengths were compared with the promoter size of -5Kbp and +500bp. A promoter length exceeding the red line indicates a significant reduction in ROC AUC (p-value < 0.05).
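For readers unfamiliar with the metric in Figure 1, the following Python sketch shows one standard way to compute a ROC AUC: the probability that a randomly chosen positive outranks a randomly chosen negative, with ties counted as one half. The scores and labels are hypothetical (a label of 1 is assumed to mean the TF-gene pair is supported by ChIP-seq). The source does not state how predictions were scored or which test produced the reported p-values, so this illustrates the metric only, not the authors' pipeline.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: prediction scores, higher meaning more likely a true target.
    labels: 1 for gold-standard positives, 0 for negatives.
    Ties between a positive and a negative contribute 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical scores for four TF-gene pairs under one promoter size.
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

The pairwise loop is quadratic in the number of predictions, which is acceptable for a sketch; a rank-based formulation would be preferable for genome-scale data.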

Need help? Please contact cplaisier(at)systemsbiology.org if you have any questions, comments or concerns.
Developed at the Institute for Systems Biology in the Baliga Lab.