Jianfeng Sun

Postdoctoral Research Associate in Single-cell Sequencing Analysis

I obtained my Ph.D. (Nov. 2017 - Feb. 2021, view my Ph.D. thesis) in deep learning-based structural biology from the Technical University of Munich, Germany. Before that, I received my Bachelor's degree (B. Sci., Sep. 2010 - June 2014) in computational mathematics from Nanjing Tech University, China, and was subsequently trained in a Master's program (M. Eng., Sep. 2014 - June 2016) in software engineering and bioinformatics at Beijing Forestry University (BJFU), followed by a one-year successive master-doctor training program (Sep. 2016 - June 2017) at BJFU. Since Jul. 2021, I have been a postdoctoral researcher in Prof. Cribbs' lab in NDORMS at the University of Oxford. During my Ph.D., I focused on protein-protein interaction networks and on structural and evolutionary biology, with the aim of illuminating their biological roles in cellular activities. I was fascinated by deciphering intricate biological networks using artificial intelligence-based algorithms and other mathematical models. I am now active in the area of algorithm design and computational analysis for single-cell sequencing data.

Impurities in the final sequencing library, which arise from a mix of PCR duplicates and artifacts, affect the accuracy of quantification for DNA fragments or transcripts. To eliminate PCR duplicates, unique molecular identifiers (UMIs) have been applied experimentally to distinguish true PCR duplicates from the original fragments to be sequenced. However, accurately identifying unique fragments via UMIs is hampered by errors introduced into the UMIs during PCR amplification and sequencing. Computational and mathematical methods have therefore been proposed to circumvent the problem. In addition, novel sequencing technologies are emerging as cost-effective solutions for long-read sequencing at the expense of accuracy, which has resulted in massive numbers of error-prone long reads. In purpose-built experiments based on these new sequencing technologies, the error-correction performance of existing methods for UMI identification has been found to be unsatisfactory, especially when more stringent experimental settings are imposed on UMIs, e.g., highly error-prone UMIs. Therefore, I have recently branched out into designing algorithmic strategies for improving UMI identification both before and after sequencing. With the rapidly growing volume of sequencing data, more powerful computational workflows for analyzing sequencing data are also to be built.