
Molecular Based Mathematical Biology

The last century witnessed tremendous advances in the Biological Sciences. The availability of massive biological data, high-performance computers, efficient computational algorithms, and mathematical and physical models has paved the way for the Biological Sciences to undertake a historic transition from being qualitative, phenomenological, and descriptive to being quantitative, analytical, and predictive. Under this transition, modern Mathematical Biology will change fundamentally, from macroscale modeling (of species, populations, diseases, blood flow, etc.) to molecular based analysis (of proteins, DNA, genes, viruses, etc.). A brief introduction to Molecular Based Mathematical Biology can be found in SIAM News (Sep 2016 and Dec 2017) and in Prof. Wei's Harvard talk.

Our group focuses on Molecular Based Mathematical Biology. We use computational tools from partial differential equations (PDEs), differential geometry, algebraic topology, and statistical learning to study biomolecular structure, flexibility, dynamics, and function. In particular, we are interested in topology based machine learning for biomolecular data analysis and in chromosome hierarchical structures. We sincerely welcome highly motivated students and postdocs to join our group.

Research Interests
Topological Modeling and Analysis 
  • Persistent homology analysis of big data in biomolecules
Persistent homology is, for the first time, employed to quantitatively predict the stability of fullerene molecules. We study the ground-state structures of fullerene molecules and the relative stability of fullerene isomers. We find that the heat of formation energy is related to the local hexagonal cavities of small fullerenes, while the total curvature energies of fullerene isomers are associated with their sphericities, which are measured by the lengths of their long-lived Betti-2 bars. Persistent homology is then introduced for extracting molecular topological fingerprints (MTFs). MTFs are utilized for protein characterization, identification, and classification. Based on the correlation between protein compactness, rigidity, and connectivity, we propose an accumulated bar length generated from persistent topological invariants for the quantitative modeling of protein flexibility. To this end, a correlation matrix based filtration is developed. This approach gives rise to an accurate prediction of the optimal characteristic distance used in protein B-factor analysis. Further, MTFs are employed to characterize protein topological evolution during protein folding and to quantitatively predict protein folding stability. Excellent consistency between our persistent homology predictions and molecular dynamics simulations is found.
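To make the idea of an accumulated bar length concrete, the following is a minimal sketch, not the method used in our publications. It assumes a plain Euclidean distance filtration (rather than the correlation matrix based filtration described above) and computes only Betti-0 bars, using a union-find pass over sorted pairwise distances. The function names `betti0_bars` and `accumulated_bar_length` and the four test points are hypothetical, and the death value of a bar is taken to be the full inter-point distance (some conventions use half of it).

```python
import itertools
import math


def betti0_bars(points):
    """Finite Betti-0 bars (birth, death) of a distance-based filtration.

    Every point is born at scale 0; a connected component dies at the
    length of the edge that first merges it into another component.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j        # two components merge here
            bars.append((0.0, length))     # one component dies
    return bars


def accumulated_bar_length(bars):
    """Sum of bar lengths (death minus birth) over all finite bars."""
    return sum(death - birth for birth, death in bars)


# Hypothetical toy "molecule": four atoms in 3D.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 0.0, 0.0)]
bars = betti0_bars(points)
total = accumulated_bar_length(bars)
```

For these four points the merging edges have lengths 1, 2 and 4, so the sketch yields three finite bars with an accumulated length of 7. Higher-dimensional bars (Betti-1, Betti-2) require a full persistent homology computation, which this sketch does not attempt.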

Geometric and Variational Modeling
  • Variational multi-scale models
We develop geometric models and computational algorithms for biomolecular structures from two data sources, the Protein Data Bank (PDB) and the Electron Microscopy Data Bank (EMDB), in the Eulerian (or Cartesian) representation. The molecular surface (MS) contains non-smooth geometric singularities, such as cusps, tips, and self-intersecting facets, which often lead to computational instabilities in molecular simulations and violate the physical principle of surface free energy minimization. Variational multiscale surface definitions are proposed based on geometric flows and solvation analysis of biomolecular systems. The resulting surfaces are free of geometric singularities and minimize the total free energy of the biomolecular system. High order partial differential equation (PDE)-based nonlinear filters are employed for EMDB data processing. After constructing protein multiresolution surfaces, we analyze and characterize surface morphology in terms of Gaussian curvature, mean curvature, maximum curvature, minimum curvature, shape index, and curvedness. Based on the curvature and electrostatic analysis of our multiresolution surfaces, we introduce a new concept, the polarized curvature, for the prediction of protein binding sites.
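The curvature descriptors listed above are related by standard formulas. The sketch below is an illustration only: it assumes the commonly used Koenderink-style definitions of shape index and curvedness, whose sign convention depends on the orientation of the surface normal and may differ from the one used in our papers. The function names are hypothetical.

```python
import math


def principal_curvatures(mean_h, gauss_k):
    """Recover (k_max, k_min) from mean curvature H and Gaussian curvature K."""
    disc = max(mean_h * mean_h - gauss_k, 0.0)  # clamp round-off below zero
    root = math.sqrt(disc)
    return mean_h + root, mean_h - root


def shape_index(k_max, k_min):
    """Shape index in [-1, 1], assuming k_max >= k_min.

    atan2 keeps umbilic points (k_max == k_min) finite. A flat point (0, 0)
    returns 0 here, although the shape index is formally undefined there.
    """
    return (2.0 / math.pi) * math.atan2(k_max + k_min, k_max - k_min)


def curvedness(k_max, k_min):
    """Root-mean-square of the two principal curvatures."""
    return math.sqrt(0.5 * (k_max * k_max + k_min * k_min))


# Sphere-like point of radius 2 (H = 1/2, K = 1/4) and a symmetric saddle.
sphere = principal_curvatures(0.5, 0.25)
saddle = principal_curvatures(0.0, -1.0)
sphere_si, sphere_cv = shape_index(*sphere), curvedness(*sphere)
saddle_si, saddle_cv = shape_index(*saddle), curvedness(*saddle)
```

Under this convention a sphere-like point has shape index 1 and a symmetric saddle has shape index 0, while curvedness measures how strongly the surface bends regardless of shape.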
  • Protein flexibility and rigidity analysis
Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which correlates strongly with protein function. The flexibility-rigidity index (FRI) is a newly proposed method for constructing the atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption of the FRI is that protein structures are uniquely determined by various internal and external interactions, while protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions.
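The following minimal sketch illustrates why no matrix diagonalization is needed: a rigidity index is accumulated for each atom as a sum of distance-dependent correlation kernels, and the flexibility index is its reciprocal. The Lorentz-type kernel, the parameter values `eta` and `upsilon`, the uniform atomic weights, and the five-atom chain are assumptions made for illustration, not fitted settings. In practice, predicted B-factors would be obtained by a linear fit of the flexibility index to experimental values, which is omitted here.

```python
import math


def lorentz_kernel(r, eta=3.0, upsilon=3.0):
    """Lorentz-type correlation kernel: 1 at r = 0, decaying with distance.

    eta (length scale, in angstroms) and upsilon (decay power) are
    illustrative values, not fitted parameters.
    """
    return 1.0 / (1.0 + (r / eta) ** upsilon)


def flexibility_index(coords, kernel=lorentz_kernel):
    """Per-atom flexibility index, the reciprocal of the rigidity index.

    The rigidity index of atom i sums the kernel over all other atoms,
    so the cost is O(N^2) kernel evaluations and no eigenproblem is solved.
    Requires at least two atoms.
    """
    flex = []
    for i, r_i in enumerate(coords):
        rigidity = sum(
            kernel(math.dist(r_i, r_j))
            for j, r_j in enumerate(coords)
            if j != i
        )
        flex.append(1.0 / rigidity)
    return flex


# Hypothetical chain of five atoms spaced 3.8 angstroms apart on a line.
chain = [(3.8 * k, 0.0, 0.0) for k in range(5)]
flex = flexibility_index(chain)
```

In this toy chain the end atoms have fewer close neighbors than the central atom, so they receive a larger flexibility index, mirroring the larger B-factors typically seen at chain termini.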

Scientific Computing

  • MIB method for multi-material interface problems
Multi-material interface problems are omnipresent in science, engineering, and daily life. The solution to this class of problems becomes exceptionally challenging when more than two heterogeneous materials join at one point in space and form a geometric singularity. Based on the matched interface and boundary (MIB) method, several schemes have been constructed to solve 2D elliptic equations with discontinuous coefficients associated with three-material interfaces. The essential idea is to smoothly extend functions across the interface and employ fictitious values at irregular points. At geometric singularities, two sets of interface conditions are considered simultaneously. Intensive numerical experiments are carried out to validate the proposed schemes. Second order accuracy is obtained for complex geometries and geometric singularities.

  • Adaptive mesh based MIB method
Mesh deformation methods break down for elliptic PDE interface problems, as additional interface jump conditions are required to maintain the well-posedness of the governing equation. An interface technique based adaptively deformed mesh strategy is introduced for resolving elliptic interface problems. We take advantage of the high accuracy, flexibility, and robustness of the MIB method to construct an adaptively deformed mesh based interface method. The proposed method generates deformed meshes in the physical domain and solves the transformed governing equations in the computational domain, which maintains regular Cartesian meshes. The mesh deformation is realized by a mesh transformation PDE, which controls the mesh redistribution through a source term. The source term consists of a monitor function, which builds in mesh contraction rules. Both interface geometry based deformed meshes and solution gradient based deformed meshes are constructed to reduce errors in solving elliptic interface problems. The proposed adaptively deformed mesh based interface method is extensively validated by numerical experiments. Numerical results indicate that it outperforms the original MIB method for elliptic interface problems.
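The role of a monitor function can be illustrated with a much simpler technique than the mesh transformation PDE described above. The sketch below uses classical 1D equidistribution, a swapped-in stand-in rather than our method: nodes are placed on [0, 1] so that every cell carries the same integral of the monitor function. The function name `equidistribute`, the Gaussian-shaped monitor, and the interface location x = 0.5 are all hypothetical.

```python
import math


def equidistribute(monitor, n_cells, n_fine=2000):
    """Place n_cells + 1 nodes on [0, 1] so each cell holds equal monitor mass.

    The monitor must be strictly positive. Its cumulative integral is
    approximated by the trapezoid rule on a fine uniform grid and then
    inverted by linear interpolation.
    """
    h = 1.0 / n_fine
    xs = [k * h for k in range(n_fine + 1)]
    ms = [monitor(x) for x in xs]
    cum = [0.0]
    for k in range(n_fine):
        cum.append(cum[-1] + 0.5 * (ms[k] + ms[k + 1]) * h)

    nodes = [0.0]
    k = 0
    for i in range(1, n_cells):
        target = cum[-1] * i / n_cells
        while cum[k + 1] < target:       # find the fine cell holding target
            k += 1
        t = (target - cum[k]) / (cum[k + 1] - cum[k])
        nodes.append(xs[k] + t * h)
    nodes.append(1.0)
    return nodes


def interface_monitor(x, center=0.5, width=0.05, strength=20.0):
    """Large near the (hypothetical) interface, close to 1 far from it."""
    return 1.0 + strength * math.exp(-(((x - center) / width) ** 2))


nodes = equidistribute(interface_monitor, n_cells=20)
widths = [b - a for a, b in zip(nodes, nodes[1:])]
```

The resulting cells contract near x = 0.5, where the monitor is large, and stay wide near the boundaries. This is the same qualitative behavior an interface geometry based monitor is meant to produce in the deformed meshes described above.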
  • MIB Galerkin method
A MIB Galerkin formulation is developed for solving the elliptic interface problem. In this approach, we build two sets of elements, one on each of two extended subdomains that both include the interface. As a result, the two sets of elements overlap near the interface. Fictitious solutions are defined on the overlapping part of the elements, so that the differentiation operations of the original PDEs can be discretized as if there were no interface. The extra coefficients of the polynomial basis functions, which furnish the overlapping elements and determine the fictitious solutions, are fixed by the interface jump conditions. Consequently, the interface jump conditions are rigorously enforced on the interface. The present method utilizes Cartesian meshes to avoid the mesh generation required in conventional finite element methods (FEMs). The accuracy, stability, and robustness of the proposed 3D MIB Galerkin method are extensively validated. Near second order accuracy has been confirmed. To our knowledge, this is the first time an FEM has shown near second order convergence in solving the Poisson equation with realistic protein surfaces. Additionally, the present work offers the first known near second order accurate method for C^1 continuous or H^2 continuous solutions associated with a Lipschitz continuous interface.