We develop statistical inference methods to identify infection sources in a network. Many practical scenarios can be modeled as an infection spreading from node to node in a network of interconnected nodes. Examples include the spread of a contagious disease in a community, the propagation of a virus in a computer network, and the spread of a rumor among participants in a social network. Identifying the sources of an infection plays an important role in many applications: finding the index cases that introduce a contagious disease into a population to facilitate epidemiological studies, identifying the servers that inject a virus into a computer network so as to determine the network's latent points of weakness, and apprehending the individuals who started a malicious rumor in a social network. In this work, we develop methods to identify infection sources and jointly detect the infection spreading. Examples of our research achievements include:
- a method for estimating the number of infection sources and their identities under the susceptible-infected (SI) infection model, which is provably asymptotically correct for the class of geometric trees
- theoretical understanding and proof that the Jordan center estimator is optimal in a universal sense for SI, SIR, SIRI and SIS infection models under certain technical conditions
- optimal strategies for infection spreading and source identification
- an algorithmic framework to estimate the number of sources and their identities when each source may start spreading the infection at a different time or at a different rate
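
To make the Jordan center estimator mentioned in the list above more concrete, here is a minimal sketch for the single-source case. It assumes an unweighted, undirected network stored as an adjacency dictionary and a fully observed set of infected nodes. The Jordan center is taken to be any node that minimizes the largest hop distance to an infected node. The function names `bfs_distances` and `jordan_centers` are hypothetical, chosen for illustration, and this is not the implementation used in our work.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return hop distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def jordan_centers(graph, infected):
    """Return (centers, radius) for the infected set.

    `centers` is the sorted list of nodes whose maximum hop distance to any
    infected node is smallest, and `radius` is that smallest maximum distance.
    Assumes an undirected graph, so distances are symmetric.
    """
    infected = list(infected)
    if not infected:
        raise ValueError("the infected set must not be empty")

    # worst[v] = largest distance from v to any infected node seen so far
    worst = {v: 0 for v in graph}
    for s in infected:
        dist = bfs_distances(graph, s)
        for v in graph:
            d = dist.get(v, float("inf"))  # unreachable nodes are infinitely far
            if d > worst[v]:
                worst[v] = d

    radius = min(worst.values())
    centers = sorted(v for v, w in worst.items() if w == radius)
    return centers, radius


# Hypothetical example: a path network 0-1-2-3-4-5-6 with nodes 1..5 infected.
line_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
print(jordan_centers(line_graph, {1, 2, 3, 4, 5}))  # ([3], 2)
```

In this toy example the estimator returns node 3, the midpoint of the infected segment. The sketch runs one breadth-first search per infected node, which is simple but may be slow on large networks. Estimating multiple sources, or sources with different start times or rates, needs more than this single-source sketch.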

