A SYSTEMS LEVEL BASED MODEL FOR IDENTIFYING POTENTIAL TARGETS ASSOCIATED WITH INFLUENZA A INFECTION

ABSTRACT
Developing therapeutics for infectious diseases requires understanding the main host and pathogen processes through which molecular interactions influence cellular functions. The outcome of infectious diseases, including influenza A (IAV), depends greatly on how the host responds to the virus and how the virus manipulates the host, both of which are mediated by protein-protein functional interactions. Analyzing infection-associated genes at the systems level may enable us to characterize the specific molecular mechanisms that allow influenza A strains H1N1 and H3N2 to persist and survive inside the host. A systems-level analysis based on experimental and computational approaches was used to predict human protein-protein functional interactions. The resulting human protein-protein functional interaction network is a graph whose nodes are proteins and whose links join interacting proteins. Using this graph, we analyse the topological properties of the network, identify candidate proteins using centrality measures, and map a set of IAV infection-associated proteins onto it to elucidate genes related to IAV infection and identify the essential dense sub-graphs underlying IAV infection outcome. We performed functional closeness and enrichment analyses to identify statistically and biologically significant processes and pathways implicated in IAV infection. These IAV infection-associated proteins have been shown to be relevant for further research towards new drug and vaccine development. This study enhances our understanding of the interplay between influenza A and its host and may contribute to the design of novel drugs.


CHAPTER ONE 
INTRODUCTION
This thesis presents a systems-level analysis for identifying potential targets associated with influenza A infection. This chapter presents the background of the study and gives an overview of the whole work. We formulate the problem and discuss the approaches used to tackle it, highlighting their advantages. In summary, this chapter provides a global view of the other chapters.

Background to the Study
Influenza A is a viral disease that can be found in humans and other mammals. According to Smith (2009), a new influenza A virus that originated from swine surfaced in Mexico and the United States in March and early April 2009. Smith et al. (2009) show that this influenza A virus was derived from several viruses circulating in swine, which led to the first transmission to humans. The virus had the potential to become the first influenza pandemic of the 21st century (Smith, 2009). During the outbreak, the first week of surveillance revealed that the virus had spread to over 30 countries through human-to-human transmission. This led the World Health Organization (WHO) to raise the pandemic alert to level 5 of 6. The different pandemic levels, as described by the WHO in 2009, are shown in Table 1.

The research by Smith (2009) concluded that systematic surveillance of swine influenza was needed and provided evidence that the mixing of new genetic elements in swine can cause the emergence of viruses with pandemic potential in humans. In medicine, the detection, treatment and prevention of many diseases have improved tremendously. This is, in part, due to an improved understanding of biological systems and of the different factors that trigger progression to disease.

This improvement has been driven by advances in high-throughput biological experiments able to generate genome-scale datasets of biological cells, including protein sequences, protein-protein interactions, gene expression, regulation and other functional datasets. These advances have enabled a paradigm shift from single-gene analysis to systems-level analysis, providing a global view of a system's behavior. Dealing with this large volume of data for effective biological knowledge discovery requires systematic mathematical models. Systems-level analysis based on protein-protein interactions has been vital to understanding how proteins function within the cell (Deng and Sheng, 2007). Understanding the protein interactions in a given cellular proteome, sometimes known as the interactome, is important to the analysis of cell biochemistry (Deng and Sheng, 2007). A comprehensive collection of information on human proteins, their features and their functions is required to support information retrieval and biological knowledge discovery. Effective biological knowledge discovery requires a better understanding of the functional activities of proteins in cells, their exact sub-cellular localization and their tissue-specific distribution. In addition, knowledge of how the proteins encoded by disease-associated genes play their roles in molecular complexes and biological pathways is very important (Deng and Sheng, 2007). Some facts about influenza A documented by the Centers for Disease Control and Prevention (CDC, 2009) are as follows:

Swine influenza (swine flu) is a respiratory disease of pigs caused by type A influenza viruses that regularly cause outbreaks of influenza in pigs. Swine influenza viruses may circulate among swine throughout the year, but most outbreaks occur during the late fall and winter months, similar to outbreaks in humans.

The classical influenza type A H1N1 virus was first isolated from a pig in 1930. Over the years, different variations of swine flu viruses have emerged. At this time, there are four main Influenza A virus subtypes that have been isolated in pigs: H1N1, H1N2, H3N2, and H3N1. However, most of the recently isolated influenza viruses from pigs have been H1N1 viruses.

Swine influenza viruses do not normally infect humans. However, sporadic human infections with swine flu have occurred. Most commonly, these cases occur in persons with direct exposure to pigs, for example, children near pigs at a fair or workers in the swine industry. In the past, CDC received reports of approximately one human swine influenza virus infection every one to two years in the U.S., but from December 2005 through February 2009, 12 cases of human infection with swine influenza were reported. The symptoms of swine flu in people are expected to be similar to the symptoms of regular human seasonal influenza and include fever, lethargy, lack of appetite and coughing. Some people with swine flu have also reported runny nose, sore throat, nausea, vomiting and diarrhoea.

Influenza viruses can be directly transmitted from pigs to humans and from humans to pigs. Human infection with flu viruses from pigs is most likely to occur when people are in close proximity to infected pigs, such as in pig barns and livestock exhibits housing pigs at fairs. Human-to-human transmission of swine flu can also occur. This is thought to occur in the same way as seasonal flu spreads in humans, mainly from person to person through the coughing or sneezing of people infected with the influenza virus. Humans may also become infected by touching something with flu viruses on it and then touching their mouth or nose. The H1N1 swine flu viruses are antigenically very different from human H1N1 viruses and, therefore, vaccines for human seasonal flu would not provide protection from H1N1 swine flu viruses.

In this project, we use systems-level computational approaches to identify potential targets of influenza A H1N1 and H3N2. We use a graph-based model to elucidate the relationships between the targets identified. Moreover, we perform biological process and pathway enrichment analyses. These analyses yield the processes and pathways enriched in the sub-networks, which may play an initial role in influenza pathogenesis. Datasets derived from different sources are used to build the protein-protein functional network and to perform further analyses.

Among the many methods for generating the different scores is the application of information theory, a branch of applied mathematics, electrical engineering and computer science that involves the quantification of information. According to Rieke and Warland (1997), information theory was developed by Claude E. Shannon.

In information theory, a candidate measure is entropy, which quantifies the uncertainty involved in predicting the value of a random variable. This measure is used at one point in the scoring of sequence data. Information theory is based on probability theory and statistics. According to Reza (1961), entropy is an important quantity of information, and it is common to have an information measure between two random variables. A property of entropy is that it is maximized by a uniform distribution. Intuitively, the entropy H of a random variable Y measures the amount of uncertainty about Y when only the distribution of Y is known (Reza, 1961). In the same way, entropy is used in this research work to maximize information content.
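The maximization property mentioned above can be checked numerically. The following is a minimal sketch, assuming the standard Shannon definition H(Y) = -sum over y of p(y) log2 p(y), measured in bits; the function name and the two example distributions are illustrative assumptions and are not taken from the scoring procedure used in this work.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing, by the convention 0 * log(0) = 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative distributions over four outcomes (assumed values).
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

h_uniform = shannon_entropy(uniform)
h_skewed = shannon_entropy(skewed)

print("uniform:", h_uniform)
print("skewed: ", h_skewed)
```

For four outcomes the uniform distribution attains the upper bound log2(4) = 2 bits, and any non-uniform distribution over the same outcomes gives a smaller value.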

Other methods used involve an application of network theory. In network science and computer science, network theory is the study of graphs as a representation of symmetric or asymmetric relationships between discrete objects in a general sense. Network theory is also part of graph theory (Newman, 2003a). Network theory can be applied in many different areas of study, including statistical physics, computer science, electrical engineering, operations research and gene regulatory networks. The first true proof in network theory is Euler's solution of the Seven Bridges of Königsberg problem (Newman, 2003a), which asked for a walk through the city that crosses each bridge exactly once, where the starting and ending points of the walk need not be the same (Newman, 2003). There are different types of networks that can be analysed. Social network analysis, for instance, examines the structure of relationships between social entities. Persons, groups, organizations, nation states and websites can be considered entities in this case. Social network analysis has over the years played a major role in social science. It has been used to analyse several phenomena, including the spread of diseases, the study of markets and many others (Wasserman, 1994).

In biological networks, the analysis of molecular networks has become central, owing to the public availability of high-throughput biological data, especially protein-protein interaction and other functional datasets. The type of analysis is almost the same as in social network analysis, but it focuses on the local patterns in the network. The analysis of biological networks in relation to diseases led to the development of network medicine as another area of application (Barabási and Gulbahce, 2011). Centrality measures, which are widely used in network theory, are used in this study to analyse the human protein-protein functional network that is generated.
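To make the idea of a centrality measure concrete, the following minimal sketch computes two common ones, degree centrality and closeness centrality, on a small undirected graph. The node names P1 to P5 and the edge list are hypothetical placeholders rather than proteins or interactions from this study, and the closeness computation assumes a connected graph.

```python
from collections import deque

# Hypothetical toy network; names are placeholders, not real proteins.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]

def build_adjacency(edge_list):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

def closeness_centrality(adj):
    """Reciprocal of the mean shortest-path distance to reachable nodes."""
    result = {}
    for source in adj:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

adj = build_adjacency(edges)
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)

for node in sorted(adj):
    print(node, round(degree[node], 3), round(closeness[node], 3))
```

In this toy graph P1 ranks highest on both measures, which is the sense in which centrality scores can be used to shortlist candidate nodes in a larger network.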

Knowledge of clustering is also required in this research work. Cluster analysis, or clustering, consists of grouping objects such that objects in the same group (cluster) are more similar to each other than to objects in another group or cluster (Bailey, 1994). Clustering is commonly used in data mining, statistical data analysis, bioinformatics, pattern recognition, image analysis and so on (Bailey, 1994). There is no single clustering algorithm; different algorithms differ both in their notion of what constitutes a cluster and in how to find clusters efficiently. Examples are the agglomerative algorithm, which merges similar nodes recursively, and the divisive algorithm, which detects inter-community links and removes them from the network. These methods do not produce a unique partitioning of the data set; instead, they produce a hierarchy from which the user still needs to choose appropriate clusters. They are not very robust towards outliers, which either show up as additional clusters or cause other clusters to merge, and they are too slow for large datasets. The algorithm introduced by Blondel and Guillaume (2008) is the one used in this work, because it is fast and can produce results quickly for a large network, unlike the agglomerative and divisive algorithms.
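The method of Blondel and Guillaume (2008) is generally described as a greedy optimization of modularity, a score that compares the number of links inside clusters with the number expected by chance. The sketch below is not that algorithm; it only evaluates the modularity of a given partition on a hypothetical toy graph (two triangles joined by one link), assuming the standard definition for an undirected, unweighted graph. The node names and partitions are illustrative assumptions.

```python
def modularity(edge_list, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 * m)) ** 2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edge_list)
    internal = {}
    degree_sum = {}
    for u, v in edge_list:
        cu, cv = community_of[u], community_of[v]
        # Each edge adds one to the degree of both of its endpoints.
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical toy graph: two triangles joined by the single link C-D.
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("D", "E"), ("D", "F"), ("E", "F"),
    ("C", "D"),
]

split = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
single = {node: 0 for node in "ABCDEF"}

q_split = modularity(edges, split)
q_single = modularity(edges, single)

print("two clusters:", round(q_split, 4))
print("one cluster: ", round(q_single, 4))
```

Splitting the graph into its two triangles scores higher than placing every node in one cluster, which is the kind of comparison a modularity-based method relies on when deciding whether to move or merge nodes.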
