
Research areas

I am a senior lecturer in the Bioinformatics and Computational Biology Group of the Department of Computer Science, Aberystwyth University. My interests include machine learning and data mining, genome and metagenome analysis, and microbial, yeast and plant bioinformatics. I am interested in all genomics questions and, in particular, in what we can do with computers (algorithms, data structures and artificial intelligence) to help answer them. DNA/RNA sequencing allows us to inspect the genetic composition of microbes, animals, plants and viruses, but once we have obtained the sequences, what can we learn? How are communities changing over time? How are enzymes within a community specialised for different roles? Can we find new sources of antimicrobials? How can we detect genes in communities of organisms that have never been cultured? I'm also interested in the ethical and moral implications of genomic technologies, and in how the public will react to this information.


Previous work

Previously I held an RAEng/EPSRC Research Fellowship to "Engineer the Intelligent Scientific Laboratory". This involved work on the Robot Scientist project, in which intelligent software created scientific hypotheses, designed experiments to distinguish between these hypotheses, controlled a lab robot to conduct these experiments, and then used the results to design the next round of experiments. There were many aspects to the work on this project, including data formalism, experimental protocols, data collection, inference and querying, planning and scheduling, and the practicalities of working in a real lab with real automation equipment.

Before this I held an 1851 Research Fellowship to investigate Grid-enabling lab robots for the Robot Scientist. This was a two-year project, from Oct 2004 to Sep 2006.

Previously, as a postdoc on a BBSRC-funded grant, and as a PhD student, I used machine learning (including ILP) and data mining (particularly multi-relational associations) for functional genomics: elucidating the biological functions of the parts of a genome. When a genome is sequenced, and we have the predicted locations of the genes within the genome, the next stage is to work out the possible functions of these genes. We've been looking at genes in Saccharomyces cerevisiae (yeast) and Arabidopsis thaliana, the first plant to have its genome sequenced. Detailed results for yeast and Arabidopsis are available. This has involved looking at ways to make use of different kinds of data, including microarray data, sequence statistics, homology data, predicted secondary structure, QTLs, and phenotypic data. It has also involved finding ways to make use of background information and hierarchical information, and to take into account that proteins can have more than one function, which makes this a classification problem where each item can belong to more than one class.

I've also spent three months working with RMIT's Search Engine Group, building a multi-relational data mining tool (Radar) based on inverted indexing.

Old but still accessible data sets

Back to Amanda Clare