
Research areas

I am a senior lecturer in the Bioinformatics and Computational Biology Group of the Department of Computer Science, Aberystwyth University. My interests include machine learning and data mining, genome analysis, and yeast and plant biology. I'm also interested in metagenomics, primer design, functional programming for lab automation, text mining of scientific publications, and several other areas.


Previous work

Previously I held an RAEng/EPSRC Research Fellowship to "Engineer the Intelligent Scientific Laboratory". This involved work on the Robot Scientist project, where intelligent software created scientific hypotheses, designed experiments to distinguish between these hypotheses, controlled a lab robot to conduct these experiments, and then used the results to design the next round of experiments. There were many aspects to the work on this project, including data formalism, experimental protocols, data collection, inference and querying, planning and scheduling, and the practicalities of working in a real lab with real automation equipment.

Before this I held an 1851 Research Fellowship to investigate Grid-enabling lab robots for the Robot Scientist. This was a two-year project, running from Oct 2004 to Sep 2006.

Earlier, as a postdoc on a BBSRC-funded grant and as a PhD student, I used machine learning (including inductive logic programming, ILP) and data mining (particularly multi-relational associations) for functional genomics: elucidating the biological functions of the parts of a genome. When a genome is sequenced, and we have the predicted locations of the genes within the genome, the next stage is to work out the possible functions of these genes. We've been looking at genes in Saccharomyces cerevisiae (yeast) and Arabidopsis thaliana, the first plant to have its genome sequenced. Detailed results for yeast and Arabidopsis are available. This has involved looking at ways to make use of different kinds of data, including microarray data, sequence statistics, homology data, predicted secondary structure, QTLs, and phenotypic data. It has also involved finding ways to make use of background information and hierarchical information, and to take into account that proteins can have more than one function, which makes this a classification problem where each item can belong to more than one class.

I also spent three months working with RMIT's Search Engine Group, building a multi-relational data mining tool (Radar) based on inverted indexing.

Old but still accessible data sets

Back to Amanda Clare