I am a senior lecturer in the Bioinformatics and Computational Biology Group of the Department of Computer Science, Aberystwyth University. My interests include machine learning and data mining, genome analysis, and yeast and plant biology. I'm also interested in metagenomics, primer design, functional programming for lab automation, text mining of scientific publications and several other areas.
- Ravenscroft, J., Liakata, M., Clare, A. and Duma, D. (2017) Measuring Scientific Impact Beyond Academia: An assessment of existing impact metrics and proposed improvements. PLoS One doi:10.1371/journal.pone.0173152, blog post about measuring scientific impact, download the data
- Donkin, E., Dennis, P., Ustalkov, A., Warren, J. and Clare, A. (2017) Replicating complex agent based models, a formidable task. Environmental Modelling and Software, in press.
- Veneman, J.B., Saetnan, E., Clare, A., Newbold, C. (2016). MitiGate; an online meta-analysis database for quantification of mitigation strategies for enteric methane emissions. Science of the Total Environment 572 pp. 1166-1174 doi: 10.1016/j.scitotenv.2016.08.029
- Nicholls, S. M., Clare, A. and Randall, J. C. (2016) Goldilocks: a tool for identifying genomic regions that are 'just right'. Bioinformatics 32 (13): 2047-2049, doi: 10.1093/bioinformatics/btw116, blog post about Goldilocks.
- Duma, D., Liakata, M., Clare, A., Ravenscroft, J., Klein, E. (2016) Rhetorical Classification of Anchor Text for Citation Recommendation. WOSP Workshop (5th Intl Workshop on Mining Scientific Publications). Full text of article in D-Lib Magazine.
- Aubrey, W., Riley, M. C., Young, M., King, R. D., Oliver, S. G. and Clare, A. (2015) A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation. PLOS One 10(12): e0142494 doi: 10.1371/journal.pone.0142494, blog post about seamless gene deletion.
- Sapstead, S., Daniel, I. and Clare, A. (2015) Automatically Geotagging Articles in the Welsh Newspapers Online Collection. In proceedings of AI 2015. doi: 10.1007/978-3-319-25032-8_28
- Runciman, C., Clare, A. and Harkness, R. (2014) Laboratory automation in a functional programming language. Journal of Laboratory Automation 2014 Dec; 19(6):569-76. doi: 10.1177/2211068214543373. Blog post describing this article, github code and preprint pdf.
- Riley, M. C., Aubrey, W., Young, M. and Clare, A. (2013) PD5: a general purpose library for primer design software. PLoS One, DOI: 10.1371/journal.pone.0080156. Get the code at the PD5 web site.
- Ravenscroft, J., Liakata, M. and Clare, A. (2013) Partridge: An effective system for the automatic classification of the types of academic papers. In proceedings of AI 2013, Dec 2013. Try out the Partridge system!
- Sparkes, A. and Clare, A. (2012) AutoLabDB: a substantial open source database schema to support a high-throughput automated laboratory. Bioinformatics 28(10) 1390-1397. doi: 10.1093/bioinformatics/bts140 (abstract, pdf).
- Clare, A., Croset, A., Grabmueller, C., Kafkas, S., Liakata, M., Oellrich, A., Rebholz-Schuhmann, D. (2011) Exploring the Generation and Integration of Publishable Scientific Facts Using the Concept of Nano-publications. 1st International Workshop on Semantic Publication (SePublica 2011). pdf.
- Alsberg, B. and Clare, A. (2010) Wiki based management of chemometric research projects. Journal of Chemometrics 24(7-8) p408-417
- Sparkes, A., Aubrey, W., Byrne, E., Clare, A., Khan, M. N., Liakata, M., Markham, M., Rowland, J., Soldatova, L. N., Whelan, K. E., Young, M. and King, R. D. (2010) Towards Robot Scientists for autonomous scientific discovery. Automated Experimentation 2010, 2:1 doi:10.1186/1759-4499-2-1
- Sparkes, A., King, R. D., Aubrey, W., Benway, M., Byrne, E., Clare, A., Liakata, M., Markham, M., Whelan, K. E., Young, M., Rowland, J. (2010) An Integrated Laboratory Robotic System for Autonomous Discovery of Gene Function. JALA 15(1) pages 33-40.
- King, R. D., Rowland, J., Aubrey, W., Liakata, M., Markham, M., Soldatova, L. N., Whelan, K. E., Clare, A., Young, M., Sparkes, A., Oliver, S. G., Pir, P. (2009) The Robot Scientist Adam, IEEE Computer, vol. 42, no. 8, pp. 46-54, August, doi:10.1109/MC.2009.270
- King, R. D., Rowland, J., Oliver, S. G., Young, M., Aubrey, W., Byrne, E., Liakata, M., Markham, M., Pir, P., Soldatova, L. N., Sparkes, A., Whelan, K. E., Clare, A. (2009) The Automation of Science. Science 324(5923):85-89, 3rd April 2009. (preprint pdf, before final corrections)
- Soldatova, L., Aubrey, W., King, R. D. and Clare, A. (2008) The EXACT description of biomedical protocols. Bioinformatics 2008 24: i295-i303. Special issue for ISMB 2008. See also EXACT webpage.
- Riley, M.C., Clare, A. and King, R. D. (2007) Locational distribution of gene functional classes in Arabidopsis thaliana. BMC Bioinformatics 8:112
- Blockeel, H., Schietgat, L., Struyf, J., Dzeroski, S., Clare, A. (2006) Decision Trees for Hierarchical Multilabel Classification: A Case Study in Functional Genomics. In proceedings of PKDD 2006.
- Soldatova, L., Clare, A., Sparkes, A. and King, R. D. (2006) An ontology for a robot scientist. Bioinformatics 2006 22: 464-471. Also in ISMB 2006. Archived in CADAIR here.
- Clare, A., Karwath, A., Ougham, H. and King, R. D. (2006) Functional Bioinformatics for Arabidopsis thaliana. Bioinformatics 2006 22: 1130-1136
- Struyf, J., Dzeroski, S., Blockeel, H. and Clare, A. (2005) Hierarchical Multi-classification with Predictive Clustering Trees in Functional Genomics. In proceedings of the EPIA 2005 CMB Workshop. Springer link
- Clare, A. (2005) Integration of genomic and phenotypic data. Data Analysis and Visualization in Genomics and Proteomics, Eds. Francisco Azuaje and Joaquin Dopazo, Wiley, London. ISBN: 0-470-09439-7
- Clare, A., Williams, H. E. and Lester, N. (2004) Scalable multi-relational association mining. In proceedings of the 4th IEEE International Conference on Data Mining (ICDM '04). p355-358. abstract, software
- King, R. D., Wise, P. H. and Clare, A. (2004) Confirmation of Data Mining Based Predictions of Protein Function. Bioinformatics 20(7) 1110-1118, abstract, genepredictions.org
- Clare, A. and King, R. D. (2003) Predicting gene function in Saccharomyces cerevisiae. ECCB 2003 (published as a journal supplement in Bioinformatics 19: ii42-ii49), abstract
- Clare, A. (2003) Machine learning and data mining for yeast functional genomics. PhD thesis. University of Wales Aberystwyth. pdf (1Mb). This was a runner-up in the 2004 BCS Distinguished Dissertations Award.
- Clare, A. and King, R.D. (2003) Data mining the yeast genome in a lazy functional language. In Practical Aspects of Declarative Languages (PADL'03) (won Best/Most Practical Paper award), abstract, pdf
- Clare, A. and King, R.D. (2002) How well do we understand the clusters found in microarray data? In Silico Biol. 2, 0046, abstract, html, further data
- Clare, A. and King, R.D. (2002) Machine learning of functional class from phenotype data. Bioinformatics 18(1) 160-166. abstract, pdf, further data
- Clare, A. and King, R.D. (2001) Knowledge Discovery in Multi-Label Phenotype Data. In proceedings of ECML/PKDD 2001. abstract, pdf, further data, code
- King, R.D., Karwath, A., Clare, A., & Dehaspe, L. (2001) The Utility of Different Representations of Protein Sequence for Predicting Functional Class. Bioinformatics 17(5) 445-454. abstract, pdf, further data
- King, R.D., Karwath, A., Clare, A., & Dehaspe, L. (2000) Accurate prediction of protein functional class in the M. tuberculosis and E. coli genomes using data mining. Comparative and Functional Genomics 17 283-293 (nb: volume 1 of CFG was volume 17 of Yeast). actual article, preprint postscript, further data
- King, R.D., Karwath, A., Clare, A., & Dehaspe, L. (2000) prediction of protein functional class from sequence using data mining. In: The Sixth International Conference on Knowledge Discovery and Data Mining (KDD 2000). pdf, further data
- Rose, T., Elworthy, D., Kotcheff, A., Clare, A., Tsonis, P. (2000) ANVIL: a system for the retrieval of captioned images using NLP techniques. In Challenge of Image Retrieval, Brighton, 2000. gzipped doc
Previously I held an RAEng/EPSRC Research Fellowship to "Engineer the Intelligent Scientific Laboratory". This project involved work on the Robot Scientist project, where intelligent software created scientific hypotheses, designed experiments to distinguish between these hypotheses, controlled a lab robot to conduct these experiments, and then used the results to design the next round of experiments. There were many aspects to the work on this project, including data formalism, experimental protocols, data collection, inference and querying, planning and scheduling, and the practicalities of working in a real lab with real automation.
Before this I held an 1851 Research Fellowship to investigate Grid-enabling lab robots for the Robot Scientist. This was a two-year project, Oct 2004 to Sep 2006.
Previously, as a postdoc on a BBSRC-funded grant and as a PhD student, I used machine learning (including ILP) and data mining (particularly multi-relational associations) for functional genomics: elucidating the biological functions of the parts of a genome. When a genome is sequenced, and we have the predicted locations of the genes within the genome, the next stage is to work out the possible functions of these genes. We've been looking at genes in Saccharomyces cerevisiae and Arabidopsis thaliana, the first plant genome to be sequenced.
for yeast and Arabidopsis are available.
This has involved looking at ways to make use of different kinds of data, including microarray data, sequence statistics, homology data, predicted secondary structure, QTLs, and phenotypic data. It has also involved finding ways to make use of background information and hierarchical information, and to take into account that proteins can have more than one function, which makes this a classification problem where each item fits into more than one class.
I've also spent 3 months working with RMIT's Search Engine Group making a multi-relational data mining tool (Radar) based on inverted indexing.
Old but still accessible data sets
Back to Amanda Clare