Gregory Butler

Dr Greg Butler has over 40 years of research experience across diverse disciplines: algorithm development in computational group theory and combinatorics (1974-1992); the first knowledge-based systems for mathematics (1988-1992); software engineering research into reusable object-oriented frameworks and its application to database technology (1990-2012); and bioinformatics (1995-present) covering data management, data modeling, text mining, ontologies, the semantic web, and algorithms.

He leads the development of the bioinformatics platform at the Centre for Structural and Functional Genomics and has been the bioinformatics lead on several large-scale genomics projects covering omics and meta-omics. The genomics projects were vertically integrated, covering all stages: sample cultures; extraction of DNA and RNA; sequencing; expression profiling by cDNA microarray, RNA-Seq, and proteomics; EST and genome assembly; gene prediction; functional annotation; prediction of secreted proteins; expression of proteins; and assaying and characterization of proteins.

Dr Butler is the contact person for Big Data at Concordia University, lobbying for better support of research data archives and computational infrastructure for Big Data and High Performance Computing; promoting adoption of open data, open government, and open source software; and bringing together researchers in databases, data mining, data analytics, operations research, supply chain management, computational finance, urban planning, logistics, transport systems, and environmental monitoring.

Presentations

For New Zealand and Australia 2023:

Deep Learning in Protein Sequence Analysis.
Deep learning has had a profound impact on the analysis of images and text. Beneficial AI co-pilots and harmful AI generation of fake news, speech, photos, and videos are of high interest for business, government, and the public.
There is great interest in AI for potential benefits to life sciences, biomedical research, and healthcare. Already a wide range of benefits can be seen. Here we will focus particularly on our research interests in protein sequence analysis and discuss mainly protein language models and transfer learning. Time permitting, we may briefly touch on advances in prediction of protein 3D structure, protein-substrate docking, and generative techniques of synthesis of novel proteins.
1953-1990: My Time at Sydney University.
I will discuss my deep connections with Sydney University and highlight my work in Computational Group Theory (CGT) with John Cannon from 1973 to 1990. Yes, it was 50 years ago that I first worked on CGT with John. That changed my life in many ways.
Talk at the Computational Algebra and Magma Conference, Sydney, Australia, 27-29 November 2023, in honour of the 80th year of John Joseph Cannon.

For Malaysia 2019:

The TooT Suite Project: Predicting and classifying membrane and transport proteins to study host-microbiome interactions.
The TooT Suite project, funded by Genome Canada and Genome Quebec, is developing machine learning classifiers for membrane and transport proteins in order to better understand interactions between microbial communities and their host organism. Our EPRSC methodology guides the development of classifiers for protein sequence analysis by considering the types of information that can be derived from the primary protein sequence: evolution, position, region, sequence, and composition.
This talk will present the TooT Suite project, the EPRSC methodology, and applications of the methodology.
Big Data: How Will It Change Your Life?
The impact of Big Data on business, science, healthcare, education, and government increases the need for people to understand what Big Data is, what some of its consequences are, and what issues they need to address to avoid potential harm from it.
This talk will describe my experiences in various groups discussing Big Data and its impact, and what I have learned.
Data-Driven Science and Health: Being F.A.I.R.
The impact of Big Data on science and healthcare increases the need for reproducible results that others can trust. The debate on how best to support Open Science for reproducibility has led to the establishment of the FAIR guidelines, which require all data, software, processes, and analyses to be Findable, Accessible, Interoperable, and Reusable. What does this mean for researchers and universities?
Our TooT Suite project, funded by Genome Canada and Genome Quebec, is required to be FAIR. The project is developing bioinformatics tools to better understand interactions between microbial communities and their host organism. This talk will present the FAIR guidelines and how we are addressing them in our project.
How I Pick a Research Project
A talk to the Computer Science Journal Club at USM.
I look for: a topic that interests me; a problem that is challenging; a problem that fits with my larger research agenda of how computing leads from data to knowledge, and the application of knowledge; collaborators that I like to work with.

For Sydney 2019:

TooT-T: Discrimination of transport proteins from non-transport proteins
