He leads the development of the bioinformatics platform at the Centre for Structural and Functional Genomics and has been the bioinformatics lead on several large-scale genomics projects covering omics and meta-omics. The genomics projects were vertically integrated, covering all stages: sample cultures; extraction of DNA and RNA; sequencing; cDNA microarray, RNA-Seq, and proteomics expression profiles; EST and genome assembly; gene prediction; functional annotation; prediction of secreted proteins; and expression, assaying, and characterization of proteins.
Dr Butler is the contact person for Big Data at Concordia University, lobbying for better support of research data archives and computational infrastructure for Big Data and High Performance Computing; promoting adoption of open data, open government, and open source software; and bringing together researchers in databases, data mining, data analytics, operations research, supply chain management, computational finance, urban planning, logistics, transport systems, and environmental monitoring.
For coverage of my publications across bioinformatics, databases, software, and computer algebra, see
I used to have two DBLP entries under Greg Butler and Gregory Butler.
Current research projects:
The TooT Suite project is developing tools for predicting and classifying transport proteins in host organisms and microbiomes. These tools will give scientists insight into host-microbiome interactions. For agricultural scientists, these interactions matter for plant and animal health and growth, which are key factors in food productivity.
Gregory Butler and Tristan Glatard, 2018-2021. A Genome Canada 2017 Bioinformatics and Computational Biology Competition Project
For New Zealand and Australia 2023:
Deep learning has had a profound impact on the analysis of images and text. Beneficial AI co-pilots and harmful AI generation of fake news, speech, photos, and videos are of high interest for business, government, and the public.
There is great interest in the potential benefits of AI for the life sciences, biomedical research, and healthcare, and a wide range of benefits can already be seen. Here we will focus on our research interests in protein sequence analysis and discuss mainly protein language models and transfer learning. Time permitting, we may briefly touch on advances in prediction of protein 3D structure, protein-substrate docking, and generative techniques for the synthesis of novel proteins.
I will discuss my deep connections with Sydney University and highlight my work in Computational Group Theory with John Cannon from 1973 to 1990. Yes, it is 50 years since I first worked on CGT with John. That changed my life in many ways.
Talk at the Computational Algebra and Magma Conference, Sydney, Australia, 27-29 November 2023, in honour of the 80th year of John Joseph Cannon.
For Malaysia 2019:
The TooT Suite project funded by Genome Canada and Genome Quebec is developing machine learning classifiers for membrane and transport proteins in order to better understand interactions between microbial communities and their host organism. Our EPRSC methodology guides the development of classifiers for protein sequence analysis through consideration of information of certain types that can be derived from the primary protein sequence. The types of information are due to evolution, position, region, sequence, and composition.
This talk will present the TooT Suite project, the EPRSC methodology, and applications of the methodology.
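As a small illustration of the "composition" type of information mentioned above, the sketch below computes the amino acid composition of a protein sequence, that is, the fraction of residues belonging to each of the 20 standard amino acids. This is a generic textbook feature, not the TooT Suite implementation; the function name `amino_acid_composition` and the example sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical helper for illustration only. Non-standard symbols
    (for example X or gap characters) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Counter returns 0 for residues that never occur, so every one of the
    # 20 amino acids gets an entry in the resulting feature vector.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Made-up example sequence of ten residues.
composition = amino_acid_composition("MKTAYIAKQR")
```

A fixed-length vector of this kind can be fed to a standard classifier regardless of sequence length; the other information types in the methodology (evolution, position, region, sequence) would need richer encodings than this sketch shows.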
The impact of Big Data in business, science, healthcare, education, and government increases the need for people to understand what Big Data is, what some of its consequences are, and what issues they need to address to avoid potential harm from it.
This talk will tell of my experiences on various groups discussing Big Data and its impact, and what I have learned.
The impact of Big Data in science and healthcare increases the need for reproducibility, so that others can trust published results. The debate on how best to support Open Science for reproducibility has led to the establishment of the FAIR guidelines, which require all data, software, processes, and analyses to be Findable, Accessible, Interoperable, and Reusable. What does this mean for researchers and universities?
Our TooT Suite project, funded by Genome Canada and Genome Quebec, is required to be FAIR. The project is developing bioinformatics tools to better understand interactions between microbial communities and their host organism. This talk will present the FAIR guidelines and how we are addressing them in our project.
A talk to the Computer Science Journal Club at USM.
I look for: a topic that interests me; a problem that is challenging; a problem that fits with my larger research agenda of how computing leads from data to knowledge, and the application of knowledge; collaborators that I like to work with.
For Sydney 2019: