Publications by Stephen D Pickett | LitMetric

Publications by authors named "Stephen D Pickett"

Benchmarking 3D Structure-Based Molecule Generators.

Natasha Sanjrani, Damien E Coupry, Peter Pogány, David S Palmer, Stephen D Pickett

J Chem Inf Model

August 2025

To understand the benefits and drawbacks of 3D combinatorial and deep learning generators, a novel benchmark was created focusing on the recreation of important protein-ligand interactions and 3D ligand conformations. Using the BindingMOAD data set with a hold-out blind set, the sequential graph neural network generators, Pocket2Mol and PocketFlow, diffusion models, DiffSBDD and MolSnapper, and combinatorial genetic algorithms, AutoGrow4 and LigBuilderV3, were evaluated. It was discovered that deep learning methods fail to generate structurally valid molecules and 3D conformations, whereas combinatorial methods are slow and generate molecules that are prone to failing 2D MOSES filters.

Exploring BERT for Reaction Yield Prediction: Evaluating the Impact of Tokenization, Molecular Representation, and Pretraining Data Augmentation.

Adrian Krzyzanowski, Stephen D Pickett, Peter Pogány

J Chem Inf Model

May 2025

Predicting reaction yields in synthetic chemistry remains a significant challenge. This study systematically evaluates the impact of tokenization, molecular representation, pretraining data, and adversarial training on a BERT-based model for yield prediction of Buchwald-Hartwig and Suzuki-Miyaura coupling reactions using publicly available HTE data sets. We demonstrate that molecular representation choice (SMILES, DeepSMILES, SELFIES, Morgan fingerprint-based notation, IUPAC names) has minimal impact on model performance, while typically BPE and SentencePiece tokenization outperform other methods.

Visualising lead optimisation series using reduced graphs.

Jessica Stacey, Baptiste Canault, Stephen D Pickett, Valerie J Gillet

J Cheminform

April 2025

The typical way in which lead optimisation (LO) series are represented in the medicinal chemistry literature is as Markush structures and associated R-group tables. The Markush structure shows a central core or molecular scaffold that is common to the series with R groups that indicate the points of variability that have been explored in the series. The associated R-group table shows the substituent combinations that exist in individual molecules in the series together with properties of those compounds.

bbSelect - An Open-Source Tool for Performing a 3D Pharmacophore-Driven Diverse Selection of R-Groups.

Francesco Rianjongdee, David Palmer, Stephen D Pickett, Peter Pogány, Nicholas C O Tomkinson

J Chem Inf Model

June 2024

The design of compounds during hit-to-lead often seeks to explore a vector from a core scaffold to form additional interactions with the target protein. A rational approach to this is to probe the region of a protein accessed by a vector with a systematic placement of pharmacophore features in 3D, particularly when bound structures are not available. Herein, we present bbSelect, an open-source tool built to map the placements of pharmacophore features in 3D Euclidean space from a library of R-groups, employing partitioning to drive a diverse and systematic selection to a user-defined size.

Blinded Predictions and Post Hoc Analysis of the Second Solubility Challenge Data: Exploring Training Data and Feature Set Selection for Machine and Deep Learning Models.

Jonathan G M Conn, James W Carter, Justin J A Conn, Vigneshwari Subramanian, Andrew Baxter, Stephen D Pickett

J Chem Inf Model

February 2023

Accurate methods to predict solubility from molecular structure are highly sought after in the chemical sciences. To assess the state of the art, the American Chemical Society organized a "Second Solubility Challenge" in 2019, in which competitors were invited to submit blinded predictions of the solubilities of 132 drug-like molecules. In the first part of this article, we describe the development of two models that were submitted to the Blind Challenge in 2019 but which have not previously been reported.

Alchemical Free Energy Methods Applied to Complexes of the First Bromodomain of BRD4.

Ellen E Guest, Luis F Cervantes, Stephen D Pickett, Charles L Brooks, Jonathan D Hirst

J Chem Inf Model

March 2022

Accurate and rapid predictions of the binding affinity of a compound to a target are one of the ultimate goals of computer aided drug design. Alchemical approaches to free energy estimations follow the path from an initial state of the system to the final state through alchemical changes of the energy function during a molecular dynamics simulation. Herein, we explore the accuracy and efficiency of two such techniques: relative free energy perturbation (FEP) and multisite lambda dynamics (MSλD).

Structural variation of protein-ligand complexes of the first bromodomain of BRD4.

Ellen E Guest, Stephen D Pickett, Jonathan D Hirst

Org Biomol Chem

June 2021

The bromodomain-containing protein 4 (BRD4), a member of the bromodomain and extra-terminal domain (BET) family, plays a key role in several diseases, especially cancers. With increased interest in BRD4 as a therapeutic target, many X-ray crystal structures of the protein in complex with small molecule inhibitors have become publicly available over the past decade. In this study, we use this structural information to investigate the conformations of the first bromodomain (BD1) of BRD4.

A Turing Test for Molecular Generators.

Jacob T Bush, Peter Pogany, Stephen D Pickett, Mike Barker, Andrew Baxter

J Med Chem

October 2020

Machine learning approaches promise to accelerate and improve success rates in medicinal chemistry programs by more effectively leveraging available data to guide molecular design. A key step of an automated computational design algorithm is molecule generation, where the machine is required to design high-quality, drug-like molecules within the appropriate chemical space. Many algorithms have been proposed for molecular generation; however, a challenge is how to assess the validity of the resulting molecules.

Guidelines for Recurrent Neural Network Transfer Learning-Based Molecular Generation of Focused Libraries.

Silvia Amabilino, Peter Pogány, Stephen D Pickett, Darren V S Green

J Chem Inf Model

December 2020

Deep learning approaches have become popular in recent years in the field of molecular design. While a variety of different methods are available, it is still a challenge to assess and compare their performance. A particularly promising approach for automated drug design is to use recurrent neural networks (RNNs) as SMILES generators and train them with the learning procedure called "transfer learning".

De Novo Molecule Design by Translating from Reduced Graphs to SMILES.

Peter Pogány, Navot Arad, Sam Genway, Stephen D Pickett

J Chem Inf Model

March 2019

A key component of automated molecular design is the generation of compound ideas for subsequent filtering and assessment. Recently deep learning approaches have been explored as alternatives to traditional de novo molecular design techniques. Deep learning algorithms rely on learning from large pools of molecules represented as molecular graphs (generally SMILES), and several approaches can be used to tailor the generated molecules to defined regions of chemical space.

Nuisance Compounds, PAINS Filters, and Dark Chemical Matter in the GSK HTS Collection.

Subhas J Chakravorty, James Chan, Marie Nicole Greenwood, Ioana Popa-Burke, Katja S Remlinger, Stephen D Pickett

SLAS Discov

July 2018

High-throughput screening (HTS) hits include compounds with undesirable properties. Many filters have been described to identify such hits. Notably, pan-assay interference compounds (PAINS) has been adopted by the community as the standard term to refer to such filters, and very useful guidelines have been adopted by the American Chemical Society (ACS) and have subsequently triggered a healthy scientific debate about the pitfalls of draconian use of filters.

Design Principles for Fragment Libraries: Maximizing the Value of Learnings from Pharma Fragment-Based Drug Discovery (FBDD) Programs for Use in Academia.

György M Keserű, Daniel A Erlanson, György G Ferenczy, Michael M Hann, Christopher W Murray, Stephen D Pickett

J Med Chem

September 2016

Fragment-based drug discovery (FBDD) is well suited for discovering both drug leads and chemical probes of protein function; it can cover broad swaths of chemical space and allows the use of creative chemistry. FBDD is widely implemented for lead discovery in industry but is sometimes used less systematically in academia. Design principles and implementation approaches for fragment libraries are continually evolving, and the lack of up-to-date guidance may prevent more effective application of FBDD in academia.

Structurally Diverse Mitochondrial Branched Chain Aminotransferase (BCATm) Leads with Varying Binding Modes Identified by Fragment Screening.

Jennifer A Borthwick, Nicolas Ancellin, Sophie M Bertrand, Ryan P Bingham, Paul S Carter, Stephen D Pickett

J Med Chem

March 2016

Inhibitors of mitochondrial branched chain aminotransferase (BCATm), identified using fragment screening, are described. This was carried out using a combination of STD-NMR, thermal melt (Tm), and biochemical assays to identify compounds that bound to BCATm, which were subsequently progressed to X-ray crystallography, where a number of exemplars showed significant diversity in their binding modes. The hits identified were supplemented by searching and screening of additional analogues, which enabled the gathering of further X-ray data where the original hits had not produced liganded structures.

An analysis of the attrition of drug candidates from four major pharmaceutical companies.

Michael J Waring, John Arrowsmith, Andrew R Leach, Paul D Leeson, Sam Mandrell, Stephen D Pickett

Nat Rev Drug Discov

July 2015

The pharmaceutical industry remains under huge pressure to address the high attrition rates in drug development. Attempts to reduce the number of efficacy- and safety-related failures by analysing possible links to the physicochemical properties of small-molecule drug candidates have been inconclusive because of the limited size of data sets from individual companies. Here, we describe the compilation and analysis of combined data on the attrition of drug candidates from AstraZeneca, Eli Lilly and Company, GlaxoSmithKline and Pfizer.

The Discovery of in Vivo Active Mitochondrial Branched-Chain Aminotransferase (BCATm) Inhibitors by Hybridizing Fragment and HTS Hits.

Sophie M Bertrand, Nicolas Ancellin, Benjamin Beaufils, Ryan P Bingham, Jennifer A Borthwick, Stephen D Pickett

J Med Chem

September 2015

The hybridization of hits, identified by complementary fragment and high throughput screens, enabled the discovery of the first series of potent inhibitors of mitochondrial branched-chain aminotransferase (BCATm) based on a 2-benzylamino-pyrazolo[1,5-a]pyrimidinone-3-carbonitrile template. Structure-guided growth enabled rapid optimization of potency with maintenance of ligand efficiency, while the focus on physicochemical properties delivered compounds with excellent pharmacokinetic exposure that enabled a proof of concept experiment in mice. Oral administration of 2-((4-chloro-2,6-difluorobenzyl)amino)-7-oxo-5-propyl-4,7-dihydropyrazolo[1,5-a]pyrimidine-3-carbonitrile 61 significantly raised the circulating levels of the branched-chain amino acids leucine, isoleucine, and valine in this acute study.

QSAR workbench: automating QSAR modeling to drive compound design.

Richard Cox, Darren V S Green, Christopher N Luscombe, Noj Malcolm, Stephen D Pickett

J Comput Aided Mol Des

April 2013

We describe the QSAR Workbench, a system for the building and analysis of QSAR models. The system is built around the Pipeline Pilot workflow tool and provides access to a variety of model building algorithms for both continuous and categorical data. Traditionally, models are built one at a time, and fully exploring the model space of algorithms and descriptor subsets is time-consuming.

Automated Lead Optimization of MMP-12 Inhibitors Using a Genetic Algorithm.

Stephen D Pickett, Darren V S Green, David L Hunt, David A Pardoe, Ian Hughes

ACS Med Chem Lett

January 2011

Traditional lead optimization projects involve long synthesis and testing cycles, favoring extensive structure-activity relationship (SAR) analysis and molecular design steps, in an attempt to limit the number of cycles that a project must run to optimize a development candidate. Microfluidic-based chemistry and biology platforms, with cycle times of minutes rather than weeks, lend themselves to unattended autonomous operation. The bottleneck in the lead optimization process is therefore shifted from synthesis or test to SAR analysis and design.

The impact of aromatic ring count on compound developability: further insights by examining carbo- and hetero-aromatic and -aliphatic ring types.

Timothy J Ritchie, Simon J F Macdonald, Robert J Young, Stephen D Pickett

Drug Discov Today

February 2011

The impact of carboaromatic, heteroaromatic, carboaliphatic and heteroaliphatic ring counts and fused aromatic ring count on several developability measures (solubility, lipophilicity, protein binding, P450 inhibition and hERG binding) is the topic for this review article. Recent results indicate that increasing ring counts have detrimental effects on developability in the order carboaromatics≫heteroaromatics>carboaliphatics>heteroaliphatics, with heteroaliphatics exerting a beneficial effect in many cases. Increasing aromatic ring count exerts effects on several developability parameters that are lipophilicity- and size-independent, and fused aromatic systems have a beneficial effect relative to their nonfused counterparts.

Lead optimization using matched molecular pairs: inclusion of contextual information for enhanced prediction of HERG inhibition, solubility, and lipophilicity.

George Papadatos, Muhammad Alkarouri, Valerie J Gillet, Peter Willett, Visakan Kadirkamanathan, Stephen D Pickett

J Chem Inf Model

October 2010

Previous studies of the analysis of molecular matched pairs (MMPs) have often assumed that the effect of a substructural transformation on a molecular property is independent of the context (i.e., the local structural environment in which that transformation occurs).

Analysis of neighborhood behavior in lead optimization and array design.

George Papadatos, Anthony W J Cooper, Visakan Kadirkamanathan, Simon J F Macdonald, Iain M McLay, Stephen D Pickett

J Chem Inf Model

February 2009

Neighborhood behavior describes the extent to which small structural changes defined by a molecular descriptor are likely to lead to small property changes. This study evaluates two methods for the quantification of neighborhood behavior: the optimal diagonal method of Patterson et al. and the optimality criterion method of Horvath and Jeandenans.

Evolving interpretable structure-activity relationship models. 2. Using multiobjective optimization to derive multiple models.

Kristian Birchall, Valerie J Gillet, Gavin Harper, Stephen D Pickett

J Chem Inf Model

August 2008

A multiobjective evolutionary algorithm (MOEA) is described for evolving multiple structure-activity relationships (SARs). The SARs are encoded in easy-to-interpret reduced graph queries which describe features that are preferentially present in active compounds compared to inactives. The MOEA addresses a limitation associated with many machine learning methods; that is, the inherent tradeoff that exists in recall and precision which is usually handled by combining the two objectives into a single measure with a consequent loss of control.

Evolving interpretable structure-activity relationships. 1. Reduced graph queries.

Kristian Birchall, Valerie J Gillet, Gavin Harper, Stephen D Pickett

J Chem Inf Model

August 2008

A new machine learning method is presented for extracting interpretable structure-activity relationships from screening data. The method is based on an evolutionary algorithm and reduced graphs and aims to evolve a reduced graph query (subgraph) that is present within the active compounds and absent from the inactives. The reduced graph representation enables heterogeneous compounds, such as those found in high-throughput screening data, to be captured in a single representation with the resulting query encoding structure-activity information in a form that is readily interpretable by a chemist.

Contemporary QSAR classifiers compared.

Craig L Bruce, James L Melville, Stephen D Pickett, Jonathan D Hirst

J Chem Inf Model

May 2007

We present a comparative assessment of several state-of-the-art machine learning tools for mining drug data, including support vector machines (SVMs) and the ensemble decision tree methods boosting, bagging, and random forest, using eight data sets and two sets of descriptors. We demonstrate, by rigorous multiple comparison statistical tests, that these techniques can provide consistent improvements in predictive performance over single decision trees. However, within these methods, there is no clearly best-performing algorithm.

Methods for mining HTS data.

Gavin Harper, Stephen D Pickett

Drug Discov Today

August 2006

Data mining is a fast-growing field that is finding application across a wide range of industries. HTS is a crucial part of the drug discovery process at most large pharmaceutical companies. Accurate analysis of HTS data is, therefore, vital to drug discovery.

Training similarity measures for specific activities: application to reduced graphs.

Kristian Birchall, Valerie J Gillet, Gavin Harper, Stephen D Pickett

J Chem Inf Model

September 2006

Reduced graph representations of chemical structures have been shown to be effective in similarity searching applications where they offer comparable performance to other 2D descriptors in terms of recall experiments. They have also been shown to complement existing descriptors and to offer potential to scaffold hop from one chemical series to another. Various methods have been developed for quantifying the similarity between reduced graphs including fingerprint approaches, graph matching, and an edit distance method.
