Publications by authors named "Peter D Wentzell"

Detergent-based workflows incorporating sodium dodecyl sulfate (SDS) necessitate additional steps for detergent removal ahead of mass spectrometry (MS). These steps may lead to variable protein recovery, inconsistent enzyme digestion efficiency, and unreliable MS signals. To validate a detergent-based workflow for quantitative proteomics, we herein evaluate the precision of a bottom-up sample preparation strategy incorporating cartridge-based protein precipitation with organic solvent to deplete SDS.
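
Precision in such an evaluation typically comes down to relative standard deviations across replicate preparations. A minimal sketch of that calculation (hypothetical peptide intensities, not the study's data or pipeline):

```python
import numpy as np

# Hypothetical intensities: rows = replicate preparations, columns = peptides.
intensities = np.array([
    [1.02e6, 3.4e5, 8.8e4],
    [0.98e6, 3.1e5, 9.2e4],
    [1.05e6, 3.6e5, 8.5e4],
])

# Percent coefficient of variation (CV) per peptide across replicates:
cv = 100 * intensities.std(axis=0, ddof=1) / intensities.mean(axis=0)
print(cv)  # low CVs indicate a reproducible sample-preparation workflow
```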

To address the growing concern about honey adulteration in Canada and globally, a quantitative NMR method was developed to analyze 424 honey samples collected across Canada as part of two surveys in 2018 and 2019 led by the Canadian Food Inspection Agency. Based on a robust and reproducible methodology, NMR data were recorded in triplicate on a 700 MHz NMR spectrometer equipped with a cryoprobe, and the data analysis led to the identification and quantification of 33 compounds characteristic of the chemical composition of honey. The high proportion of Canadian honey in the library provided a unique opportunity to apply multivariate statistical methods, including PCA, PLS-DA, and SIMCA, to differentiate Canadian samples from those from the rest of the world.
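
For readers unfamiliar with these chemometric tools, the sketch below applies PCA and PLS-DA to simulated data standing in for the 33-compound profiles (SIMCA omitted for brevity; running scikit-learn's PLSRegression on class labels is a common way to implement PLS-DA):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Simulated stand-in for the NMR library: 100 samples x 33 quantified
# compounds, with class 1 ("Canadian") shifted in a few variables.
X = rng.normal(size=(100, 33))
y = np.repeat([0, 1], 50)
X[y == 1, :5] += 1.0

Xs = StandardScaler().fit_transform(X)

# Unsupervised view: PCA scores for exploratory plotting.
scores = PCA(n_components=2).fit_transform(Xs)
print("PCA scores shape:", scores.shape)

# Supervised view: PLS-DA (PLS regression on class labels).
pls = PLSRegression(n_components=2).fit(Xs, y)
y_hat = (pls.predict(Xs).ravel() > 0.5).astype(int)
print("in-sample accuracy:", (y_hat == y).mean())
```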

A two-step approach to the purification of immunoglobulin G (IgG) from human blood plasma is investigated. The first step is precipitation with cold ethanol, based on the Cohn method with some modification; the second is chromatographic separation on DEAE-Sepharose FF resin, a weak anion exchanger. The presence of an interferent in region 3 of the chromatographic fractions, which co-elutes with IgG, restricts the application of the mechanistic chromatography model.

Multivariate data analysis tools have become an integral part of modern analytical chemistry, and principal component analysis (PCA) is perhaps foremost among these. PCA is central in approaching many problems in data exploration, classification, calibration, modelling, and curve resolution. However, PCA is only one form of a broader group of factor analysis (FA) methods that are rarely employed by chemists.
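
To make the PCA/FA distinction concrete, here is a small comparison on simulated data (not from the article): PCA seeks directions of maximal variance with no explicit error model, while classical factor analysis additionally estimates a per-variable noise variance.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(1)
# Two latent factors observed through 10 variables with unequal noise levels.
F = rng.normal(size=(200, 2))
L = rng.normal(size=(2, 10))
noise = rng.normal(size=(200, 10)) * np.linspace(0.1, 1.0, 10)
X = F @ L + noise

# PCA: directions of maximal variance; no explicit error model.
pca = PCA(n_components=2).fit(X)
print("PCA explained variance ratios:", pca.explained_variance_ratio_.round(2))

# Factor analysis: fits loadings plus a per-variable noise variance, so
# high-noise variables are down-weighted rather than dominating the model.
fa = FactorAnalysis(n_components=2).fit(X)
print("FA estimated noise variances:", fa.noise_variance_.round(2))
```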

Article Synopsis
  • kPPA (kurtosis-based projection pursuit analysis) is an advanced data visualization technique for multivariate data, particularly effective for binary datasets, and provides better class separation than traditional methods such as PCA.
  • When multiple classifications are possible, kPPA may not yield the most relevant projection, but its optimization algorithm allows various local minima to be explored in search of better visualizations.
  • The new method, CombPPA, uses Procrustes rotation to explore combinations of projections; its application to grape juice samples showcases its ability to reveal the desired class separations and improved kPPA solutions (a minimal Procrustes sketch follows below).
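
A minimal sketch of the Procrustes step (generic orthogonal Procrustes via SciPy on simulated scores; not the full CombPPA algorithm):

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(2)
# Two hypothetical 2-D projections (scores) of the same samples, e.g. from
# different local minima of a projection pursuit optimization.
A = rng.normal(size=(60, 2))
R_true = np.array([[0.0, -1.0], [1.0, 0.0]])      # a 90-degree rotation
B = A @ R_true + 0.05 * rng.normal(size=(60, 2))  # rotated, noisy copy

# Orthogonal Procrustes: the rotation that best maps A onto B (least squares).
R, scale = orthogonal_procrustes(A, B)
residual = np.linalg.norm(A @ R - B)
print("recovered rotation:\n", R.round(2), "\nresidual:", round(residual, 3))
```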

Sparse projection pursuit analysis (SPPA), a new approach for the unsupervised exploration of high-dimensional chemical data, is proposed as an alternative to traditional exploratory methods such as principal components analysis (PCA) and hierarchical cluster analysis (HCA). Where traditional methods use variance and distance metrics for data compression and visualization, the proposed method incorporates the fourth statistical moment (kurtosis) to access interesting subspaces that can clarify relationships within complex data sets. The quasi-power algorithm used for projection pursuit is coupled with a genetic algorithm for variable selection to efficiently generate sparse projection vectors that improve the chemical interpretability of the results while at the same time mitigating the problem of overmodeling.
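
As a bare-bones illustration of a kurtosis-based projection index (generic optimizer and simulated data; not the quasi-power or genetic algorithms of the paper):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import kurtosis

rng = np.random.default_rng(3)
# Two simulated clusters in five dimensions: a projection separating them
# is bimodal, which corresponds to low kurtosis.
X = np.vstack([rng.normal(-2, 1, (50, 5)), rng.normal(2, 1, (50, 5))])
X = X - X.mean(axis=0)

def proj_kurtosis(v):
    v = v / np.linalg.norm(v)             # unit-length projection vector
    return kurtosis(X @ v, fisher=False)  # minimize -> bimodal projection

res = minimize(proj_kurtosis, rng.normal(size=5), method="Nelder-Mead")
print("optimized projection kurtosis:", round(proj_kurtosis(res.x), 2))
```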

One of the greatest challenges facing the functional food and natural health product (NHP) industries is sourcing high-quality, functional, natural ingredients for their finished products. Unfortunately, the lack of ingredient standards, modernized analytical methodologies, and industry oversight creates the potential for low quality and, in some cases, deliberate adulteration of ingredients. By exploring a diverse library of NHPs provided by the independent certification organization ISURA, we demonstrated that nuclear magnetic resonance (NMR) spectroscopy provides an innovative solution to authenticate botanicals and ensure the quality and safety of processed foods and manufactured functional ingredients.

The error covariance matrix (ECM) is an important tool for characterizing the errors from multivariate measurements, representing both the variance and covariance in the errors across multiple channels. Such information is useful in understanding and minimizing sources of experimental error and in the selection of optimal data analysis procedures. Experimental ECMs, normally obtained through replication, are inherently noisy, inconvenient to obtain, and offer limited interpretability.
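
For reference, the replicate-based estimate mentioned here is simply the channel-by-channel covariance across repeated measurements (sketch with simulated spectra exhibiting a correlated baseline error):

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical replicate spectra: 20 replicates x 50 channels with correlated
# noise (a common baseline shift plus independent channel noise).
n_rep, n_chan = 20, 50
baseline = rng.normal(size=(n_rep, 1)) * np.ones((1, n_chan))
R = baseline + 0.2 * rng.normal(size=(n_rep, n_chan))

# Error covariance matrix: covariance of channels across replicates.
# Diagonal = per-channel variance; off-diagonal = covariance between channels.
E = R - R.mean(axis=0)
ECM = E.T @ E / (n_rep - 1)  # same as np.cov(R, rowvar=False)
print(ECM.shape, "mean off-diagonal covariance:",
      ECM[np.triu_indices(n_chan, 1)].mean().round(3))
```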

Most of the current expressions used to calculate figures of merit in multivariate calibration have been derived assuming independent and identically distributed (iid) measurement errors. However, it is well known that this condition is not always valid for real data sets, where the existence of many external factors can lead to correlated and/or heteroscedastic noise structures. In this report, the influence of the deviations from the classical iid paradigm is analyzed in the context of error propagation theory.
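
The central point can be sketched in a few lines: with a regression vector b, the iid assumption gives a prediction-error variance of sigma^2 * b'b, whereas correlated noise with covariance Sigma gives b' Sigma b (a generic illustration, not the paper's expressions):

```python
import numpy as np

rng = np.random.default_rng(5)
n_chan = 30
b = rng.normal(size=n_chan)  # hypothetical regression vector

# Correlated (non-iid) noise: exponentially decaying channel correlation.
idx = np.arange(n_chan)
Sigma = 0.01 * 0.8 ** np.abs(idx[:, None] - idx[None, :])

var_iid = 0.01 * b @ b   # classical iid assumption, sigma^2 = 0.01
var_true = b @ Sigma @ b # error propagation with the full covariance
print(f"iid variance: {var_iid:.4f}  true variance: {var_true:.4f}")
```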

Projection pursuit (PP) is an effective exploratory data analysis tool because it optimizes the projection of high dimensional data using distributional characteristics rather than variance or distance metrics. The recent development of fast and simple PP algorithms based on minimization of kurtosis for clustering data has made this powerful tool more accessible, but under conditions where the sample-to-variable ratio is small, PP fails due to opportunistic overfitting of random correlations to limiting distributional targets. Therefore, some kind of variable compression or data regularization is required in these cases.
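
One simple form of such compression (a sketch of the general idea, not the specific regularization studied) is to project onto a handful of principal components before running projection pursuit:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
# Small sample-to-variable ratio: 30 samples, 500 variables.
X = rng.normal(size=(30, 500))

# Compress to a low-dimensional score space first; projection pursuit then
# searches this space instead of the raw 500 variables, reducing the chance
# of fitting random correlations to the distributional target.
T = PCA(n_components=10).fit_transform(X - X.mean(axis=0))
print("compressed shape for projection pursuit:", T.shape)
```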

A method is described for the characterization of measurement errors with non-uniform variance (heteroscedastic noise) in contiguous signal vectors (e.g., spectra, chromatograms) that does not require the use of replicated measurements.
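
One generic way to see how this is possible (a simplified sketch based on second differences, not necessarily the authors' algorithm): differencing a smooth signal cancels the signal but retains the noise, whose local variance can then be estimated channel by channel.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 1, 500)
signal = np.exp(-((x - 0.5) / 0.05) ** 2)  # smooth peak
sigma = 0.01 + 0.05 * signal               # noise grows with signal
y = signal + sigma * rng.normal(size=x.size)

# Second differences suppress the smooth signal, leaving mostly noise.
d2 = y[:-2] - 2 * y[1:-1] + y[2:]
# For slowly varying sigma, var(d2) ~ 6 * sigma^2 at each channel; smooth a
# squared-difference estimate over a local window to stabilize it.
w = 25
local_var = np.convolve(d2 ** 2, np.ones(w) / w, mode="same") / 6.0
sigma_hat = np.sqrt(local_var)
# Estimates are noisy but track the heteroscedastic profile:
print("true vs estimated sigma at the peak:",
      sigma[250].round(3), sigma_hat[249].round(3))
```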

The differential separation of deuterated and non-deuterated forms of isotopically substituted compounds in chromatography is a well-known but not well-understood phenomenon. This separation is relevant in comparative proteomics, where stable isotopes are used for differential labelling and the effect of isotope resolution on quantitation has been used to disqualify some deuterium labelling methods in favour of heavier isotopes. In this work, a detailed evaluation of the extent of isotopic separation and its impact on quantitation was performed for peptides labelled through dimethylation with H₂/D₂ formaldehyde.

In the analysis of data from high-throughput experiments, information regarding the underlying data structure provides the researcher with confidence in the appropriateness of various analysis methods. One extremely simple but powerful data visualization method is the correlation heat map, whereby correlations between experiments/conditions are calculated and represented using color. In this work, the use of correlation maps to shed light on transcription patterns from DNA microarray time course data prior to gene-level analysis is described.
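
The construction is simple enough to sketch directly (simulated time course; matplotlib for the color rendering):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(8)
# Hypothetical time course: 12 arrays (time points) x 2000 genes, with a slow
# drift so that adjacent time points correlate most strongly.
t = np.arange(12)
X = np.outer(t, rng.normal(size=2000)) / 12 + rng.normal(size=(12, 2000))

C = np.corrcoef(X)  # 12 x 12 array-to-array correlations
plt.imshow(C, cmap="RdBu_r", vmin=-1, vmax=1)
plt.colorbar(label="correlation")
plt.xlabel("time point"); plt.ylabel("time point")
plt.title("Correlation heat map of microarray time points")
plt.show()
```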

The increased need for multiple statistical comparisons under conditions of non-independence in bioinformatics applications, such as DNA microarray data analysis, has led to the development of alternatives to the conventional Bonferroni correction for adjusting P-values. The use of the false discovery rate (FDR), in particular, has grown considerably. However, the calculation of the FDR frequently depends on drawing random samples from a population, and inappropriate sampling will result in a bias in the calculated FDR.
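
For context, the most common FDR-controlling procedure is the Benjamini-Hochberg step-up, contrasted here with Bonferroni (a textbook sketch; the sampling bias discussed in the paper is a separate issue):

```python
import numpy as np

def benjamini_hochberg(p, alpha=0.05):
    """Return a boolean mask of rejected hypotheses at FDR level alpha."""
    p = np.asarray(p)
    m = p.size
    order = np.argsort(p)
    thresh = alpha * (np.arange(1, m + 1) / m)  # BH step-up thresholds
    passed = p[order] <= thresh
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

p_values = np.array([0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205])
print(benjamini_hochberg(p_values))  # rejects two; Bonferroni (p < 0.05/8)
                                     # would reject only the first
```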

NMR-based metabolomics is characterized by high-throughput measurement of the signal intensities of complex mixtures of metabolites in biological samples, typically by assaying biofluids or tissue homogenates. The ultimate goal is to obtain relevant biological information regarding the dissimilarity in pathophysiological conditions that the samples experience. Traditionally, this information has been obtained through the analysis of the measured NMR signals via multivariate statistics.

DNA microarrays permit the measurement of gene expression across the entire genome of an organism, but the quality of the thousands of measurements is highly variable. For spotted dual-color microarrays the situation is complicated by the use of ratio measurements. Studies have shown that measurement errors can be described by multiplicative and additive terms, with the latter dominating for low-intensity measurements.
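
That two-component description is commonly written sigma^2(I) = sigma_add^2 + (sigma_mult * I)^2, so the additive term dominates at low intensity and the relative error levels off at high intensity (illustrative parameter values, not fitted ones):

```python
import numpy as np

sigma_add, sigma_mult = 50.0, 0.10  # illustrative parameters

def measurement_sd(intensity):
    # Two-component model: additive floor plus multiplicative term.
    return np.sqrt(sigma_add**2 + (sigma_mult * intensity) ** 2)

for I in [10, 100, 1000, 10000]:
    sd = measurement_sd(I)
    print(f"I = {I:6d}  sd = {sd:7.1f}  relative sd = {sd / I:6.3f}")
```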

The conceptual simplicity of DNA microarray technology often belies the complex nature of the measurement errors inherent in the methodology. As the technology has developed, the importance of understanding the sources of uncertainty in the measurements and developing ways to control their influence on the conclusions drawn has become apparent. In this review, strategies for modeling measurement errors and minimizing their effect on the outcome of experiments using a variety of techniques are discussed in the context of spotted, dual-color microarrays.

Background: Modeling of gene expression data from time course experiments often involves the use of linear models such as those obtained from principal component analysis (PCA), independent component analysis (ICA), or other methods. Such methods do not generally yield factors with a clear biological interpretation. Moreover, implicit assumptions about the measurement errors often limit the application of these methods to log-transformed data, destroying linear structure in the untransformed expression data.

Here we describe an automated, pressure-driven sampling device for harvesting 10 to 30 ml samples, in replicate, with intervals as short as 10 s. Correlation between biological replicate time courses measured by microarrays was extremely high. The sampler enables sampling at intervals within the range of many important biological processes.

DNA microarrays, or "DNA chips", represent a relatively new technology that is having a profound impact on biology and medicine, yet analytical research into this area is somewhat sparse. This article presents an overview of DNA microarrays and their application to gene expression analysis from the perspective of analytical chemistry, treating aspects of array platforms, measurement, image analysis, experimental design, normalization, and data analysis. Typical approaches are described and unresolved issues are discussed, with a view to identifying some of the contributions that might be made by analytical chemists.

Most cells on earth exist in a quiescent state. In yeast, quiescence is induced by carbon starvation, and exit occurs when a carbon source becomes available. To understand how cells survive in, and exit from, this state, mRNA abundance was examined using oligonucleotide-based microarrays and quantitative reverse transcription-polymerase chain reaction.

Maximum likelihood principal component regression (MLPCR) is an errors-in-variables method used to accommodate measurement error information when building multivariate calibration models. A hindrance of MLPCR has been the substantial demand on computational resources sometimes made by the algorithm, especially for certain types of error structures. Operations on these large matrices are memory intensive and time consuming, especially when techniques such as cross-validation are used.
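
MLPCR builds its calibration on maximum likelihood PCA (MLPCA) of the data; for the simplest case of independent, heteroscedastic errors the model can be fit by alternating weighted least squares, as in this minimal sketch (illustrative only, without the efficiency measures the full algorithm requires for large matrices):

```python
import numpy as np

def mlpca(X, S, p, n_iter=200):
    """Rank-p maximum likelihood PCA for independent, heteroscedastic errors.

    X : (m, n) data; S : (m, n) error standard deviations; p : rank.
    A minimal alternating weighted-least-squares sketch; published MLPCA
    algorithms are more careful about convergence and computational cost.
    """
    m, n = X.shape
    W = 1.0 / S**2                   # per-element weights
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    T = U[:, :p].copy()              # initial scores from ordinary SVD
    for _ in range(n_iter):
        # Update loadings column by column (weighted least squares).
        P = np.empty((n, p))
        for j in range(n):
            Wj = W[:, j]
            P[j] = np.linalg.solve((T * Wj[:, None]).T @ T,
                                   T.T @ (Wj * X[:, j]))
        # Update scores row by row.
        for i in range(m):
            Wi = W[i]
            T[i] = np.linalg.solve((P * Wi[:, None]).T @ P,
                                   P.T @ (Wi * X[i]))
    return T, P

# Tiny demonstration with simulated rank-1 data and uneven noise levels:
rng = np.random.default_rng(9)
X0 = np.outer(rng.normal(size=8), rng.normal(size=6))
S = rng.uniform(0.01, 0.3, size=X0.shape)
T, P = mlpca(X0 + S * rng.normal(size=X0.shape), S, p=1)
print("rank-1 reconstruction error:", np.linalg.norm(T @ P.T - X0).round(3))
```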