Compositional data are characterized by the fact that their elemental information is contained in simple pairwise logratios of the parts that constitute the composition. While pairwise logratios are typically easy to interpret, the number of possible pairs to consider quickly becomes too large even for medium-sized compositions, which may hinder interpretability in further multivariate analysis. Sparse methods can therefore be useful for identifying a few important pairwise logratios (and parts contained in them) from the total candidate set.
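To make the combinatorial growth mentioned above concrete, here is a minimal sketch of computing all pairwise logratios of a single composition. The function names and the example composition are illustrative assumptions, not taken from the article, and the sparse selection method itself is not shown.

```python
import math
from itertools import combinations


def pairwise_logratios(composition):
    """Return {(i, j): log(x_i / x_j)} for all index pairs i < j.

    Assumes every part is strictly positive (zeros need separate treatment).
    """
    if any(x <= 0 for x in composition):
        raise ValueError("all parts must be strictly positive")
    return {
        (i, j): math.log(composition[i] / composition[j])
        for i, j in combinations(range(len(composition)), 2)
    }


def n_pairs(d):
    """Number of distinct pairwise logratios for a composition with d parts."""
    return d * (d - 1) // 2


# Hypothetical 4-part composition (proportions summing to 1).
x = [0.1, 0.2, 0.3, 0.4]
logratios = pairwise_logratios(x)
```

Because only ratios enter, rescaling the composition leaves every logratio unchanged. The count grows quadratically: `n_pairs(50)` already gives 1225 candidate logratios, which is why selecting a small subset is attractive.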
J Archaeol Method Theory
December 2024
Unlabelled: The expansion of the Neolithic way of life triggered the most profound changes in peoples' socioeconomic behaviors, including how critical resources for everyday life were managed. Recent research spearheaded by ancient DNA analysis has greatly contributed to our understanding of the main direction of Neolithisation spreading from western Anatolia into central Europe. Due to the diverse processes involved in Neolithisation, which resulted in a high diversity of regional and local phenomena, the underlying mechanisms of these developments are still largely unexplored.
Modern science and industry rely on computational models for simulation, prediction, and data analysis. Spatial blind source separation (SBSS) is a model used to analyze spatial data. Designed explicitly for spatial data analysis, it is superior to popular non-spatial methods, like PCA.
Anal Chim Acta
October 2023
Data sets derived from practical experiments often pose challenges for (robust) statistical methods. In high-dimensional data sets, more variables than observations are recorded and often, there are also data present that do not follow the structure of the data majority. In order to handle such data with outlying observations, a variety of robust regression and classification methods have been developed for low-dimensional data.
Compositional data are commonly known as multivariate observations carrying relative information. Even though the case of vector or even two-factorial compositional data (compositional tables) is already well described in the literature, there is still a need for a comprehensive approach to the analysis of multi-factorial relative-valued data. Therefore, this contribution builds around the current knowledge about compositional data a general theoretical framework for -factorial compositional data.
Introduction: We investigated whether governmental measures and lockdowns during the COVID-19 pandemic had an impact on the number and histopathologic stages of melanoma.
Methods: The number and thickness (Breslow) of all diagnosed melanomas per day, month, or period at the 'Institute for Pathology in the Centre' in 2019 and 2020 were compared. For 2020, we defined four time periods: Period 1: 1 January-15 March; Period 2: 16 March-15 May (Lockdown 1); Period 3: 16 May-2 November; Period 4: 3 November-7 December (Lockdown 2).
Many geological phenomena are regularly measured over time to follow developments and changes. For many of these phenomena, the absolute values are not of interest, but rather the relative information, which means that the data are compositional time series. Thus, the serial nature and the compositional geometry should be considered when analyzing the data.
Bioinformatics
November 2021
Motivation: High-throughput sequencing technologies generate a huge amount of data, permitting the quantification of microbiome compositions. The obtained data are essentially sparse compositional data vectors, namely vectors of bacterial gene proportions which compose the microbiome. Subsequently, the need for statistical and computational methods that consider the special nature of microbiome data has increased.
The instrument COSIMA (COmetary Secondary Ion Mass Analyzer) on board the European Space Agency mission Rosetta collected and analyzed dust particles in the neighborhood of comet 67P/Churyumov-Gerasimenko. The chemical composition of the particle surfaces was characterized by time-of-flight secondary ion mass spectrometry. A set of 2213 spectra has been selected, and relative abundances for CH-containing positive ions as well as positive elemental ions define a set of multivariate data with nine variables.
Data outliers can carry very valuable information and might be most informative for the interpretation. Nevertheless, they are often neglected. An algorithm called cellwise outlier diagnostics using robust pairwise log ratios (cell-rPLR) for the identification of outliers in single cells of a data matrix is proposed.
Sci Total Environ
March 2019
Sewage sludge (SS) reuse in forest plantations as soil fertilizer/amendment has tremendously increased in recent years. However, SS may have high concentrations of potentially toxic elements (PTE), representing a potential risk for soil and the whole ecosystem. This paper aimed to assess the toxicity of PTE in infertile tropical soils amended with SS in a commercial Eucalyptus plantation, with an integrated approach combining: i) the use of a battery of bioassays (Daphnia magna, Pseudokirchneriella subcapitata, Lactuca sativa, and Allium cepa); and ii) the evaluation of some PTE (Cd, Cr, Cu, Fe, Mn, Ni, Pb, and Zn) and their availability in the pedoenvironment.
Although Scandinavian flint is one of the most important materials used for prehistoric stone tool production in Northern and Central Europe, a conclusive method for securely differentiating between flint sources, geologically bound to northern European chalk formations, has never been achieved. The main problems with traditional approaches concern the oftentimes high similarities of SiO2 raw materials (i.e.
Sardinia (Italy), the second largest island of the Mediterranean Sea, is a fire-prone land. Most Sardinian environments over time were shaped by fire, but some of them are too intrinsically fragile to withstand the currently increasing fire frequency. Calcareous pedoenvironments represent a significant part of Mediterranean areas, and require important efforts to prevent long-lasting degradation from fire.
Geochemical element separation is studied in 14 different sample media collected at 41 sites along an approximately 100-km long transect north of Oslo. At each site, soil C and O horizons and 12 plant materials (birch/spruce/cowberry/blueberry leaves/needles and twigs, horsetail, bracken fern, pine bark and terrestrial moss) were sampled. The observed concentrations of 29 elements (K, Ca, P, Mg, Mn, S, Fe, Zn, Na, B, Cu, Mo, Co, Al, Ba, Rb, Sr, Ti, Ni, Pb, Cs, Cd, Ce, Sn, La, Tl, Y, Hg, Ag) were used to investigate soil-plant relations, and to evaluate the element differentiation between different plants, or between foliage and twigs of the same plant.
Compositional data analysis refers to analyzing relative information, based on ratios between the variables in a data set. Data from epidemiology are usually treated as absolute information in an analysis. We outline the differences in both approaches for univariate and multivariate statistical analyses, using illustrative data sets from Austrian districts.
The guidelines for setting environmental quality standards are increasingly based on probabilistic risk assessment due to a growing general awareness of the need for probabilistic procedures. One of the commonly used tools in probabilistic risk assessment is the species sensitivity distribution (SSD), which represents the proportion of species affected belonging to a biological assemblage as a function of exposure to a specific toxicant. Our focus is on the inverse use of the SSD curve with the aim of estimating the concentration, HCp, of a toxic compound that is hazardous to p% of the biological community under study.
While analyzing chromatographic data, it is necessary to preprocess it properly before exploration and/or supervised modeling. To make chromatographic signals comparable, it is crucial to remove the scaling effect, caused by differences in overall sample concentrations. One of the efficient methods of signal scaling is Probabilistic Quotient Normalization (PQN) [1].
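As a rough illustration of the scaling idea behind PQN, the sketch below divides each signal by the median of its pointwise quotients against a reference signal. It assumes the variable-wise median signal as the reference and omits the preliminary total-area normalization that is often applied first; the function name and the toy data are hypothetical and do not come from the article.

```python
from statistics import median


def pqn_normalize(signals, reference=None):
    """Scale each signal by the median quotient against a reference signal.

    signals: list of equal-length lists of positive intensities.
    reference: optional reference signal; if omitted, the variable-wise
    median across all signals is used (an assumed, common choice).
    """
    if reference is None:
        reference = [median(column) for column in zip(*signals)]
    normalized = []
    for signal in signals:
        # Quotients of the signal against the reference, skipping empty bins.
        quotients = [s / r for s, r in zip(signal, reference) if r > 0]
        factor = median(quotients)  # most probable dilution factor
        normalized.append([s / factor for s in signal])
    return normalized


# Toy data: the same profile measured at three overall concentrations.
base = [1.0, 2.0, 4.0, 8.0]
signals = [base, [2.0 * v for v in base], [0.5 * v for v in base]]
normalized = pqn_normalize(signals)
```

In this toy case all three rows collapse onto the same profile after normalization, since they differ only by an overall scaling factor. Using the median of the quotients rather than the total signal makes the estimated factor less sensitive to a few peaks that change strongly between samples.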
Background: In the field of root biology there has been remarkable progress in root phenotyping, which is the efficient acquisition and quantitative description of root morphology. What is currently missing are means to efficiently explore, exchange and present the massive amount of acquired, and often time-dependent, root phenotypes.
Results: In this work, we present visual summaries of root ensembles by aggregating root images with identical genetic characteristics.
Int J Biometeorol
July 2017
Long-term changes of plant phenological phases determined by complex interactions of environmental factors are the focus of recent climate impact research. There is a lack of studies on the comparison of biogeographical regions in Europe in terms of plant responses to climate. We examined the flowering phenology of plant species to identify the spatio-temporal patterns in their responses to environmental variables over the period 1970-2010.
Compositional data, as they typically appear in geochemistry in terms of concentrations of chemical elements in soil samples, need to be expressed in log-ratio coordinates before applying the traditional statistical tools if the relative structure of the data is of primary interest. There are different possibilities for this purpose, like centered log-ratio coefficients, or isometric log-ratio coordinates. In both approaches, geometric means of the compositional parts are involved, and it is unclear how measurement errors or detection limit problems affect their presentation in coordinates.
Front Hum Neurosci
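A short sketch of the centered log-ratio (clr) coefficients shows why an error in a single part matters: each coefficient is the log of a part divided by the geometric mean of all parts, so a change in one part shifts every coefficient. The example values are hypothetical, and isometric log-ratio coordinates are not covered here.

```python
import math


def clr(composition):
    """Centered log-ratio coefficients: log(x_i / g(x)), g = geometric mean.

    Assumes all parts are strictly positive.
    """
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


# Hypothetical composition, and the same composition with an error in part 4.
x = [0.1, 0.2, 0.3, 0.4]
x_err = [0.1, 0.2, 0.3, 0.5]

coef = clr(x)
coef_err = clr(x_err)

# Shift in each coefficient caused by the error in the last part only.
shifts = [b - a for a, b in zip(coef, coef_err)]
```

Although only the last part was altered, the first three coefficients all move by the same amount, namely minus the change in the log geometric mean. The clr coefficients always sum to zero, which is a quick sanity check on any implementation.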
August 2014
Machine learning classifiers have become increasingly popular tools to generate single-subject inferences from fMRI data. With this transition from the traditional group level difference investigations to single-subject inference, the application of machine learning methods can be seen as a considerable step forward. Existing studies, however, have given scarce or no information on the generalizability to other subject samples, limiting the use of such published classifiers in other research projects.
In order to assess whole-brain resting-state fluctuations at a wide range of frequencies, resting-state fMRI data of 20 healthy subjects were acquired using a multiband EPI sequence with a low TR (354 ms) and compared to 20 resting-state datasets from standard, high-TR (1800 ms) EPI scans. The spatial distribution of fluctuations in various frequency ranges is analyzed along with the spectra of the time-series in voxels from different regions of interest. Functional connectivity specific to different frequency ranges (<0.
IEEE Trans Vis Comput Graph
December 2013
Model selection in time series analysis is a challenging task for domain experts in many application areas such as epidemiology, economy, or environmental sciences. The methodology used for this task demands a close combination of human judgement and automated computation. However, statistical software tools do not adequately support this combination through interactive visual interfaces.