Filtern
Erscheinungsjahr
Dokumenttyp
- Konferenzveröffentlichung (216) (entfernen)
Sprache
- Englisch (216) (entfernen)
Schlagworte
- Bionik (3)
- Gespenstschrecken (3)
- Haftorgan (3)
- adhesion (3)
- stick insects (3)
- Competency-Oriented Exams (2)
- Field measurement (2)
- Solar modules (2)
- 360° Panorama (1)
- AEM-Electrolysis (1)
Institut
- Westfälisches Institut für Gesundheit (49)
- Institut für Internetsicherheit (45)
- Westfälisches Energieinstitut (24)
- Informatik und Kommunikation (21)
- Maschinenbau Bocholt (20)
- Elektrotechnik und angewandte Naturwissenschaften (19)
- Wirtschaft und Informationstechnik Bocholt (7)
- Institut für biologische und chemische Informatik (6)
- Fachbereiche (2)
- Institut Arbeit und Technik (2)
An automated pipeline for comprehensive calculation of intermolecular interaction energies based on molecular force-fields using the Tinker molecular modelling package is presented. Starting with non-optimized chemically intuitive monomer structures, the pipeline allows the approximation of global minimum energy monomers and dimers, configuration sampling for various monomer-monomer distances, estimation of coordination numbers by molecular dynamics simulations, and the evaluation of differential pair interaction energies. The latter are used to derive Flory-Huggins parameters and isotropic particle-particle repulsions for Dissipative Particle Dynamics (DPD). The computational results for force fields MM3, MMFF94, OPLSAA and AMOEBA09 are analyzed with Density Functional Theory (DFT) calculations and DPD simulations for a mixture of the non-ionic polyoxyethylene alkyl ether surfactant C10E4 with water to demonstrate the usefulness of the approach.
Inspired by the super-human performance of deep learning models in playing the game of Go after being presented with virtually unlimited training data, we looked into areas in chemistry where similar situations could be achieved. Encountering large amounts of training data in chemistry is still rare, so we turned to two areas where realistic training data can be fabricated in large quantities, namely a) the recognition of machine-readable structures from images of chemical diagrams and b) the conversion of IUPAC(-like) names into structures and vice versa. In this talk, we outline the challenges, technical implementation and results of this study.
Optical Chemical Structure Recognition (OCSR): Vast amounts of chemical information remain hidden in the primary literature and have yet to be curated into open-access databases. To automate the process of extracting chemical structures from scientific papers, we developed the DECIMER.ai project. This open-source platform provides an integrated solution for identifying, segmenting, and recognising chemical structure depictions in scientific literature. DECIMER.ai comprises three main components: DECIMER-Segmentation, which utilises a Mask-RCNN model to detect and segment images of chemical structure depictions; DECIMER-Image Classifier EfficientNet-based classification model identifies which images contain chemical structures and DECIMER-Image Transformer which acts as an OCSR engine which combines an encoder-decoder model to convert the segmented chemical structure images into machine-readable formats, like the SMILES string.
DECIMER.ai is data-driven, relying solely on the training data to make accurate predictions without hand-coded rules or assumptions. The latest model was trained with 127 million structures and 483 million depictions (4 different per structure) on Google TPU-V4 VMs
Name to Structure Conversion: The conversion of structures to IUPAC(-like) or systematic names has been solved algorithmically or rule-based in satisfying ways. This fact, on the other side, provided us with an opportunity to generate a name-structure training pair at a very large scale to train a proof-of-concept transformer network and evaluate its performance.
In this work, the largest model was trained using almost one billion SMILES strings. The Lexichem software utility from OpenEye was employed to generate the IUPAC names used in the training process. STOUT V2 was trained on Google TPU-V4 VMs. The model's accuracy was validated through one-to-one string matching, BLEU scores, and Tanimoto similarity calculations. To further verify the model's reliability, every IUPAC name generated by STOUT V2 was analysed for accuracy and retranslated using OPSIN, a widely used open-source software for converting IUPAC names to SMILES. This additional validation step confirmed the high fidelity of STOUT V2's translations.
The DECIMER.ai Project
(2024)
Over the past few decades, the number of publications describing chemical structures and their metadata has increased significantly. Chemists have published the majority of this information as bitmap images along with other important information as human-readable text in printed literature and have never been retained and preserved in publicly available databases as machine-readable formats. Manually extracting such data from printed literature is error-prone, time-consuming, and tedious. The recognition and translation of images of chemical structures from printed literature into machine-readable format is known as Optical Chemical Structure Recognition (OCSR). In recent years, deep-learning-based OCSR tools have become increasingly popular. While many of these tools claim to be highly accurate, they are either unavailable to the public or proprietary. Meanwhile, the available open-source tools are significantly time-consuming to set up. Furthermore, none of these offers an end-to-end workflow capable of detecting chemical structures, segmenting them, classifying them, and translating them into machine-readable formats.
To address this issue, we present the DECIMER.ai project, an open-source platform that provides an integrated solution for identifying, segmenting, and recognizing chemical structure depictions within the scientific literature. DECIMER.ai comprises three main components: DECIMER-Segmentation, which utilizes a Mask-RCNN model to detect and segment images of chemical structure depictions; DECIMER-Image Classifier EfficientNet-based classification model identifies which images contain chemical structures and DECIMER-Image Transformer which acts as an OCSR engine which combines an encoder-decoder model to convert the segmented chemical structure images into machine-readable formats, like the SMILES string.
A key strength of DECIMER.ai is that its algorithms are data-driven, relying solely on the training data to make accurate predictions without any hand-coded rules or assumptions. By offering this comprehensive, open-source, and transparent pipeline, DECIMER.ai enables automated extraction and representation of chemical data from unstructured publications, facilitating applications in chemoinformatics and drug discovery.
In the realm of digital situational awareness during disaster situations, accurate digital representations,
like 3D models, play an indispensable role. To ensure the
safety of rescue teams, robotic platforms are often deployed
to generate these models. In this paper, we introduce an
innovative approach that synergizes the capabilities of compact Unmaned Arial Vehicles (UAVs), smaller than 30 cm, equipped with 360° cameras and the advances of Neural Radiance Fields (NeRFs). A NeRF, a specialized neural network, can deduce a 3D representation of any scene using 2D images and then synthesize it from various angles upon request. This method is especially tailored for urban environments which have experienced significant destruction, where the structural integrity of buildings is compromised to the point of barring entry—commonly observed post-earthquakes and after severe fires. We have tested our approach through recent post-fire scenario, underlining the efficacy of NeRFs even in challenging outdoor environments characterized by water, snow, varying light conditions, and reflective surfaces.
In this paper, we present a method for detecting objects of interest, including cars, humans, and fire, in aerial images captured by unmanned aerial vehicles (UAVs) usually during vegetation fires. To achieve this, we use artificial neural networks and create a dataset for supervised learning. We accomplish the assisted labeling of the dataset through the implementation of an object detection pipeline that combines classic image processing techniques with pretrained neural networks. In addition, we develop a data augmentation pipeline to augment the dataset with utomatically labeled images. Finally, we evaluate the performance of different neural networks.
Measurement studies are essential for research and industry alike to understand the Web’s inner workings better and help quantify specific phenomena. Performing such studies is demanding due to the dynamic nature and size of the Web. An experiment’s careful design and setup are complex, and many factors might affect the results. However, while several works have independently observed differences in
the outcome of an experiment (e.g., the number of observed trackers) based on the measurement setup, it is unclear what causes such deviations. This work investigates the reasons for these differences by visiting 1.7M webpages with five different measurement setups. Based on this, we build ‘dependency trees’ for each page and cross-compare the nodes in the trees. The results show that the measured trees differ considerably, that the cause of differences can be attributed to specific nodes, and that even identical measurement setups can produce different results.