Unsupervised Data Base Clustering Based on Daylight's Fingerprint and Tanimoto Similarity: A Fast and Automated Way To Cluster Small and Large Data Sets

article by Darko Butina published July 1999 in Journal of Chemical Information and Computer Sciences

Unsupervised Data Base Clustering Based on Daylight's Fingerprint and Tanimoto Similarity: A Fast and Automated Way To Cluster Small and Large Data Sets is …
instance of (P31): scholarly article (Q13442814)

External links:
DBLP publication ID (P8978): journals/jcisd/Butina99
DOI (P356): 10.1021/CI9803381
ResearchGate publication ID (P5875): 220522161

author name string (P2093): Darko Butina
cites work (P2860): Features of similarity (Q56454678)
issue (P433): 4
main subject (P921): automation (Q184199)
page(s) (P304): 747-750
publication date (P577): 1999-07-01
published in (P1433): Journal of Chemical Information and Computer Sciences (Q104614957)
title (P1476): Unsupervised Data Base Clustering Based on Daylight's Fingerprint and Tanimoto Similarity: A Fast and Automated Way To Cluster Small and Large Data Sets
volume (P478): 39

Reverse relations

cites work (P2860)
Q27660498 A Small-Molecule Inhibitor of BCL6 Kills DLBCL Cells In Vitro and In Vivo
Q45961615 A binary ant colony optimization classifier for molecular activities.
Q35114229 A chemo-centric view of human health and disease
Q41326213 A de novo substructure generation algorithm for identifying the privileged chemical fragments of liver X receptorβ agonists
Q30825633 A fast clustering algorithm for analyzing highly similar compounds of very large libraries
Q104699083 A machine learning workflow for molecular analysis: application to melting points
Q92137473 A nonparametric weighted feature extraction-based method for c-Jun N-terminal kinase-3 inhibitor prediction
Q115454890 A revised range of variability approach for the comprehensive assessment of the alteration of flow regime
Q36172430 A robust clustering method for chemical structures
Q43961336 ADMET rules of thumb II: A comparison of the effects of common substituents on a range of ADMET parameters
Q90093706 All-Assay-Max2 pQSAR: Activity Predictions as Accurate as Four-Concentration IC50s for 8558 Novartis Assays
Q57149268 An Analysis of Different Components of a High-Throughput Screening Library
Q130707146 An Interpreted Atlas of Biosynthetic Gene Clusters from 1000 Fungal Genomes
Q36083038 An NMR-Guided Screening Method for Selective Fragment Docking and Synthesis of a Warhead Inhibitor.
Q41472175 An evaluation of in-house and off-the-shelf in silico models: implications on guidance for mutagenicity assessment
Q129257610 An interpreted atlas of biosynthetic gene clusters from 1,000 fungal genomes
Q62494824 Artificial intelligence in drug discovery
Q27902283 Atom-Atom-Path similarity and Sphere Exclusion clustering: tools for prioritizing fragment hits
Q114969995 AutoGrow4: an open-source genetic algorithm for de novo drug design and lead optimization
Q35176797 Automated Selection of Compounds with Physicochemical Properties To Maximize Bioavailability and Druglikeness
Q94567295 Automated identification of chemical series: Classifying like a medicinal chemist
Q34289511 Automated recycling of chemistry for virtual screening and library design
Q45964726 Automatic QSAR modeling of ADME properties: blood-brain barrier penetration and aqueous solubility.
Q130705682 Bacterial cytochrome P450s: a bioinformatics odyssey of substrate discovery
Q108590228 BonMOLière: Small-Sized Libraries of Readily Purchasable Compounds, Optimized to Produce Genuine Hits in Biological Screens across the Protein Space
Q38518820 Breaking free from chemical spreadsheets
Q28660148 CASMI: And the Winner is . .
Q30914165 Calculating similarities between biological activities in the MDL Drug Data Report database
Q35762145 Characterization of ATP-independent ERK inhibitors identified through in silico analysis of the active ERK2 structure
Q34743597 Chemical database techniques in drug discovery
Q53017688 Chemical fragment spaces for de novo design
Q116852695 Closed-loop optimization of general reaction conditions for heteroaryl Suzuki-Miyaura coupling
Q57692973 Clustering Chemical Databases Using Adaptable Projection Cells and MCS Similarity Values
Q35881602 Common features of antibacterial compounds: an analysis of 104 compounds library
Q97564332 Comparative Assessment of Protein Kinase Inhibitors in Public Databases and in PKIDB
Q30362572 Comparison of topological, shape, and docking methods in virtual screening.
Q36113973 Compound Prioritization in Single-Concentration Screening Data Using Ligand Efficiency Indexes.
Q37138411 Computational chemistry approaches to drug discovery in signal transduction.
Q47133184 Computer-Assisted Retrosynthesis Based on Molecular Similarity.
Q127323011 Concurrent Optimization of Organic Donor–Acceptor Pairs through Machine Learning
Q62669365 Conformator: A Novel Method for the Generation of Conformer Ensembles
Q34440071 Consensus Induced Fit Docking (cIFD): methodology, validation, and application to the discovery of novel Crm1 inhibitors
Q41968231 Counting clusters using R-NN curves
Q45943968 Coupling Matched Molecular Pairs with Machine Learning for Virtual Compound Optimization.
Q91950850 DStruBTarget: Integrating Binding Affinity with Structure Similarity for Ligand-Binding Protein Prediction
Q51972706 Database clustering with a combination of fingerprint and maximum common substructure methods
Q90331243 De Novo Molecule Design by Translating from Reduced Graphs to SMILES
Q113308913 De novo protein fold families expand the designable ligand binding site space
Q33496226 Design and NMR-based screening of LEF, a library of chemical fragments with different local environment of fluorine
Q90633197 Design and Selection of Novel C1s Inhibitors by In Silico and In Vitro Approaches
Q33417723 Design of compound libraries for fragment screening
Q62741700 Designing in the Face of Uncertainty: Exploiting Electronic Structure and Machine Learning Models for Discovery in Inorganic Chemistry
Q92651028 Determination of Absolute Stereochemistry of Flexible Molecules Using a Vibrational Circular Dichroism Spectra Alignment Algorithm
Q95657601 Determining the Regio- and Relative Stereochemistry of Small and Drug-like Molecules Using an Alignment Algorithm for Infrared Spectra
Q42011227 Development and validation of an improved algorithm for overlaying flexible molecules
Q34455749 Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity
Q96027652 Direct Comparison of Total Clearance Prediction: Computational Machine Learning Model versus Bottom-up Approach Using In Vitro Assay
Q42680012 Discovery by organism based high-throughput screening of new multi-stage compounds affecting Schistosoma mansoni viability, egg formation and production.
Q40335742 Discovery of Cdc25A Lead Inhibitors with a Novel Chemotype by Virtual Screening: Application of Pharmacophore Modeling Based on a Training Set with a Limited Number of Unique Components
Q98392626 Discovery of Novel Inhibitors of a Critical Brain Enzyme Using a Homology Model and a Deep Convolutional Neural Network
Q36960711 Discovery of drug-like inhibitors of an essential RNA-editing ligase in Trypanosoma brucei
Q28220975 Do structurally similar molecules have similar biological activity?
Q33687130 Docking Small Molecules to Predicted Off-Targets of the Cancer Drug Erlotinib Leads to Inhibitors of Lung Cancer Cell Proliferation with Suitable In vitro Pharmacokinetic Properties
Q43512278 Docking molecules by families to increase the diversity of hits in database screens: computational strategy and experimental evaluation
Q33738554 EDULISS: a small-molecule database with data-mining and pharmacophore searching capabilities
Q41980932 Empirical regioselectivity models for human cytochromes P450 3A4, 2D6, and 2C9.
Q62491667 Enhancing Retrosynthetic Reaction Prediction with Deep Learning Using Multiscale Reaction Classification
Q42183610 Enhancing the rate of scaffold discovery with diversity-oriented prioritization
Q45964506 Evolving interpretable structure-activity relationship models. 2. Using multiobjective optimization to derive multiple models.
Q95652189 FADB-China: A molecular-level food adulteration database in China based on molecular fingerprints and similarity algorithms prediction expansion
Q93027533 Fantastic Liquids and Where To Find Them: Optimizations of Discrete Chemical Space
Q33225159 Focused library design in GPCR projects on the example of 5-HT(2c) agonists: comparison of structure-based virtual screening with ligand-based search methods
Q50907163 Fragment virtual screening based on Bayesian categorization for discovering novel VEGFR-2 scaffolds
Q64986036 Fragment-Based Ligand-Protein Contact Statistics: Application to Docking Simulations.
Q50891672 Fragment-based similarity searching with infinite color space
Q78364650 Fragmental approach in QSPR
Q34544301 Fueling open-source drug discovery: 177 small-molecule leads against tuberculosis
Q112722927 GPCR_LigandClassify.py; a rigorous machine learning classifier for GPCR targeting compounds
Q42053492 GPU Accelerated Chemical Similarity Calculation for Compound Library Comparison
Q28821466 GPU-accelerated Chemical Similarity Assessment for Large Scale Databases
Q48218377 Gaussian processes for classification: QSAR modeling of ADMET and target activity
Q45965307 Gaussian processes: a method for automatic QSAR modeling of ADME properties.
Q50772718 Generation of a focused set of GSK compounds biased toward ligand-gated ion-channel ligands
Q39231765 Global quantitative structure-activity relationship models vs selected local models as predictors of off-target activities for project compounds
Q91802550 High Impact: The Role of Promiscuous Binding Sites in Polypharmacology
Q38958067 High-Quality Dataset of Protein-Bound Ligand Conformations and Its Application to Benchmarking Conformer Ensemble Generators
Q47172785 Hit Dexter: A Machine-learning Model for the Prediction of Frequent Hitters.
Q60312595 Identification and binding mode of a novel Leishmania Trypanothione reductase inhibitor from high throughput screening
Q40509580 Identification and characterization of small molecule inhibitors of the calcium-dependent S100B-p53 tumor suppressor interaction.
Q28551394 Identification of New Molecular Entities (NMEs) as Potential Leads against Tuberculosis from Open Source Compound Repository
Q106811884 Identification of Plasmodium falciparum heat shock 90 inhibitors via molecular docking
Q40401482 Identification of novel extracellular signal-regulated kinase docking domain inhibitors.
Q34081138 Identification of novel nonsteroidal compounds as substrates or inhibitors of hASBT.
Q92828148 Identification of potential Zika virus NS2B-NS3 protease inhibitors via docking, molecular dynamics and consensus scoring-based virtual screening
Q37104928 Identification of small molecular weight inhibitors of Src homology 2 domain-containing tyrosine phosphatase 2 (SHP-2) via in silico database screening combined with experimental assay
Q61451256 Identifying Protein Features Responsible for Improved Drug Repurposing Accuracies Using the CANDO Platform: Implications for Drug Design
Q52676386 Identifying inhibitors of the Leishmania inositol phosphorylceramide synthase with antiprotozoal activity using a yeast-based assay and ultra-high throughput screening platform
Q21284354 In silico fragmentation for computer assisted identification of metabolite mass spectra
Q115741795 In silico prediction of chronic toxicity with chemical category approaches
Q56138208 Incorporating sequential information into traditional classification models by using an element/position-sensitive SAM
Q45943711 Insights into the Molecular Basis of the Acute Contact Toxicity of Diverse Organic Chemicals in the Honey Bee.
Q101404642 KinFragLib: Exploring the Kinase Inhibitor Space Using Subpocket-Focused Fragmentation and Recombination
Q91535213 Learning Retrosynthetic Planning through Simulated Experience
Q62741713 Leveraging Cheminformatics Strategies for Inorganic Discovery: Application to Redox Potential Design
Q60935938 Ligand-Based Pharmacophore Modeling Using Novel 3D Pharmacophore Signatures
Q97536771 Ligand-Profile Based Virtual Screening of Human GPCRs
Q102326889 Machine learning dihydrogen activation in the chemical space surrounding Vaska's complex
Q104133078 Memory-assisted reinforcement learning for diverse molecular de novo design
Q34620986 MetFusion: integration of compound identification strategies
Q35685388 Mixed Inhibition of Adenosine Deaminase Activity by 1,3-Dinitrobenzene: A Model for Understanding Cell-Selective Neurotoxicity in Chemically-Induced Energy Deprivation Syndromes in Brain
Q34260618 Modern Phenotypic Drug Discovery Is a Viable, Neoclassic Pharma Strategy
Q58621816 Molecular Interactions in Crystal Structures with Z′ > 1
Q42766915 Molecular de-novo design through deep reinforcement learning
Q82321086 Molecular transformations as a way of finding and exploiting consistent local QSAR
Q27728469 Mycobacterium tuberculosis Malate Synthase Structures with Fragments Reveal a Portal for Substrate/Product Exchange
Q27683610 Natural-product-derived fragments for fragment-based ligand discovery
Q97638912 Network Analysis for Prioritizing Biodegradation Metabolites of Polycyclic Aromatic Hydrocarbons
Q34514081 Novel G-quadruplex stabilizing agents: in-silico approach and dynamics
Q42380665 Novel Noncatalytic Substrate-Selective p38α-Specific MAPK Inhibitors with Endothelial-Stabilizing and Anti-Inflammatory Activity.
Q61801002 OptiPharm: An evolutionary algorithm to compare shape similarity
Q59606384 PKRank: a novel learning-to-rank method for ligand-based virtual screening using pairwise kernel and RankSVM
Q30455359 Paradigm shift in toxicity testing and modeling
Q34095175 Physicochemical profile of macrolides and their comparison with small molecules.
Q51095881 Planning chemical syntheses with deep neural networks and symbolic AI.
Q92377701 Predicting Molecular Energy Using Force-Field Optimized Geometries and Atomic Vector Representations Learned from an Improved Deep Tensor Neural Network
Q95279090 Prediction and Optimization of NaV1.7 Sodium Channel Inhibitors Based on Machine Learning and Simulated Annealing
Q96616413 Prediction of 5-hydroxytryptamine transporter inhibitors based on machine learning
Q113204816 Prediction of Compound Synthesis Accessibility Based on Reaction Knowledge Graph
Q108591195 Prediction of Drug-Induced Liver Toxicity Using SVM and Optimal Descriptor Sets
Q92869093 Prediction of Ligands Binding Acetylcholinesterase with Potential Antidotal Activity: A Virtual Screening Approach
Q90919827 Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines
Q27902296 PubChem structure–activity relationship (SAR) clusters
Q92993644 Pyrene-Modified DNA Aptamers with High Affinity to Wild-Type EGFR and EGFRvIII
Q91853862 Ranking Molecules with Vanishing Kernels and a Single Parameter: Active Applicability Domain Included
Q46946319 Rapid Method Development in Hydrophilic Interaction Liquid Chromatography for Pharmaceutical Analysis Using a Combination of Quantitative Structure-Retention Relationships and Design of Experiments.
Q44001483 Rapid Shape-Based Ligand Alignment and Virtual Screening Method Based on Atom/Feature-Pair Similarities and Volume Overlap Scoring
Q33223870 Reagent Selector: using Synthon Analysis to visualize reagent properties and assist in combinatorial library design
Q57693009 Representation of the Molecular Topology of Cyclical Structures by Means of Cycle Graphs. 2. Application to Clustering of Chemical Databases
Q51922027 Representing clusters using a maximum common edge substructure algorithm applied to reduced graphs and molecular graphs
Q51693932 SCISSORS: a linear-algebraical technique to rapidly approximate chemical similarities
Q40398283 SIML: a fast SIMD algorithm for calculating LINGO chemical similarities on GPUs and CPUs
Q122963839 SMILES-based deep generative scaffold decorator for de-novo drug design
Q92269008 Sampling and refinement protocols for template-based macrocycle docking: 2018 D3R Grand Challenge 4
Q112574166 ScaffComb: A Phenotype-Based Framework for Drug Combination Virtual Screening in Large-Scale Chemical Datasets
Q45915393 Scaffold hopping in drug discovery using inductive logic programming.
Q58581975 Screening Library Design
Q101212045 Semi-supervised Hierarchical Drug Embedding in Hyperbolic Space
Q108527465 Similarity methods in chemoinformatics
Q81035487 Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR
Q35683088 Small-molecule proteostasis regulators for protein conformational diseases
Q111154870 Splitting chemical structure data sets for federated privacy-preserving machine learning
Q50953110 Statistical tools for virtual screening
Q57692843 Structural-Similarity-Based Approaches for the Development of Clustering and QSPR/QSAR Models in Chemical Databases
Q34787536 Structure-based design of small-molecule ligands of phosphofructokinase-2 activating or inhibiting glycolysis.
Q33227352 Surrogate docking: structure-based virtual screening at high throughput speed
Q36408624 Systematic Data Mining Reveals Synergistic H3R/MCHR1 Ligands.
Q27902314 Target enhanced 2D similarity search by using explicit biological activity annotations and profiles
Q50938467 Targeting Human Poly(ADP-Ribose) Polymerase-1 with Natural Medicines and Its Potential Applications in Ovarian Cancer Therapeutics
Q27657225 Targeting NAD Biosynthesis in Bacterial Pathogens: Structure-Based Development of Inhibitors of Nicotinate Mononucleotide Adenylyltransferase NadD
Q34038504 Targeting zymogen activation to control the matriptase-prostasin proteolytic cascade.
Q64067149 TeachOpenCADD: a teaching platform for computer-aided drug design using open source packages and data
Q128208414 The ChemicalToolbox: reproducible, user-friendly cheminformatics analysis on the Galaxy platform
Q46536177 The FPS fingerprint format and chemfp toolkit.
Q40914099 The Relative Importance of Domain Applicability Metrics for Estimating Prediction Errors in QSAR Varies with Training Set Diversity
Q100296707 The anti-STAT1 polyphenol myricetin inhibits M1 microglia activation and counteracts neuronal death
Q94217163 The chemfp project
Q112564482 The development and application of in silico models for drug induced liver injury
Q31034745 The emerging importance of predictive ADME simulation in drug discovery
Q40111526 Toward a class-independent quantitative structure--activity relationship model for uncouplers of oxidative phosphorylation
Q93037525 Ultrahigh binding affinity of a hydrocarbon guest inside cucurbit[7]uril enhanced by strong host-guest charge matching
Q45281821 Variable selection and model validation of 2D and 3D molecular descriptors
Q28068082 Virtual Screening Approaches towards the Discovery of Toll-Like Receptor Modulators
Q41588963 Virtual-screening workflow tutorials and prospective results from the Teach-Discover-Treat competition 2014 against malaria
Q111150104 Virtual-screening workflow tutorials and prospective results from the Teach-Discover-Treat competition 2014 against malaria
Q40521899 WONKA: objective novel complex analysis for ensembles of protein-ligand structures.
Q52582777 When Is Ligand pKa a Good Descriptor for Catalyst Energetics? In Search of Optimal CO2 Hydration Catalysts.
Q35041769 XenoSite: accurately predicting CYP-mediated sites of metabolism with neural networks.
Q47673468 iCDI-PseFpt: identify the channel-drug interaction in cellular networking with PseAAC and molecular fingerprints
Q28661179 iEzy-drug: a web server for identifying the interaction between enzymes and drugs in cellular networking
Q28535719 iGPCR-drug: a web server for predicting interaction between GPCRs and drugs in cellular networking
