ChemDB / Supplements: Articles, Data and Figures Relating to the System
Supplementary Materials
Download
Request page to download all of the chemical isomers in the database.
Implementation
System implementation materials such as the database schema with data definition and source / vendor information table.
Analysis
Data analysis tables and charts based upon ChemDB contents.
Articles and Presentations
Speeding Up Chemical Database Searches Using the Inverted Index: the Convergence of Cheminformatics and Text Search Methods
R. J. Nasr, R. Vernica, C. Li, and P. Baldi. Speeding Up Chemical Database Searches Using the Inverted Index: the Convergence of Cheminformatics and Text Search Methods. Journal of Chemical Information and Modeling, in submission, (2011).
Learning to Predict Chemical Reactions
M. A. Kayala, C. A. Azencott, J. Chen, and P. Baldi. Learning to Predict Chemical Reactions. Journal of Chemical Information and Modeling, 51, 9, 2209-2222, (2011).
Machine Learning Data (300M tar.gz)
When is Chemical Similarity Significant? The Statistical Distribution of Chemical Similarity Scores and Its Extreme Values
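P. Baldi and R. Nasr. When is Chemical Similarity Significant? The Statistical Distribution of Chemical Similarity Scores and Its Extreme Values. Journal of Chemical Information and Modeling, submitted, (2010).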
Supplementary Data
No Electron Left-Behind: a Rule-Based Expert System to Predict Chemical Reactions and Reaction Mechanisms
J. Chen and P. Baldi. No Electron Left-Behind: a Rule-Based Expert System to Predict Chemical Reactions and Reaction Mechanisms. Journal of Chemical Information and Modeling, 49, 9, 2034-2043, (2009).
An Intersection Inequality Sharper than the Tanimoto Triangle Inequality for Efficiently Searching Large Databases
P. Baldi and D. Hirschberg. An Intersection Inequality Sharper than the Tanimoto Triangle Inequality for Efficiently Searching Large Databases. Journal of Chemical Information and Modeling, in press, (2009).
Large Scale Study of Multiple-molecule Queries
R. Nasr*, S. Joshua Swamidass*, and P. Baldi. Large scale study of multiple-molecule queries. Journal of Cheminformatics 1, 7, (2009).
Supplementary Data
Speeding Up Chemical Database Searches Using a Proximity Filter Based on the Logical Exclusive-OR
P. Baldi, D. S. Hirschberg and R. J. Nasr. Speeding Up Chemical Database Searches Using a Proximity Filter Based on the Logical Exclusive-OR. Journal of Chemical Information and Modeling 48 (7), 1367-1378, (2008).
100K random chemical data set
Discovery of Power-Laws in Chemical Space
R. W. Benz, S. Joshua Swamidass, and P. Baldi. Discovery of Power-Laws in Chemical Space. Journal of Chemical Information and Modeling, 48 (6), 1138-1151, (2008).
Synthesis Explorer: A Chemical Reaction Tutorial System for Organic Synthesis Design and Mechanism Prediction
J. H. Chen and P. Baldi. Synthesis Explorer: A Chemical Reaction Tutorial System for Organic Synthesis Design and Mechanism Prediction. Journal of Chemical Education 2008(85):1699, (2008).
Lossless Compression of Chemical Fingerprints Using Integer Entropy Codes Improves Storage and Retrieval
P. Baldi, R. W. Benz, D. S. Hirschberg, and S. Joshua Swamidass. Lossless Compression of Chemical Fingerprints Using Integer Entropy Codes Improves Storage and Retrieval. Journal of Chemical Information and Modeling, 47, 6, 2098-2109, (2007).
A Mathematical Correction for Fingerprint Similarity Measures to Improve Chemical Retrieval
S. Joshua Swamidass and P. Baldi. A Mathematical Correction for Fingerprint Similarity Measures to Improve Chemical Retrieval. Journal of Chemical Information and Modeling, 47, 3, 952-964, (2007).
Bounds and Algorithms for Exact Searches of Chemical Fingerprints in Linear and Sub-Linear Time
S. Joshua Swamidass and P. Baldi. Bounds and Algorithms for Exact Searches of Chemical Fingerprints in Linear and Sub-Linear Time. Journal of Chemical Information and Modeling, 47, 2, 302-317, (2007).
Virtual High-Throughput Screening with Two-Dimensional Kernels
C. A. Azencott and P. Baldi. Virtual High-Throughput Screening with Two-Dimensional Kernels. In: Hands-On Pattern Recognition. Challenges in Data Representation, Model Selection, and Performance Prediction, I. Guyon, G. Cawley, G. Dror, and A. Saffari, Editors, Lulu Press, (2007).
Effective Compression of Monotone and Quasi-Monotone Sequences of Integers
D. S. Hirschberg and P. Baldi. Effective Compression of Monotone and Quasi-Monotone Sequences of Integers. Proceedings of the 2008 Data Compression Conference (DCC 08), Snowbird, UT, IEEE Computer Society Press, (2008).
ChemDB Update - Full-Text Search and Virtual Chemical Space (Supplementary Materials)
J. H. Chen, E. Linstead, S. Joshua Swamidass, D. Wang, and P. Baldi. ChemDB Update - Full-Text Search and Virtual Chemical Space. Bioinformatics, 23: 2348-2351, (2007). Advance Access doi: 10.1093/bioinformatics/btm341.
Chemoinformatics Tutorial, ISMB 2006
P. Baldi presents a tutorial on chemoinformatics at the 2006 ISMB Conference in Fortaleza, Brazil.
One-to-Four-Dimensional Kernels for Small Molecules and Predictive Regression of Physical, Chemical, and Biological Properties (Supplementary Materials)
C. Azencott, A. Ksikes, S. Joshua Swamidass, J. H. Chen, L. Ralaivola, and P. Baldi. One-to-Four-Dimensional Kernels for Small Molecules and Predictive Regression of Physical, Chemical, and Biological Properties. Journal of Chemical Information and Modeling, 47 (3): 965-974, (2007).
ChemDB: A Public Database of Small Molecules and Related Chemoinformatics Resources
J. Chen*, S. Joshua Swamidass*, Y. Dou, J. Bruand, P. Baldi. ChemDB: A Public Database of Small Molecules and Related Chemoinformatics Resources. Bioinformatics, 21 (22): 4133-4139, (2005).
Kernels for Small Molecules and the Prediction of Mutagenicity, Toxicity, and Anti-Cancer Activity
S. Joshua Swamidass*, J. Chen*, P. Phung, J. Bruand, L. Ralaivola, and P. Baldi. Kernels for Small Molecules and the Prediction of Mutagenicity, Toxicity, and Anti-Cancer Activity. Proceedings of the 2005 Conference on Intelligent Systems for Molecular Biology, ISMB 05. Bioinformatics, 21 (Supplement 1), i359-368, (2005).
Graph Kernels for Chemical Informatics
L. Ralaivola, S. Joshua Swamidass, H. Saigo, and P. Baldi. Graph Kernels for Chemical Informatics. Neural Networks, special issue on Neural Networks and Kernel Methods for Structured Domains, 18 (8): 1093-1110, (2005).
* These authors contributed equally
Disclaimer: Downloadable articles above may be slightly different from published versions.