Leading Edge Predictors for Drug Discovery


Data and Statistics of the CSLogP Predictor

Experimental Data Used for CSLogP

All calculated LogP values were generated by CSLogP from the 2D structure of each compound.

Observed LogP values and the corresponding 2D structures, for neutral compounds only, were taken from several sources (2,3,4) and compared with the LogP values calculated by the CSLogP predictor.

(2). Kow, Sangster Research Laboratories, Montreal, Quebec, Canada.

(3). CLogP, Biobyte StarList.

(4). PhysProp, Syracuse Research Corp., Syracuse, NY.

Development of the CSLogP Predictor

The CSLogP predictor is based on topological structure descriptors and was developed using artificial neural networks.  Neural network analysis was applied to select descriptors and then to optimize the relationship between experimental Log P values and those calculated by the CSLogP predictor.  The resulting predictor was cross-validated by a 10-fold leave-10%-out method.

(Please see "Neural Network Analysis" on our Methods page for additional information)
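The leave-10%-out procedure described above can be illustrated with a minimal sketch. It is not the CSLogP implementation: a one-descriptor least-squares line stands in for the neural network, descriptor selection is not modeled, and the names fit_line and cross_validated_predictions are hypothetical. The point it shows is that each compound is predicted by a model fitted without that compound's fold.

```python
import random


def fit_line(xs, ys):
    """Least-squares line y = a*x + b (assumed stand-in for the neural network)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_predictions(xs, ys, n_folds=10, seed=0):
    """Predict every compound from a model trained without its own fold."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds groups of roughly 10% each.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    predictions = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Held-out compounds played no part in fitting this model.
        for i in held_out:
            predictions[i] = slope * xs[i] + intercept
    return predictions
```

Comparing the returned predictions with the experimental values gives the cross-validated statistics; comparing values calculated by a model fitted to all compounds gives the training-set statistics.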

In each phase of development, a correlation coefficient was calculated as a measure of the quality of the predictor.

R2 gives the correlation between calculated and experimental values for the compounds in the training set.  Every compound in the training set contributed to descriptor selection and predictor development.

Q2 gives the correlation coefficient between experimental values and predicted values from a 10-fold leave-10%-out cross validation.  The compounds that generate Q2 contribute to descriptor selection, but the predicted values arise from calculations made when the compounds were not part of the training set for predictor development.

Two additional statistics are given as a measure of predictor performance:
MAE gives the mean absolute error.
s gives the standard deviation for regression.

(Please see "Data Handling and Statistics" on our Methods page for additional information)
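As a rough illustration of the statistics named above, the sketch below computes them for paired lists of experimental and calculated values. Two points are assumptions, not taken from this page: the correlation statistic is computed as the squared Pearson correlation, and s is computed as the residual standard deviation with n - 2 degrees of freedom. The function names are hypothetical, and the exact definitions used by CSLogP may differ.

```python
import math


def squared_correlation(observed, calculated):
    """Squared Pearson correlation (assumed definition of R2 and Q2)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c)
              for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    return cov * cov / (var_o * var_c)


def mean_absolute_error(observed, calculated):
    """MAE: average size of the error, ignoring its sign."""
    return sum(abs(o - c) for o, c in zip(observed, calculated)) / len(observed)


def regression_std(observed, calculated, n_params=2):
    """s: residual standard deviation; n_params=2 is an assumption."""
    sse = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    return math.sqrt(sse / (len(observed) - n_params))
```

Passing values calculated for training-set compounds yields R2 and its companion MAE and s; passing cross-validated predictions to the same functions yields Q2 and the corresponding error statistics.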

Calculation of Log P property values

A correlation of the CSLogP (calculated) values with the known (experimental) Log P values gave the following statistics:

R2 = 0.89

MAE = 0.43 (mean absolute error)

s = 0.43

The results are shown in the plot below.

Calculated Results on 16893 Heterogeneous Compounds

Prediction of Log P property values

A correlation of CSLogP predictions with known (experimental) Log P values gave the following statistics:

Q2 = 0.87

MAE = 0.46 (mean absolute error)

s = 0.46

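The above Q2 value should be considered excellent: it is close to the training-set R2 of 0.89, so little accuracy is lost on compounds held out of training.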

The results are shown in the plot below.

Predicted Results on 16893 Heterogeneous Compounds


To contact us:

Phone: 978-501-0633

Fax: 781-275-5197

Email:  sales@chemsilico.com

Copyright © 2003 ChemSilico LLC All Rights Reserved

Terms and Conditions of Use | Privacy Policy

ChemSilico is a registered trademark of ChemSilico LLC, Tewksbury, MA 01876