Leading Edge Predictors for Drug Discovery


Comparison of Published HIA Models

Published Continuous HIA Prediction Models:

Several studies on the prediction of continuous human intestinal absorption (HIA) values have been published.  The most prominent of these publications are compared statistically with the CSHIA predictor below.

Modeling Techniques and Descriptors Used in Published Models:

Most of the published continuous HIA models fit the experimental HIA data with multiple linear regression (MLR) or sigmoidal functions.  The exceptions are Wessel et al (1) and ChemSilico CSHIA, which applied neural network techniques.  Datasets were generally small (n < 200), except for Klopman et al (4), who developed an MLR fragment-based model from a 417 compound training set with 50 compounds (~11%) set aside as a validation set.  CSHIA employed by far the largest dataset, with 612 compounds divided into 412 for the training set and 195 (~32%) for the external validation set.
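As an illustration of what a sigmoidal fit to HIA data involves, the sketch below fits a two-parameter sigmoid to a single descriptor such as polar surface area. The functional form, the parameter names (`psa50`, `slope`), the grid-search fitting procedure, and the data are all assumptions made for illustration; they are not taken from any of the cited studies, and the data points are synthetic.

```python
import math

def sigmoid_hia(psa, psa50, slope):
    """Predicted % absorbed for one descriptor value (assumed sigmoid form)."""
    return 100.0 / (1.0 + math.exp((psa - psa50) / slope))

def fit_sigmoid(psa_values, hia_values, psa50_grid, slope_grid):
    """Least-squares fit by brute-force search over a small parameter grid."""
    best = None
    for psa50 in psa50_grid:
        for slope in slope_grid:
            sse = sum((sigmoid_hia(p, psa50, slope) - h) ** 2
                      for p, h in zip(psa_values, hia_values))
            if best is None or sse < best[0]:
                best = (sse, psa50, slope)
    return best[1], best[2]

# Synthetic, noise-free data generated from hypothetical parameters
# (not experimental values and not parameters from any published model).
TRUE_PSA50, TRUE_SLOPE = 100.0, 15.0
psa = [20.0, 50.0, 80.0, 100.0, 120.0, 150.0, 180.0]
hia = [sigmoid_hia(p, TRUE_PSA50, TRUE_SLOPE) for p in psa]

psa50_fit, slope_fit = fit_sigmoid(psa, hia, range(60, 141, 5), range(5, 31, 5))
print(psa50_fit, slope_fit)
```

Because the synthetic data contain no noise and the generating parameters lie on the search grid, the fit recovers them exactly. With real, noisy HIA measurements a proper nonlinear least-squares routine would be the more usual choice.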

The number and type of descriptors varied significantly among the models.  Most of the models, including CSHIA, employ explicit H-bonding descriptors and polar or charged surface area descriptors.  The exception is Klopman, where only fragments were employed, although that study did represent H-bond acceptors and donors within its fragments.

Wessel (1)

6 descriptors/81 compounds: 1 topological, 3 charged-partial surface area (CPSA) hydrogen bonding, and 2 geometric descriptors.

• Artificial Neural Networks to fit the data.

Zhao (2)

5 descriptors/169 compounds: All general solvation parameters (excess molar refraction, dipolarity/polarizability, H-bond acidity and basicity, and molecular volume).

• MLR used to fit the data.

Derety (3)

3 descriptors/124 compounds: LogP and count of hydrogen bond donors and acceptors.

• Sigmoidal fit to the data.

Klopman (4)

37 descriptors/437 compounds: All descriptors fragment based.

• MLR used to fit the data.

Palm (5)

1 descriptor/20 compounds: Polar surface area (PSA) on a limited dataset.

• Sigmoidal fit to the data.


ChemSilico CSHIA

29 descriptors/617 compounds: 14 E-states from hydrogen bonding centers, CSLogP, hydrogen bonding surface area, 11 molecular connectivity indices, and 1 charged center E-state.

• Artificial Neural Networks to fit the data.

    (1)  Wessel et al (1998), J. Chem. Inf. Comput. Sci. 38, 726-735

    (2)  Zhao et al (2001), J. Pharm. Sci. 90, 749-784

    (3)  Derety et al (2002), Quant. Struct.-Act. Relat. 21, 493-506

    (4)  Klopman et al (2002), Eur. J. Pharm. Sci. 17, 253

    (5)  Palm et al (1997), Pharm. Res. 14, 568-571

Statistical Comparison of HIA models:

With the exception of ChemSilico (see Experimental section), the published datasets were heavily skewed, with few compounds below 5% and the majority (~90% of the compounds) above 80% in their reported HIA values.  Such skew substantially reduces the number of non-fragmental descriptors needed to model HIA, as was evident in the Derety, Zhao and Palm datasets.
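The skew described above can be quantified by counting the share of compounds at each end of the HIA range. The following is a minimal sketch; the function name and the example values are hypothetical and do not come from any of the published datasets.

```python
def hia_distribution(hia_values, low=5.0, high=80.0):
    """Return the fractions of compounds with HIA below `low` and above `high`."""
    n = len(hia_values)
    below = sum(1 for v in hia_values if v < low) / n
    above = sum(1 for v in hia_values if v > high) / n
    return below, above

# Hypothetical % absorbed values, for illustration only.
example_hia = [2.0, 30.0, 85.0, 90.0, 95.0, 99.0, 100.0, 88.0, 92.0, 97.0]
below_5, above_80 = hia_distribution(example_hia)
print(below_5, above_80)
```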

HIA (% oral absorption) measurements have a substantial margin of error associated with them.  This is reflected in the RMSEs (root mean square errors) given in Table 3.  All models gave reasonable RMSE values with respect to their validation datasets.  Even when drugs that are dose-limited or undergo carrier/facilitated transport are excluded, drug formulation and poor aqueous solubility can still affect predictive results for HIA.  These two factors, solubility and formulation, are reflected in the reported RMSE values for ChemSilico.  Of the 195 compounds in its external validation dataset, 9% had an AE (absolute error) > 25%, and six of these compounds had very poor aqueous solubility.  Nonetheless, the ChemSilico results were superior to those of all other models in both the scope of compounds covered and the model's resultant RMSE values.
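For reference, the two statistics used in this comparison, RMSE and the share of compounds with an absolute error above 25%, can be computed as in the sketch below. The function names and the observed and predicted values are hypothetical and are not drawn from the ChemSilico validation set or from Table 3.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted HIA values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def fraction_large_errors(observed, predicted, threshold=25.0):
    """Fraction of compounds whose absolute error (AE) exceeds `threshold`."""
    n_large = sum(1 for o, p in zip(observed, predicted) if abs(o - p) > threshold)
    return n_large / len(observed)

# Hypothetical % absorbed values, for illustration only.
observed = [95.0, 80.0, 10.0, 50.0]
predicted = [90.0, 85.0, 40.0, 55.0]

print(rmse(observed, predicted))
print(fraction_large_errors(observed, predicted))
```

A single large miss, such as the third compound here, dominates the RMSE, which is why a handful of poorly soluble compounds can noticeably raise a model's reported error.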


Copyright © 2003 ChemSilico LLC All Rights Reserved


ChemSilico is a registered trademark of ChemSilico LLC, Tewksbury, MA 01876