To build a good QSAR model, a minimal set of information-rich descriptors is required. The large number of possible indices creates several problems for the modeler [57, 58].

1. Many descriptors do not contain molecular information relevant to the problem.
2. Many descriptors are linearly dependent (contain essentially the same information).
3. Use of poor descriptors in QSAR yields poor and misleading models.
4. Including too many descriptors in the model, even if they contain relevant information, can result in overfitting and a loss of the model's ability to generalize to unseen molecules.
5. Many methods of screening this large pool of potential descriptors for relevant ones can lead to chance correlations (correlations that arise by chance because so many descriptors have been tried in models). In other words, if many random numbers are generated as potential descriptors (which clearly do not contain any useful molecular information), and various subsets of these are used to build models, apparently significant models can arise by chance.
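The chance-correlation effect described in point 5 can be illustrated with a small simulation. The following is a minimal sketch, not taken from the cited studies: the function names, the sample size of 15 molecules, and the pool of 500 descriptors are all illustrative assumptions. Both the "activity" and the "descriptors" are pure random numbers, so any correlation found is spurious by construction.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def best_chance_correlation(n_molecules=15, n_descriptors=500, seed=0):
    """Screen random 'descriptors' against a random 'activity'.

    Returns the largest |r| found. Neither the activity nor the
    descriptors carry any molecular information (assumed toy setup).
    """
    rng = random.Random(seed)
    activity = [rng.gauss(0, 1) for _ in range(n_molecules)]
    best = 0.0
    for _ in range(n_descriptors):
        descriptor = [rng.gauss(0, 1) for _ in range(n_molecules)]
        best = max(best, abs(pearson_r(descriptor, activity)))
    return best


best_r = best_chance_correlation()
print(f"best |r| among 500 random descriptors: {best_r:.2f}")
```

With few molecules and many candidate descriptors, the best single-descriptor correlation is usually far from zero even though no descriptor is meaningful, which is why screened models need validation on data not used in the selection.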

The earliest method of variable selection was stepwise regression. It was integrated with the model-building process and involved the stepwise addition (or backward elimination) of descriptors according to a statistical test, in order to find the best model. Another widely used variable reduction method is principal component analysis (PCA). This involves creating a smaller set of new orthogonal descriptors from linear combinations of the original descriptors and using these to generate QSAR models.
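Both ideas can be sketched in a few lines of Python with NumPy. This is an illustrative sketch under stated assumptions, not the procedure of any particular QSAR package: the data are synthetic, the function names are hypothetical, and a simple R^2-gain threshold stands in for the formal statistical test (such as a partial F-test) used in classical stepwise regression.

```python
import numpy as np


def forward_selection(X, y, max_terms=3, min_gain=0.01):
    """Greedy forward selection: add the descriptor that most raises R^2.

    Stops when the gain in R^2 falls below min_gain (an assumed,
    simplified stand-in for a statistical significance test).
    """
    n_samples, n_desc = X.shape
    selected, best_r2 = [], 0.0
    ss_tot = np.sum((y - y.mean()) ** 2)
    while len(selected) < max_terms:
        candidates = []
        for j in range(n_desc):
            if j in selected:
                continue
            # Least-squares fit with an intercept plus the trial descriptors.
            A = np.column_stack([np.ones(n_samples), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            ss_res = np.sum((y - A @ coef) ** 2)
            candidates.append((1.0 - ss_res / ss_tot, j))
        if not candidates:
            break
        r2, j = max(candidates)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        best_r2 = r2
    return selected, best_r2


def pca_scores(X, n_components=2):
    """Project autoscaled descriptors onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are orthonormal loadings, i.e. linear-combination weights.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


# Synthetic data: 40 "molecules", 6 descriptors; only columns 1 and 4 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.1 * rng.normal(size=40)

selected, r2 = forward_selection(X, y)
scores = pca_scores(X)
print("selected descriptor columns:", selected)
print("PCA score matrix shape:", scores.shape)
```

The PCA scores are mutually orthogonal by construction, so they can be used as regression variables without the collinearity problems of the original descriptors, at the cost of being harder to interpret chemically.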

The 2D- Versus 3D-QSAR Approach
Studies show that the assumed superiority of 3D approaches over 2D approaches in drug design may not always hold. For example, the results of conventional CoMFA may often not be reproducible, because the quality of the output depends on the orientation of the rigidly aligned molecules on the user's terminal [59, 60]. Such alignment problems are typical of 3D approaches, and even though some solutions have been proposed, the unambiguous 3D alignment of structurally diverse molecules remains a difficult task. Moreover, the distinction between 2D- and 3D-QSAR approaches is not a crisp one, especially when alignment-independent descriptors are considered. This can be seen by comparing the BCUT and WHIM descriptors. Both employ a similar algebraic method, i.e., solving an eigenvalue problem for a matrix describing the compound: the connectivity matrix in the case of BCUT descriptors and the covariance matrix of the 3D coordinates in the case of WHIM. There is also a deeper connection between 3D-QSAR and one of the 2D methods, the topological approach. It stems from the fact that the geometry of a compound often depends on its topology. An elegant example was provided by Estrada et al., who demonstrated that the dihedral angles of biphenyl as a function of the substituents attached to it can be predicted by topological indices [61]. Along the same lines, chirality, supposedly a typical 3D property, has been predicted using chiral topological indices [62], constructed by introducing an adequate weight into the topological matrix for the chiral carbons.
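The algebraic parallel between BCUT and WHIM can be made concrete with a toy example. The sketch below is a deliberate simplification and does not reproduce the published definitions of either descriptor family: the four-atom chain, its coordinates, the unit off-diagonal entries, and the function names are all assumptions chosen for illustration. It shows only that the 2D route and the 3D route both reduce to the eigenvalues of a symmetric matrix built from the molecule.

```python
import numpy as np

# Hypothetical four-carbon chain (heavy atoms only); values are illustrative.
bonds = [(0, 1), (1, 2), (2, 3)]
coords = np.array([[0.00, 0.00, 0.0],
                   [1.27, 0.85, 0.0],
                   [2.54, 0.00, 0.0],
                   [3.81, 0.85, 0.0]])  # planar zig-zag, in angstroms
masses = np.full(4, 12.011)


def connectivity_eigenvalues(bonds, diagonal):
    """2D route: eigenvalues of a symmetric connectivity matrix with an
    atomic property on the diagonal (the idea behind BCUT, simplified)."""
    M = np.diag(np.asarray(diagonal, dtype=float))
    for i, j in bonds:
        M[i, j] = M[j, i] = 1.0
    return np.linalg.eigvalsh(M)  # ascending order


def coordinate_covariance_eigenvalues(coords, weights):
    """3D route: eigenvalues of the weighted covariance matrix of the
    centred atomic coordinates (the idea behind WHIM, simplified)."""
    w = np.asarray(weights, dtype=float)
    centred = coords - np.average(coords, axis=0, weights=w)
    cov = (centred * w[:, None]).T @ centred / w.sum()
    return np.linalg.eigvalsh(cov)[::-1]  # descending order


bcut_like = connectivity_eigenvalues(bonds, masses)
whim_like = coordinate_covariance_eigenvalues(coords, masses)
print("connectivity-matrix eigenvalues:", np.round(bcut_like, 3))
print("coordinate-covariance eigenvalues:", np.round(whim_like, 3))
```

Only the input matrix differs between the two routes: one is built from the bond list alone, the other from the 3D coordinates. Because the toy molecule is planar, the smallest covariance eigenvalue is zero, reflecting the absence of any extent along the third axis.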

We have discussed traditional QSAR, now termed 2D-QSAR, and 3D-QSAR. Traditional QSAR relations were derived by reasoning about which parameters were important and had to rely on statistical correlations of structural descriptors with biological activities. There were significant limitations in using these relations to predict better drugs. Today, 3D-QSAR uses modern computational methods and technologies such as statistical correlation, machine learning, and 3D visualization, yielding a rationale for computer-assisted drug design. Software products are available that enable QSAR models to be designed and constructed in an optimal fashion.

QSAR has been applied extensively and successfully over several decades to find predictive models for the activity of bioactive agents. It has also been applied to areas related to the discovery and subsequent development of bioactive agents: distinguishing drug-like from non-drug-like molecules [63], drug resistance [64], toxicity prediction [65-70], prediction of physicochemical properties (e.g., water solubility, lipophilicity) [71], gastrointestinal absorption [72], activity of peptides [73], data mining [74], drug metabolism [75], and prediction of other pharmacokinetic and ADME properties [76, 77]. Clearly, the number of potential applications for structure-property modeling, in the most general case, is extensive and growing daily. Improved molecular descriptors, based on a better understanding of which molecular attributes are most important for the property being modeled, and increasing use of genetic and artificial intelligence methods will raise QSAR to even greater levels of usefulness than its current high level.

Despite several limitations, the contemporary drug discovery community now widely recognizes that QSAR, based on well-established principles of statistics, is an intrinsically valuable and viable medicinal chemistry tool whose application domain ranges from explaining structure-activity relationships quantitatively and retrospectively to providing synthetic guidance that leads to logical and experimentally testable hypotheses. Over the last 40 years, QSAR has evolved in terms of both descriptor generation and data analysis. It involves the mathematical and statistical analysis of SAR data, which helps to reduce the number of educated guesses in molecular modification. QSAR is thus both a scientific achievement and an economic necessity: it reduces empiricism in drug design and helps ensure that every drug synthesized and pharmacologically tested is as meaningful as possible. A basic understanding of QSAR concepts is essential for most people, across a diverse range of skills.

[1] Grover M, Singh B, Bakshi M, Singh S. Pharm. Sci. Technol. Today 2000; 3: 28.
[2] Grover M, Singh B, Bakshi M, Singh S. Pharm. Sci. Technol. Today 2000; 3: 50.
[3] Crum-Brown A, Fraser TR. Trans. R. Soc. Edinburgh 1868-9; 25: 151.
[4] Balbes LM, Mascarella SW, Boyd DB. vol. 5, pp. 337-378: In; Reviews in Computational Chemistry, Lipkowitz KB, Boyd DB, Editors, VCH Publishers, Inc. New York, 1994.
[5] Peet NP. Mod. Drug Discov 2000, March, 21.
[6] Parril AL, Reddy MR (Eds.). Rational Drug Design: Novel Methodology and Practical Applications. ACS: Washington, 1999.
[7] Devlin JP (Ed.). High Throughput Screening. Marcel Dekker: New York 2000.
[8] Kniaz D. Mod. Drug. Discov 2000; 67.
[9] Ladd B. Mod. Drug. Discov 2000, January/February, 46.
[10] Bleicher KH, Boehm HJ, Mueller K, Alanine AI. Nat. Rev. Drug Discov 2003, 2, 369-378.
[11] Gershell LJ, Atkins JH. Nat. Rev. Drug Discov 2003; 2: 321-327.
[12] Goodnow R, Guba W, Haap W. Comb. Chem. High Throughput Screen 2003; 6: 649-660.
[13] Walters WP, Stahl MT, Murcko MA. Drug Discovery Today 1998; 3: 160.
[14] Venkatesh S, Lipper RA. J. Pharm. Sci 2000; 89: 145.
[15] Hann M, Green R. Curr. Opin. Chem. Biology 1999; 3: 379.
[16] Hann M, Green R. Curr. Opin. Chem. Biology 1999; 3: 379.
[17] Van de Waterbeemd H, Carter RE, Grassy G, Kubinyi H, Martin YC, Tute MS, Willett P. Ann. Rep. Med. Chem 1998; 33: 397.
[18] Ooms F. Curr. Med. Chem 2000; 7: 141.
[19] Hansch C, Fujita T. ρ-σ-π Analysis. A method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc 1964; 86: 1616-1626.
[20] Warne MA, Nicholson JK. Quantitative structure-activity relationships (QSAR’s) in environmental research. Part II. Molecular orbital approaches to property calculation. Prog. Environ. Sci 2000; 2 (1): 31-52.
[21] Karelson M, Lobanov VS, Katritzky AR. Quantum-chemical descriptors in QSAR/QSPR studies. Chem. Rev 1996; 96 (3): 1027-1043.
[22] Carbo-Dorca R, Amat L, Besalu E. Quantum mechanical origin of QSAR: theory and applications. Theochem 2000; 504: 181-228.
[23] Andrews PR, Craik DJ, Martin JL. Functional group contributions to drug-receptor interactions. J. Med. Chem 1984; 27: 1648- 1657.
[24] Tong W, Lowis DR, Perkins R. Evaluation of quantitative structure-activity relationship methods for large-scale prediction of chemicals binding to the estrogen receptor. J. Chem. Inf. Comput. Sci 1998; 38: 669-677.
[25] Winkler DA. Holographic QSAR of benzodiazepines. Quantitative structure-activity relationship 1998; 17: 224.
[26] Labute P. A widely applicable set of molecular descriptors. J. Mol. Graph. Mod 2000, 18: 464-477.
[27] Randic M. On computation of optimum parameters for multivariate analysis of structure-property relationship. J. Comp. Chem 1991; 12 (8): 970-980.
[28] Balaban AT. A personal view topological indices for QSAR/QSPR. QSPR/QSAR Stud. Mol. Descriptors 2001; 1-30.
[29] Devillers J. New trends in QSAR modeling with topological indices. Curr. Opin. Drug Discovery Development 2000; 3 (3): 275-279.
[30] Estrada E. Novel strategies in the search of topological indices. Topol. Indices Relat. Descriptors QSAR/ QSPR 1999; 403-453.
[31] Bonchev D. Overall connectivity and topological complexity. A new tool for QSPR/QSAR. Topol. Indices Relat. Descriptors QSAR/ QSPR 1999; 361-401.
[32] Randic M.  Journal of American Chemical Society 1975; 97: 6609-6615.
[33] Hall LH, Kier LB. J. Pharm. Science 1977; 66: 642.
[34] Kier LB, Hall LH. J. Pharm. Science 1983; 72: 1170.
[35] Kier LB, Hall LH. Molecular Connectivity in Chemistry and Drug Research. Academic Press, New York/London 1976.
[36] Hall LH, Kier LB. J. Pharm. Sci. 1975; 64: 1978.
[37] Burden FR. A chemically intuitive molecular index based on Eigen values of a modified adjacency matrix. J. Chem. Inf. Comput. Science 1997; 16: 309-314.
[38] Stanton DT. Evaluation and use of BCUT descriptors in QSAR and QSPR studies. J. Chem. Inf. Comput. Science 1999; 39: 11-20.
[39] Randic M, Vracko M, Novic M. Eigen values as molecular descriptors QSPR/ QSAR Stud. Mol. Descriptors 2001; 147-211.
[40] Fassihi A, Sabet R. QSAR Study of p56lck Protein Tyrosine Kinase Inhibitory Activity of Flavonoid Derivatives Using MLR and GA-PLS. Int. J. Mol. Science 2008; 9: 1876-1892.
[41] Guner OF. Curr. Top. Med. Chem., 2002; 2: 1321-1332.
[42] Akamatsu M. Curr. Top. Med. Chem 2002; 2: 1381-1394.
[43] Cramer RD, Patterson DE, Bunce JD. J. Am. Chem. Soc 1988; 110: 5959-5967.
[44] Hopfinger AJ, Tokarski JS. Three-Dimensional Quantitative Structure-Activity Relationship Analysis, pp. 105-164: In; Practical Application of Computer-Aided Drug Design, Charifson PS, Editor, Marcel Dekker, Inc.: New York, USA, 1997.
[45] Oprea TI. 3D QSAR Modeling in Drug Design, pp. 571-616: In; Computational Medicinal Chemistry for Drug Discovery, Bultinck P, Winter HD, Langenaeker W, Tollenaere JP. Editors, Marcel Dekker, Inc.: New York, USA, 2004.
[46] Kim KH. Comparative molecular field analysis (CoMFA), 291-331: In; Molecular Similarity in Drug Design, Dean PM. Editor, Blackie Academic & Professional: Glasgow, UK, 1995.
[47] Kim KH. List of CoMFA References, 3: 316-338: In; 3D QSAR in Drug Design-Recent Advances, Kubinyi H, Folkers G, Martin YC. Editors, Kluwer Academic Publishers: New York, USA, 1998.
[48] Coats EA. The CoMFA Steroids as a Benchmark Dataset for Development of 3D QSAR Methods, 3: 199-213: In; 3D QSAR in Drug Design-Recent Advances, Kubinyi H, Folkers G, Martin YC. Editors, Kluwer Academic Publishers: New York, USA, 1998.
[49] Klebe G, Abraham U, Mietzner T. J. Med. Chem 1994; 37: 4130-4146.
[50] Silverman BD, Platt DE. J. Med. Chem 1996; 39: 2129-2140.
[51] Todeschini R, Lasagni M, Marengo E. J. Chemom 1994; 8: 263-272.
[52] Todeschini R, Gramatica P. Perspect. Drug Discov. Des 1998; 9-11: 355-380.
[53] Bravi G, Gancia E, Mascagni P, Pegna M, Todeschini R, Zaliani A. J. Comput. Aided Mol. Des 1997; 11: 79-92.
[54] Pastor M, Cruciani G, McLay I, Pickett S, Clementi S. J. Med. Chem 2000; 43: 3233-3243.
[55] Cruciani G, Crivori P, Carrupt PA, Testa B. J. Mol. Struct.: THEOCHEM 2000; 503: 17-30.
[56] Crivori P, Cruciani G, Carrupt PA, Testa B. J. Med. Chem 2000; 43: 2204-2216.
[57] Topliss JG, Edwards RP. Chance factors in studies of quantitative structure-activity relationships. J. Med. Chem 1979; 22 (10): 1238-1244.
[58] Manallack DT, Livingstone DJ. Artificial neural networks: Application and chance effects for QSAR data analysis. Med. Chem. Res 1992; 2: 181-190.
[59] Cho SJ, Tropsha A. J. Med. Chem 1995; 38: 1060-1066.
[60] Cho SJ, Tropsha A, Suffness M, Cheng YC, Lee KH. J. Med. Chem 1996; 39: 1383-1395.
[61] Estrada E, Molina E, Perdomo-Lopez J. J. Chem. Inf. Comput. Sci 2001; 41: 1015-1021.
[62] De Julian-Ortiz JV, de Gregorio Alapont C, Rios-Santamarina I, Garcia Domenech R, Galvez J. J. Mol. Graphics Modell. 1998; 16: 14-18.
[63] Ajay, Walters WP, Murcko M. Can we learn to distinguish between “drug-like” and “non-drug-like” molecules? J. Med. Chem. 1998; 41: 3314-3324.
[64] Wiese M, Pajeva IK. Structure activity relationships of multidrug resistance reversers. Curr. Med. Chem 2001; 8(6): 685-713.
[65] Schultz TW, Seward JR. Health effects related structure-toxicity relationships- a paradigm for the first decade of the new millennium. Sci. Total Environ 2000; 249 (1-3): 73-84.
[66] Benigni R, Giuliani A, Franke R, Gruska A. Quantitative structure activity relationships of mutagenic and carcinogenic aromatic amines. Chem. Rev 2000; 100 (10): 3697-3714.
[67] Garg R, Karup A, Hansch C. Comparative QSAR: On the toxicology of the phenolic OH moiety. Crit. Rev. Toxicol 2001; 31 (2): 223-245.
[68] Bashir SJ, Maibach HI. Quantitative structure analysis relationships in prediction of skin sensitization potential. Biochem Modulation Skin React 2000; 61-64.
[69] Cronin MTD. Computational methods for the prediction of drug toxicity. Curr. Opin. Drug Discovery Dev 2000; 3 (3): 292-297.
[70] Freidig AP, Hermens JLM. Narcosis and chemical reactivity QSARs for acute fish toxicity. Quant.Struct. Activity Relat 2001; 19 (6): 547-553.
[71] Gombar VK, Enslein K. Assessment of n-octanol/water partition coefficient: When is the assessment reliable? J. Chem. Inf. Comput. Sci 1996; 36 (6): 1127-1134.
[72] Agatonovic-Kustrin S, Beresford R, Yusof A, Pauzi M. Theoretically derived molecular descriptors important in human intestinal absorption. J. Pharm. Biomed. Anal 2001; 25 (2): 227-237.
[73] Brusic V, Bucci K, Schonbach C. Efficient discovery of immune response targets by cyclical refinement of QSAR models of peptide binding. J. Mol. Graph. Modell 2001; 19 (5): 405-411.
[74] Burden FR, Winkler DA. The computer simulation of high throughput screening of bioactive molecules. Mol. Model. Predict. Bioact. [Proceedings of the 12th European symposium on Quantitative structure activity relationships], 175-180.
[75] Lewis DFV. Structural characteristics of human P450s involved in drug metabolism: QSARs and lipophilicity profiles. Toxicology 2000; 144 (1-3): 197-203.
[76] Vedani A, Dobler M. Multidimensional QSAR in drug research: predicting binding affinities, toxicity and pharmacokinetic parameters. Prog. Drug Res 2000; 55: 105-135.
[77] Guba W, Cruciani G. Molecular field- derived descriptors for the multivariate modeling of pharmacokinetic data. Mol. Model. Predict. Bioact. [Proceedings of the 12th European symposium on Quantitative structure activity relationships], 89-94.


