About Authors
*Shree Swaminarayan Sanskar College of Pharmacy, Zundal, Gandhinagar, Gujarat
FTF Pharma Pvt. Ltd., Ahmedabad

Abstract:
Clinical Data Management (CDM) is a vital phase in clinical research that leads to the generation of high-quality, consistent, and statistically sound data from clinical trials. CDM professionals should have adequate process knowledge to help maintain the quality standards of CDM processes. Various procedures in CDM, including Case Report Form (CRF) designing, CRF annotation, database designing, data entry, data validation, discrepancy management, medical coding, data extraction, and database locking, are assessed for quality at regular intervals during a trial. In the present scenario, there is an increased demand to improve CDM standards to meet regulatory requirements and to stay ahead of the competition by means of faster commercialization of products. With the implementation of regulatory-compliant data management tools, the CDM team can meet these demands. CDM professionals should meet appropriate expectations, set standards for data quality, and have a drive to adapt to rapidly changing technology. This article highlights the processes involved and provides the reader an overview of the tools and standards adopted, as well as the roles and responsibilities, in CDM. CDM needs to draw on a broad range of skills (technical, scientific, project management, information technology, and systems engineering) to provide a valued service in managing data within the anticipated e-clinical age.

Clinical data management (CDM) is a vital cross-functional vehicle in clinical trials to ensure that high-quality data are captured by site staff through paper case report forms (CRFs) or electronic CRFs and are available for early review. The integrity and quality of data being collected and transferred from study subjects to a clinical data management system (CDMS) must be monitored, maintained, and quantified to ensure a reliable and effective base not only for new drug application (NDA) submissions and clinical science reports, but also for corporate clinical planning, decision-making, process improvement, and operational optimization. The use of electronic data capture (EDC) technology and electronic CRFs to collect data in clinical trials has grown in recent years and has affected the activities of clinical research operations for industry sponsors, contract research organizations (CROs), and clinical sites.1–3 EDC technology must comply with applicable regulatory requirements and offer flexible, configurable, scalable, and auditable system features.4 Transitioning from paper-based data collection (PDC) to EDC systems has produced many benefits: it eases the burden associated with organizing paper CRF work and greatly reduces the time, cost, and stress required in bringing a product to market through technology-enabled efficiency improvements, such as quick and robust interactive voice response system (IVRS)-supported and integrated automatic casebook creation, early data availability, and fast database lock via an Internet-based user interface. Although EDC technologies offer advantages over traditional paper-based systems, collecting, monitoring, coding, reconciling, and analyzing clinical data, often from multiple sources, can be challenging.

A. GCP Guidelines
• All clinical research data should be recorded, handled, & stored in a way that allows its accurate reporting, interpretation & verification. (ICH GCP 2.10, 4.9, 5.5, 5.14 & ICH E9  3.6 & 5.8)
• Systems with procedures that assure quality of every aspect of research should be implemented. (GCP 2.13)
• Quality assurance & quality control systems with written standard operating procedures should be implemented & maintained to ensure that research is conducted, and that data are generated, documented, recorded, & reported in compliance with the protocol, GCP & applicable regulatory requirements. (GCP 5.1.1)
• If data are transformed during processing, it should always be possible to compare original data & observations with processed data. (ICH GCP 5.5.4)
• Sponsor should use an unambiguous subject identification number or code that allows identification of all data reported for each subject. (ICH GCP 5.5.5)
• Protocol amendments that necessitate a change in design of CRF, subject diaries, study worksheets, research database & other key aspects of CDM processes need to be controlled. (ICH E9 2.1.2)
• Common standards should be adopted for a number of features of research such as dictionaries of medical terms, definition & timing of main measurements, handling of protocol deviations. (ICH E9 2.1.1)

B. Tools for Clinical Data Management
Many software tools are available for data management, and these are called Clinical Data Management Systems (CDMS). In multicentric trials, a CDMS has become essential to handle the huge amount of data. Most of the CDMS used in pharmaceutical companies are commercial, but a few open source tools are available as well. Commonly used CDM tools are ORACLE CLINICAL, CLINTRIAL, MACRO, RAVE, and eClinical Suite. In terms of functionality, these software tools are more or less similar and there is no significant advantage of one system over another. These software tools are expensive and need sophisticated Information Technology infrastructure to function. Additionally, some multinational pharmaceutical giants use custom-made CDMS tools to suit their operational needs and procedures. Among the open source tools, the most prominent ones are OpenClinica, openCDMS, TrialDB, and PhOSCo. These CDM software are available free of cost and are as good as their commercial counterparts in terms of functionality. These open source software can be downloaded from their respective websites.
In regulatory submission studies, maintaining an audit trail of data management activities is of paramount importance. These CDM tools ensure the audit trail and help in the management of discrepancies. According to the roles and responsibilities (explained later), multiple user IDs can be created with access limited to data entry, medical coding, database designing, or quality check. This ensures that each user can access only the respective functionalities allotted to that user ID and cannot make any other change in the database. For roles where changes to the data are permitted, the software will record the change made, the user ID that made the change, and the time and date of the change, for audit purposes (audit trail). During a regulatory audit, the auditors can verify the discrepancy management process and the changes made, and can confirm that no unauthorized or false changes were made.
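The audit-trail behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any real CDMS; the class and field names (`AuditedRecord`, `SYSBP`) are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditEntry:
    """One audit-trail row: who changed what, from what, to what, and when."""
    user_id: str
    field_name: str
    old_value: str
    new_value: str
    timestamp: str

@dataclass
class AuditedRecord:
    """A data record that appends an audit entry on every change."""
    values: dict
    trail: list = field(default_factory=list)

    def update(self, user_id: str, field_name: str, new_value: str) -> None:
        old = self.values.get(field_name, "")
        self.values[field_name] = new_value
        self.trail.append(AuditEntry(
            user_id=user_id,
            field_name=field_name,
            old_value=old,
            new_value=new_value,
            timestamp=datetime.now(timezone.utc).isoformat(),
        ))

# Example: a coder corrects a systolic blood pressure value; the original
# value, the user ID, and the timestamp are all preserved for auditors.
record = AuditedRecord(values={"SYSBP": "120"})
record.update(user_id="coder01", field_name="SYSBP", new_value="128")
```

A real 21 CFR Part 11-compliant system would additionally enforce secure, computer-generated timestamps and prevent the trail itself from being edited.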

C. Regulations, Guidelines, and Standards in CDM
Akin to other areas in clinical research, CDM has guidelines and standards that must be followed. Since the pharmaceutical industry relies on electronically captured data for the evaluation of medicines, there is a need to follow good practices in CDM and maintain standards in electronic data capture. These electronic records have to comply with a Code of Federal Regulations (CFR), 21 CFR Part 11. This regulation is applicable to records in electronic format that are created, modified, maintained, archived, retrieved, or transmitted. It demands the use of validated systems to ensure accuracy, reliability, and consistency of data, with the use of secure, computer-generated, time-stamped audit trails to independently record the date and time of operator entries and actions that create, modify, or delete electronic records.[3] Adequate procedures and controls should be put in place to ensure the integrity, authenticity, and confidentiality of data. If data have to be submitted to regulatory authorities, they should be entered and processed in 21 CFR Part 11-compliant systems. Most of the available CDM systems are compliant, and pharmaceutical companies as well as contract research organizations ensure this compliance.
The Society for Clinical Data Management (SCDM) publishes the Good Clinical Data Management Practices (GCDMP) guidelines, a document providing the standards of good practice within CDM. GCDMP was initially published in September 2000 and has undergone several revisions thereafter. The July 2009 version is the currently followed GCDMP document. GCDMP provides guidance on the accepted practices in CDM that are consistent with regulatory practices. Addressed in 20 chapters, it covers the CDM process by highlighting the minimum standards and best practices. The Clinical Data Interchange Standards Consortium (CDISC), a multidisciplinary nonprofit organization, has developed standards to support acquisition, exchange, submission, and archival of clinical research data and metadata. Metadata is data about the data entered. This includes data about the individual who made the entry or a change in the clinical data, the date and time of entry/change, and details of the changes that have been made. Among the standards, two important ones are the Study Data Tabulation Model Implementation Guide for Human Clinical Trials (SDTMIG) and the Clinical Data Acquisition Standards Harmonization (CDASH) standards, available free of cost from the CDISC website.

A. Review and finalization of study documents
During this review, the CDM personnel will identify the data items to be collected and the frequency of collection with respect to the visit schedule. A Case Report Form (CRF) is designed by the CDM team, as this is the first step in translating the protocol-specific activities into data being generated. The data fields should be clearly defined and be consistent throughout. The type of data to be entered should be evident from the CRF. The CRF should be concise, self-explanatory, and user-friendly. Along with the CRF, the filling instructions (called CRF Completion Guidelines) should also be provided to study investigators for error-free data acquisition. CRF annotation is done wherein each variable is named according to the SDTMIG or the conventions followed internally. Annotations are coded terms used in CDM tools to indicate the variables in the study.
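CRF annotation can be pictured as a lookup from a CRF field label to its SDTM variable. The sketch below is illustrative only: the CRF labels are invented, and the annotation format shown ("VSORRES where VSTESTCD = ...") is one common convention, not a requirement of the SDTMIG.

```python
# Hypothetical CRF-to-SDTM annotation map for a vital signs (VS domain) page.
# Tuples are (domain, result variable, test code).
CRF_ANNOTATIONS = {
    "Systolic Blood Pressure": ("VS", "VSORRES", "SYSBP"),
    "Diastolic Blood Pressure": ("VS", "VSORRES", "DIABP"),
    "Body Temperature": ("VS", "VSORRES", "TEMP"),
}

def annotate(crf_label: str) -> str:
    """Return the annotation string printed next to a CRF field."""
    domain, variable, testcd = CRF_ANNOTATIONS[crf_label]
    return f"{variable} where {domain}.VSTESTCD = {testcd}"

# Example: the label a designer would place beside the temperature field.
print(annotate("Body Temperature"))  # VSORRES where VS.VSTESTCD = TEMP
```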
The Data Management Plan (DMP) document is a road map to handle the data under foreseeable circumstances and describes the CDM activities to be followed in the trial. The DMP describes the database design, data entry and data tracking guidelines, quality control measures, SAE reconciliation guidelines, discrepancy management, data transfer/extraction, and database locking guidelines. Along with the DMP, a Data Validation Plan (DVP) containing all edit checks to be performed and the calculations for derived variables is also prepared. The edit check programs in the DVP help in cleaning up the data by identifying the discrepancies.

B.  Database designing
Databases are the clinical software applications built to facilitate the CDM tasks to carry out multiple studies. Generally, these tools have built-in compliance with regulatory requirements and are easy to use. "System validation" is conducted to ensure data security, during which system specifications, user requirements, and regulatory compliance are evaluated before implementation. Study details like objectives, intervals, visits, investigators, sites, and patients are defined in the database, and CRF layouts are designed for data entry. These entry screens are tested with dummy data before moving them to real data capture.

C.  Data collection
Data collection is done using the CRF, which may exist in the form of a paper or an electronic version. The traditional method is to employ paper CRFs to collect the data responses, which are translated to the database by means of data entry done in-house. These paper CRFs are filled up by the investigator according to the completion guidelines. In eCRF-based CDM, the investigator or a designee will log into the CDM system and enter the data directly at the site. In the eCRF method, chances of errors are less, and the resolution of discrepancies happens faster. Since pharmaceutical companies try to reduce the time taken for drug development processes by enhancing the speed of the processes involved, many pharmaceutical companies are opting for eCRF options (also called remote data entry).

D.  CRF tracking
The entries made in the CRF will be monitored by the Clinical Research Associate (CRA) for completeness, and filled-up CRFs are retrieved and handed over to the CDM team. The CDM team will track the retrieved CRFs and maintain their record. CRFs are tracked for missing pages and illegible data manually to assure that the data are not lost. In case of missing or illegible data, a clarification is obtained from the investigator and the issue is resolved.

E.  Data entry
Data entry takes place according to the guidelines prepared along with the DMP. This is applicable only in the case of paper CRFs retrieved from the sites. Usually, double data entry is performed, wherein the data is entered by two operators separately. The second-pass entry (entry made by the second person) helps in verification and reconciliation by identifying the transcription errors and discrepancies caused by illegible data. Moreover, double data entry helps in getting a cleaner database compared to single data entry. Earlier studies have shown that double data entry ensures better consistency with the paper CRF, as denoted by a lesser error rate.[9]
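The reconciliation step between the two entry passes can be sketched as a simple field-by-field comparison. This is an illustrative sketch, not how any particular CDMS implements second-pass verification; the field names are invented.

```python
def reconcile(first_pass: dict, second_pass: dict) -> list:
    """Compare two independent data entry passes for one CRF page.

    Returns (field, first value, second value) tuples for every field where
    the passes disagree; these are verified against the paper CRF.
    """
    mismatches = []
    for field_name in sorted(set(first_pass) | set(second_pass)):
        v1 = first_pass.get(field_name)
        v2 = second_pass.get(field_name)
        if v1 != v2:
            mismatches.append((field_name, v1, v2))
    return mismatches

# Example: the second operator mis-keyed the weight; only that field is flagged.
entry1 = {"AGE": "34", "SEX": "F", "WEIGHT": "62"}
entry2 = {"AGE": "34", "SEX": "F", "WEIGHT": "82"}
print(reconcile(entry1, entry2))  # [('WEIGHT', '62', '82')]
```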

F.  Data validation
Data validation is the process of testing the validity of data in accordance with the protocol specifications. Edit check programs are written to identify the discrepancies in the entered data, which are embedded in the database, to ensure data validity. These programs are written according to the logic conditions mentioned in the DVP. These edit check programs are initially tested with dummy data containing discrepancies. A discrepancy is defined as a data point that fails to pass a validation check. Discrepancies may be due to inconsistent data, missing data, range check failures, and deviations from the protocol. In eCRF-based studies, the data validation process will be run frequently to identify discrepancies. These discrepancies will be resolved by investigators after logging into the system. Ongoing quality control of data processing is undertaken at regular intervals during the course of CDM. For example, if the inclusion criteria specify that the age of the patient should be between 18 and 65 years (both inclusive), an edit check program will be written for two conditions, viz. age <18 and age >65. If for any patient the condition becomes TRUE, a discrepancy will be generated. These discrepancies will be highlighted in the system, and Data Clarification Forms (DCFs) can be generated. DCFs are documents containing queries pertaining to the discrepancies identified.
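The age example above can be written directly as an edit check. This is a minimal sketch of the logic only; real edit checks live inside the CDMS and are specified in the DVP, and the record layout here is hypothetical.

```python
def age_edit_check(record: dict) -> list:
    """Edit check for the inclusion criterion 18 <= age <= 65 (both inclusive).

    Returns a list of discrepancy messages; an empty list means the check passed.
    """
    discrepancies = []
    age = record.get("AGE")
    if age is None:
        discrepancies.append("AGE: value missing")
    elif age < 18 or age > 65:
        # Either of the two conditions (age <18, age >65) being TRUE
        # generates a discrepancy, as described in the text.
        discrepancies.append(f"AGE: value {age} outside protocol range 18-65")
    return discrepancies

# Tested first with dummy data containing deliberate discrepancies:
print(age_edit_check({"AGE": 70}))  # ['AGE: value 70 outside protocol range 18-65']
print(age_edit_check({"AGE": 34}))  # []
```

Each message returned by a check like this would typically feed the discrepancy database and, if needed, a DCF to the site.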

G.  Discrepancy management
This is also called query resolution. Discrepancy management includes reviewing discrepancies, investigating the reason, and resolving them with documentary proof or declaring them irresolvable. Discrepancy management helps in cleaning the data and gathers enough evidence for the deviations observed in the data. Almost all CDMS have a discrepancy database where all discrepancies are recorded and stored with an audit trail. Based on the types identified, discrepancies are either flagged to the investigator for clarification or closed in-house by Self-Evident Corrections (SEC) without sending a DCF to the site. The most common SECs are obvious spelling errors. For discrepancies that require clarifications from the investigator, DCFs will be sent to the site. The CDM tools help in the creation and printing of DCFs. Investigators will write the resolution or explain the circumstances that led to the discrepancy in the data. When a resolution is provided by the investigator, the same will be updated in the database. In case of eCRFs, the investigator can access the discrepancies flagged to him and will be able to provide the resolutions online.
The CDM team reviews all discrepancies at regular intervals to ensure that they have been resolved. The resolved data discrepancies are recorded as 'closed'. This means that those validation failures are no longer considered active, and future data validation attempts on the same data will not create a discrepancy for the same data point. But closure of discrepancies is not always possible. In some cases, the investigator will not be able to provide a resolution for the discrepancy. Such discrepancies will be considered 'irresolvable' and will be updated in the discrepancy database. Discrepancy management is the most critical activity in the CDM process. Being the vital activity in cleaning the data, utmost attention must be paid while handling the discrepancies.
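The discrepancy lifecycle described in this section (open, closed in-house by SEC, sent to the site via DCF, then closed or declared irresolvable) can be modeled as a small state machine. This is an illustrative sketch; the state names and transition rules are assumptions, not the workflow of any specific CDMS.

```python
# Allowed state transitions for one discrepancy, mirroring the text:
# an open discrepancy is either closed in-house (SEC) or flagged to the
# site via DCF; a DCF-sent discrepancy is closed or marked irresolvable.
ALLOWED = {
    "open": {"closed-sec", "dcf-sent"},
    "dcf-sent": {"closed", "irresolvable"},
}

class Discrepancy:
    def __init__(self, description: str):
        self.description = description
        self.status = "open"
        self.history = ["open"]  # audit of every state the discrepancy passed through

    def transition(self, new_status: str) -> None:
        if new_status not in ALLOWED.get(self.status, set()):
            raise ValueError(f"cannot go from {self.status} to {new_status}")
        self.status = new_status
        self.history.append(new_status)

# Example: a range failure is sent to the site and resolved by the investigator.
d = Discrepancy("AGE: value 70 outside protocol range 18-65")
d.transition("dcf-sent")
d.transition("closed")
```

The point of the guard in `transition` is that a discrepancy cannot be closed without first passing through a documented resolution path.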

H. Medical coding
Medical coding helps in identifying and properly classifying the medical terminologies associated with the clinical trial. For classification of events, medical dictionaries available online are used. Technically, this activity needs knowledge of medical terminology, an understanding of disease entities and drugs used, and a basic knowledge of the pathological processes involved. Functionally, it also requires knowledge about the structure of electronic medical dictionaries and the hierarchy of classifications available in them. Adverse events occurring during the study, prior and concomitantly administered medications, and prior or coexisting illnesses are coded using the available medical dictionaries. Commonly, the Medical Dictionary for Regulatory Activities (MedDRA) is used for the coding of adverse events as well as other illnesses, and the World Health Organization–Drug Dictionary Enhanced (WHODDE) is used for coding the medications. These dictionaries contain the respective classifications of adverse events and drugs in proper classes. Other dictionaries are also available for use in data management (e.g., WHO-ART, a dictionary that deals with adverse reaction terminology). Some pharmaceutical companies utilize customized dictionaries to suit their needs and meet their standard operating procedures. Medical coding maps reported medical terms on the CRF to standard dictionary terms in order to achieve data consistency and avoid unnecessary duplication. For example, investigators may use different terms for the same adverse event, but it is important to code all of them to a single standard code and maintain uniformity in the process. The right coding and classification of adverse events and medications is crucial, as incorrect coding may lead to masking of safety issues or highlighting the wrong safety concerns related to the drug.
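The idea of collapsing different verbatim terms onto one standard code can be sketched as a lookup with a fallback for manual review. The synonym table below is invented for illustration; real coding uses the licensed MedDRA hierarchy and usually an auto-coder plus a trained medical coder, not a flat dictionary.

```python
# Illustrative verbatim-to-preferred-term map. The preferred terms are
# examples only, not an excerpt of MedDRA.
SYNONYMS = {
    "headache": "Headache",
    "head ache": "Headache",
    "pain in head": "Headache",
    "heart attack": "Myocardial infarction",
    "mi": "Myocardial infarction",
}

def code_term(verbatim: str) -> str:
    """Map an investigator's verbatim term to a single standard coded term.

    Unrecognized terms are flagged rather than guessed, so a human coder
    reviews them instead of the system masking a potential safety signal.
    """
    key = " ".join(verbatim.lower().split())  # normalize case and spacing
    return SYNONYMS.get(key, f"UNCODED: {verbatim}")

# Example: three different site spellings all code to the same term.
print(code_term("Head Ache"))     # Headache
print(code_term("pain in head"))  # Headache
```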

I.  Database locking
After a proper quality check and assurance, the final data validation is run. If there are no discrepancies, the SAS datasets are finalized in consultation with the statistician. All data management activities should have been completed prior to database lock. To ensure this, a prelock checklist is used and completion of all activities is confirmed. This is done as the database cannot be changed in any manner after locking. Once the approval for locking is obtained from all stakeholders, the database is locked and clean data is extracted for statistical analysis. Generally, no modification in the database is possible. But in case of a critical issue or for other important operational reasons, privileged users can modify the data even after the database is locked. This, however, requires proper documentation and an audit trail has to be maintained with sufficient justification for updating the locked database. Data extraction is done from the final database after locking. This is followed by its archival.
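The pre-lock checklist can be expressed as a simple gate: locking is refused while any item is pending. The checklist items below are a plausible sample assumed for illustration; the actual list is defined by each organization's SOPs.

```python
# Hypothetical pre-lock checklist; real lists come from the sponsor's SOPs.
PRELOCK_CHECKLIST = [
    "all CRF pages received and entered",
    "all discrepancies closed or marked irresolvable",
    "medical coding completed and reviewed",
    "SAE reconciliation completed",
    "final quality-control audit passed",
]

def can_lock(completed: set) -> tuple:
    """Return (ok, pending): the database may be locked only when
    every checklist item has been confirmed complete."""
    pending = [item for item in PRELOCK_CHECKLIST if item not in completed]
    return (len(pending) == 0, pending)

# Example: one outstanding item blocks the lock.
ok, pending = can_lock(set(PRELOCK_CHECKLIST[:-1]))
print(ok, pending)  # False ['final quality-control audit passed']
```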

In a CDM team, different roles and responsibilities are attributed to the team members. The minimum educational requirement for a team member in CDM should be graduation in life science and knowledge of computer applications. Ideally, medical coders should be medical graduates. However, in the industry, paramedical graduates are also recruited as medical coders. Some key roles are essential to all CDM teams. The list of roles given below can be considered as the minimum requirements for a CDM team:
• Data Manager
• Database Programmer/Designer
• Medical Coder
• Clinical Data Coordinator
• Quality Control Associate
• Data Entry Associate

The data manager is responsible for supervising the entire CDM process. The data manager prepares the DMP and approves the CDM procedures and all internal documents related to CDM activities. Controlling and allocating database access to team members is also the responsibility of the data manager. The database programmer/designer performs the CRF annotation, creates the study database, and programs the edit checks for data validation. He/she is also responsible for designing the data entry screens in the database and validating the edit checks with dummy data. The medical coder will do the coding for adverse events, medical history, co-illnesses, and concomitant medications administered during the study. The clinical data coordinator designs the CRF, prepares the CRF filling instructions, and is responsible for developing the DVP and discrepancy management. All other CDM-related documents, checklists, and guideline documents are prepared by the clinical data coordinator. The quality control associate checks the accuracy of data entry and conducts data audits. Sometimes, there is a separate quality assurance person to conduct an audit on the data entered. Additionally, the quality control associate verifies the documentation pertaining to the procedures being followed. The data entry personnel will track the receipt of CRF pages and perform the data entry into the database.

Technology-driven strategies and initiatives have the potential to alleviate the significant pressure to market a medicine as early in its patent life as possible, both to increase total revenue and to maximize the period without competition. The increase in regulatory requirements and competition seen in recent years, coupled with reforms in health care services, has presented extreme challenges for the biopharmaceutical industry, suggesting the need for sponsor companies to invest significantly in technological solutions and to place additional emphasis on business process re-engineering and improvement to engender long-term clinical efficiencies and cost benefits. In this environment, the effectiveness of the clinical data management function is crucial to substantiate early approval for a new product launch and subsequent successful marketing. Delay, deficiency, or quality issues in the CDM process can be costly. Further, speed is not enough by itself, and success needs to be achieved along with other quality attributes. There is an ever-increasing demand for sponsors, including CROs, to strike the right balance between time, cost, process, and quality in conducting all clinical studies.
Applying e-clinical technology, including EDC, in such a context is the anticipated industry trend and will continue to offer superior benefits to sponsors as collaboration, standardization initiatives, and technology innovation are constantly geared towards more and wider technology adoption.

A. Clinical data management from industry perspectives
Through participation with the team during the design of the study, the data manager or study designer gains the necessary understanding of the required data from the protocol and the standards expected with respect to data quality. It is important for data managers or study designers to understand the varied sources of the data and the form in which the data will be retrieved, i.e., hospital records, laboratory test results, insurance and government records, private physician records, or e-diaries/patient-reported outcomes. It is increasingly recognized that the design of the CRF or eCRF is a key quality step in ensuring that the data required by the protocol, regulatory compliance and/or safety needs/comments, study-specific scientific hypothesis attributes, site work flow, and cross-checking of data items within a form or across different forms are addressed. CRF design is an interdisciplinary systems engineering process requiring not only technical skills in utilizing information technology (IT) tools but also expertise and scientific reasoning in the subject therapeutic areas. The original materials for this critical design are the draft yet stable clinical protocol, the corporate therapeutic unit standard forms, and the Clinical Data Acquisition Standards Harmonization (CDASH) guidelines. Such systems engineering work requires cross-functional team collaboration and input. It is mission critical that all functional teams, including science, safety, biostatistics, regulatory compliance, and IT, are represented in form review meetings and that their feedback is incorporated into the revised and finalized forms. Systems development methodology and a controlled process are followed for eCRF design and development to ensure regulatory requirements are met. Additionally, form design must always be tailored to the majority of end users and have their work flow taken into account. Any potential ambiguity in the CRF or eCRF must be avoided.
In today’s clinical research, the concepts and definitions are reasonably standardized. For each study, the definition of clinical terms, data entry guidelines, and data handling conventions require intensive effort and communication among all members of the study team to assure that a meaningful and persistent set of data is compiled. Such information should be incorporated into written guidelines for CRF or eCRF completion. The use of the CRFs and guidelines should be thoroughly tested and reviewed by pilot use, at least among clinical data management or verification staff. Data edits such as ranges and cross-checks should be established with the participation of CDM, monitoring personnel, and scientists. This is especially important with EDC studies because the majority of such edit checks impact how queries will be issued and resolved.

The competitive pressure in today’s marketplace is forcing the biopharmaceutical industry to seek better ways of reducing drug development times and increasing productivity. The market acceptance of EDC technology has fueled new demands for improvement, configurability, and intelligent features. The need to improve clinical efficiencies and accelerate study times continues to grow, driving industry sponsors to seek an e-clinical environment that provides and promotes flexible eCRF trial design, build, and speedy deployment, robust data management, real-time data visibility, reporting and analysis, and global trial management and study scalability.10 Shortening the clinical trial lifecycle by collecting quality data more quickly and accelerating the availability of data are solutions to a critical path bottleneck that the industry has been working on for many years. Adopting EDC technology and e-clinical systems in the clinical trial process offers a solution with some claimed success stories.

To meet the expectations, there is a gradual shift from the paper based to the electronic systems of data management. Developments on the technological front have positively impacted the CDM process and systems, thereby leading to encouraging results on speed and quality of data being generated. At the same time, CDM professionals should ensure the standards for improving data quality.[11] CDM, being a speciality in itself, should be evaluated by means of the systems and processes being implemented and the standards being followed. The biggest challenge from the regulatory perspective would be the standardization of data management process across organizations, and development of regulations to define the procedures to be followed and the data standards. From the industry perspective, the biggest hurdle would be the planning and implementation of data management systems in a changing operational environment where the rapid pace of technology development outdates the existing infrastructure.

1. Forouhi NG, Wareham NJ. Epidemiology of diabetes. Medicine. 2019;47(1):22–27.
2. Puchulu FM. Definition, diagnosis and classification of diabetes mellitus. In: Dermatology and Diabetes. Springer Cham; 2018. p. 7–18.
3. Cho NH, Shaw JE, Karuranga S, et al. IDF Diabetes Atlas: global estimates of diabetes prevalence for 2017 and projections for 2045. Diabetes Research and Clinical Practice. 2018;138:271–281.
4. Malekzadeh H, Hadaegh F, Khaili D. High blood pressure, FBS and LDL-C among diabetic Iranians. Iranian Journal of Endocrinology and Metabolism. 2020;21(5):281–286.
5. Gerritsen MG, Sartorius OE, Veen FM, Meester GT. Data management in multicenter clinical trials and the role of a nationwide computer network. Med Care. 1993;659–62.
6. Lu Z, Su J. Clinical data management: current status, challenges, and future directions from industry perspectives. J Clin Trials. 2010;2:93–105.
7. CFR Code of Federal Regulations Title 21 [Internet]. Maryland: Food and Drug Administration. [Updated 2010 Apr 4; Cited 2011 Mar 1].
8. Study Data Tabulation Model [Internet]. Texas: Clinical Data Interchange Standards Consortium; 2011. [Updated 2007 Jul; Cited 2011 Mar 1]. Available from:
9. Fegan GW, Lang TA. Could an open source clinical trial data-management system be what we have all been looking for? PLoS Med. 2008;5:6. [PMCID: PMC2267809] [PubMed: 18318594]
10. Kuchinke W, Ohmann C, Yang Q, Salas N, Lauritsen J, Gueyffier F, et al. Heterogeneity prevails: the state of clinical trial data management in Europe - results of a survey of ECRIN centres. Trials. 2010;11:79. [PMCID: PMC2918594] [PubMed: 20663165]
11. Haux R, Knaup P, Leiner F. On educating about medical data management: the other side of the electronic health record. Methods Inf Med. 2007;46:74–9. [PubMed: 17224986]
12. Pavlovic I, Kern T, Miklavcic D. Comparison of paper-based and electronic data collection process in clinical trials: costs simulation study. Contemp Clin Trials. 2009;30:300–316.
13. Lu ZW. Information technology in pharmacovigilance: benefits, challenges, and future directions from industry perspectives. Drug, Healthcare and Patient Safety. 2009;1:35–45.
14. El Emam K, Jonker E, Sampson M, et al. The use of electronic data capture tools in clinical trials: web-survey of 259 Canadian trials. J Med Internet Res. 2009;11(8).
15. Lu ZW. Electronic data-capturing technology for clinical trials: experience with a global postmarketing study. IEEE Eng Med Biol Mag. 2010;29:95–102.
16. Mosley M. DAMA-DMBOK Guide (Data Management Body of Knowledge): introduction and project status. 2007. Available from:
17. Ishigaki D. Effective management through measurement. 2004. Available from:
18. Lu ZW. Technical challenges in designing post-marketing eCRFs to address clinical safety and pharmacovigilance needs. Contemp Clin Trials. 2010;31:108–18.