Joseph George Caldwell, PhD (Statistics)

1432 N Camino Mateo, Tucson, AZ 85745 USA

Tel. (001)(520)222-3446, e-mail jcaldwell9@yahoo.com

26 January 2018

Database Design for Monitoring and Evaluation

Copyright © 2018 Joseph George Caldwell.  All rights reserved.

The activities of program and project monitoring and evaluation require the collection, storage and processing of data about project activities, participant characteristics, and results.  If the data collection is based on statistical sample survey, the data are usually stored and processed in a statistical software system such as Stata, SAS, or SPSS.  Otherwise, the data are usually stored and processed in a database management system (DBMS) or management information system (MIS), such as Microsoft Access Relational Database Management System (RDBMS) or ArcView Geographic Information System (GIS).

In the course of my statistical consulting practice, I have had frequent opportunities to be involved in both types of data handling.  As examples of my work in the field of database development, I developed the national Personnel Management Information System (PMIS) for the Government of Malawi civil service, the national Education Management Information System (EMIS) for the Government of Zambia Ministry of Education, the Financial Simulation Model (FSM) for the Government of Barbados Ministry of Finance, and a GIS system for determining good locations for bank automated teller machines (ATMs).  For applications in the field of international development, the Microsoft Access database system is often a very good choice.  It is a powerful, mature, widely available and low-cost system.

Unfortunately, as with most technical tools, Microsoft Access is vulnerable to misuse.  I have seen applications where the simple act of data entry brought a state-of-the-art dual-processor minicomputer system to its knees, because the database analysts did not know how to index key variables of data tables.  I have seen applications that could be speeded up by a factor of perhaps 1,000 by the judicious application of indexing, selections, and aggregations prior to the application of table joins.

A principal ingredient in avoiding these problems is the proper application of proven principles of database design (or data modeling).  These include:

1.    Specification of the variables (data items) to be stored in the database.

2.    Specification of the logical relationships among the variables (i.e., construction of an Entity-Relationship (ER) diagram).

3.    Categorization of the variables into tables (columns are variables, rows are records / observations).

4.    Specification of indexes and keys (unique identifiers) for each table.

5.    Normalization of the tables to assure integrity of data updates, insertions and deletions.

6.    Specification of relationships among tables.

7.    Optimization of the performance of database queries by judicious use of data selections and aggregations before performing table joins (merges).  In larger applications, a normalized database may be deliberately denormalized to achieve improved performance.

8.    Construction of standard reports and forms.

When the preceding steps are taken, full advantage may be taken of a Relational Database Management System (RDBMS), such as Microsoft Access.  When a database has been well designed, it can be efficiently and confidently used by data analysts and other users to make fast, reliable queries (data retrievals, reports), and to maintain a database having data integrity.  While a well-designed database may be used for years, the activity of developing (designing and implementing) can often be accomplished relatively quickly.

At the current stage of my consulting career, I specialize in design and development of database systems in applications in international development that can be accomplished within a few weeks or months, using Microsoft Access.  It should be recognized that for most applications, not all database processing can be accomplished simply through the interactive use of standard procedures, such as an automated “query builder” or “report generator.”  To be effective, some applications require significant use of Microsoft Visual Basic (the programming language imbedded in Access).  I have substantial experience in the use of Microsoft Access in database applications (such as the GIS ATM application mentioned, and the development of a variable-rate pricing system for bank loans, based on Lagrangian optimization).

FndID(34)

FndTitle(Database Design for Monitoring and Evaluation)

FndDescription(Database Design for Monitoring and Evaluation)

FndKeywords(database design; monitoring and evaluation; data modelling; third normal form; indexing; select, then aggregate, then join)