Limitations of Aggregate Databases

Aggregate data is information that has been combined from many individual records, often drawn from several sources, and reported in summary form. It describes the characteristics of the data set as a whole but says nothing about variation among individuals. A good example is health statistics from several counties within a state: the aggregate data may show trends from county to county, but not necessarily among the subjects within each county.

    Ecological Fallacy

    • One key limitation of aggregate data is the ecological fallacy. Researchers may use aggregate data to find average characteristics for a group, but they cannot assume that those averages apply to every member of the group. Because aggregate data typically cannot be broken down or re-analyzed by other variables, it is easy to overlook influences that affect individuals within the data set. Users of the data must remember that aggregate data shows only group-level summaries, not individual attributes.
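
The ecological fallacy can be illustrated with a small sketch. The county names and the (exposure, outcome) values below are invented for illustration and do not come from any real study. The sketch compares the correlation computed from county averages with the correlation computed inside each county.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical individual records: county -> list of (exposure, outcome) pairs.
counties = {
    "A": [(1, 5), (2, 4), (3, 3)],
    "B": [(4, 8), (5, 7), (6, 6)],
    "C": [(7, 11), (8, 10), (9, 9)],
}

# Aggregate view: one (mean exposure, mean outcome) pair per county.
county_means = [
    (mean(x for x, _ in rows), mean(y for _, y in rows))
    for rows in counties.values()
]
aggregate_r = pearson([m[0] for m in county_means],
                      [m[1] for m in county_means])

# Individual view: correlation computed separately inside each county.
within_r = {
    name: pearson([x for x, _ in rows], [y for _, y in rows])
    for name, rows in counties.items()
}

print("correlation of county means:", aggregate_r)
print("correlation within each county:", within_r)
```

With these made-up numbers, the county averages rise together while the relationship inside every county runs the opposite way, so a conclusion drawn from the averages alone would be wrong for every individual group.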

    Survival Data

    • Survival data, or time-to-event data, is an important component of many health-related studies that examine the long-term effects of variables on individuals. In this type of study, researchers observe the same subjects repeatedly over an extended period. This is usually straightforward with individual patient data, but often impossible with aggregate data, since there may be no way to identify individual subjects of a study. Researchers can make broad generalizations about the sample, but they cannot follow up with individual subjects later on. Because aggregate data typically captures each subject at a single point in time, its usefulness for research that tracks outcomes over time is limited.
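
A rough sketch of why time-to-event questions need individual records follows. The two cohorts, the patient identifiers, and the 24-month study length are all hypothetical, and the example ignores censoring (subjects who leave the study before an event) to stay short.

```python
from statistics import median

# Hypothetical individual patient records: (patient_id, months_until_event).
cohort_a = [("a1", 2), ("a2", 3), ("a3", 4), ("a4", 5)]
cohort_b = [("b1", 20), ("b2", 21), ("b3", 22), ("b4", 23)]

def aggregate_summary(cohort, study_length_months=24):
    """Collapse a cohort to the kind of end-of-study count an aggregate table reports."""
    events = sum(1 for _, months in cohort if months <= study_length_months)
    return {"subjects": len(cohort), "events": events}

def median_time_to_event(cohort):
    """Needs per-subject timing, so it can only be computed from individual records."""
    return median(months for _, months in cohort)

# The aggregate summaries are identical for both cohorts...
summary_a = aggregate_summary(cohort_a)
summary_b = aggregate_summary(cohort_b)

# ...but the individual records show very different timing.
median_a = median_time_to_event(cohort_a)
median_b = median_time_to_event(cohort_b)

print(summary_a, summary_b)
print(median_a, median_b)
```

Both cohorts collapse to the same end-of-study count, yet events in the first cohort occur far earlier than in the second. Once only the counts are stored, that difference cannot be recovered.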

    Individual Variables

    • A major limitation of aggregate data is that researchers cannot use it to study the effects of variables at the level of the individual subject. Researchers may examine the broad effects of prespecified relationships in an aggregate study, but they cannot explore relationships among other variables that might become visible in databases of individual records. By relying on aggregate data, they lose the opportunity to use the data to generate new hypotheses.

    Data Control Issues

    • A final limitation of aggregate data has less to do with its usefulness for research than with its supporting role in large organizations. Collections of aggregate financial and health data must be kept secure if they contain any account or identifying information. The data is useful only if it is accessible, but accessibility creates vulnerability to hacking and information theft. The owners of large collections of accessible aggregate data must take extensive precautions to ensure the security of the data.
