Author(s)
Amos Golan
Spiro Stefanou

In this study we develop measures of the potential value of information, with an emphasis on observed information, namely data. Although value is a relative concept, developing approximate and applicable measures is essential. Such a measure (or set of measures) allows us to evaluate the potential value of publicly and privately available datasets, and the value of accessing each. Having such measures offers several benefits. First, providers of data can perform a cost-benefit analysis. Second, policy makers can better determine the benefits of different data when deciding whether to invest in their collection, production and release. The proposed measures are derived from information-theoretic principles as well as other statistics, in conjunction with relative measures based on semantic arguments. These measures are functions of attributes that can be aggregated into three basic blocks: (i) data reliability, integrity and accuracy, (ii) data quality, and (iii) potential value. We provide detailed empirical examples applying these measures to three datasets that differ in context, size and complexity.
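
To make the information-theoretic ingredients concrete, the sketch below computes two quantities named in the keywords: the Shannon entropy of a column's empirical distribution, and the relative entropy of observed first-digit frequencies with respect to Benford's law. It is only an illustration under stated assumptions: the function names are hypothetical, and the measures actually constructed in the paper may be defined and combined differently.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def first_digit(x):
    """Leading nonzero decimal digit of a nonzero number (hypothetical helper)."""
    # Scientific notation with 15 decimals, e.g. '1.230000000000000e-03';
    # the first character is the leading significant digit.
    return int(f"{abs(x):.15e}"[0])


def benford_relative_entropy(numbers):
    """Relative entropy (in bits) of observed first-digit frequencies
    with respect to Benford's law, P(d) = log10(1 + 1/d) for d = 1..9."""
    digits = [first_digit(x) for x in numbers if x != 0]
    counts = Counter(digits)
    n = len(digits)
    divergence = 0.0
    for d in range(1, 10):
        p = counts[d] / n              # observed frequency of digit d
        q = math.log10(1 + 1 / d)      # Benford probability of digit d
        if p > 0:                      # terms with p = 0 contribute nothing
            divergence += p * math.log2(p / q)
    return divergence


if __name__ == "__main__":
    powers_of_two = [2 ** k for k in range(200)]  # first digits roughly follow Benford's law
    uniform_digits = list(range(1, 10))           # each first digit equally frequent
    print(shannon_entropy([0, 1, 0, 1]))
    print(benford_relative_entropy(powers_of_two))
    print(benford_relative_entropy(uniform_digits))
```

In this sketch, a relative entropy near zero means the first digits are close to the Benford distribution, and larger values indicate a larger departure. How such a departure should be read as evidence about data reliability is a matter for the paper itself, not something this example settles.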

Publication Type
Working Paper
File Description
First version, December 28, 2023
JEL Codes
C80: Data Collection and Data Estimation Methodology/Computer Programs: General
D80: Information, Knowledge, and Uncertainty: General
Keywords
Benford's law
compressibility
condition number
mutual information
potential value
relative entropy
Shannon entropy
simple statistics
value