Vine copula statistical disclosure control for mixed-type data

Man Ying Amanda CHU, Chun Yin IP, Benson S.Y. LAM, Mike K.P. SO

Research output: Contribution to journal > Article > peer-review

3 Citations (Scopus)

Abstract

In this paper, we develop a new statistical disclosure control (SDC) method for mixed-type data based on vine copulas. Gaussian and skew-t copulas have been shown to incorporate information from the marginal distributions of mixed-type variables, whether discrete or continuous. Our proposed method, vine-SDC, generalizes a data perturbation method that uses an extended skew-t copula. It improves on that method by allowing the bivariate copulas in the vine decomposition to take various forms, thus offering a better fit for the joint distribution of the data and more flexibility in data perturbation. An additional advantage of vine-SDC is a significant improvement in computational efficiency over the extended skew-t copula approach. We discuss some statistical properties of vine copulas and the methodology of vine-SDC. A simulation study and an analysis of real healthcare survey data are provided to explore the performance and strengths of vine-SDC and to compare it with a common copula-based SDC method. Copyright © 2022 Elsevier B.V. All rights reserved.
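
To make the vine decomposition mentioned in the abstract concrete, the following is a standard three-variable pair-copula construction, written for continuous margins under the usual simplifying assumption that the conditional copula does not vary with the conditioning value. It is an illustrative sketch only: the symbols (f_j, F_j, c_{12}, c_{23}, c_{13|2}) and the choice of variable 2 as the conditioning variable are assumptions made here, not notation or a structure taken from the paper.

```latex
\[
f(x_1, x_2, x_3)
  = f_1(x_1)\, f_2(x_2)\, f_3(x_3)
    \cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)
    \cdot c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr)
    \cdot c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2)\bigr)
\]
```

Each of the three bivariate copulas can come from a different family, which is the kind of flexibility the abstract refers to. Discrete margins require replacing densities with finite differences of the copula, which is not shown here.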
Original language: English
Article number: 107561
Journal: Computational Statistics and Data Analysis
Volume: 176
Early online date: Jul 2022
DOI: https://doi.org/10.1016/j.csda.2022.107561
Publication status: Published - Dec 2022

Citation

Chu, A. M. Y., Ip, C. Y., Lam, B. S. Y., & So, M. K. P. (2022). Vine copula statistical disclosure control for mixed-type data. Computational Statistics and Data Analysis, 176, Article 107561. https://doi.org/10.1016/j.csda.2022.107561

Keywords

  • Confidentiality
  • Data perturbation
  • Data privacy
  • Disclosure risk
  • Healthcare analytics
