Publications

Document title

Authors

Year

Source

Utility and Disclosure Risk for Differentially Private Synthetic Categorical Data

Raab

2022

Raab, Gillian M. Utility and Disclosure Risk for Differentially Private Synthetic Categorical Data. In: Privacy in Statistical Databases (PSD 2022, Paris). Lecture Notes in Computer Science, vol. 13463. Cham: Springer International Publishing, 2022, pp. 250-265.

synthpop: Bespoke creation of synthetic data in R

Nowok, Raab, Dibben

2016

Journal of Statistical Software, 74(11):1-26; DOI: 10.18637/jss.v074.i11

synthpop: An R package for generating synthetic versions of sensitive microdata for statistical disclosure control

Nowok

2015

Paper presented at the Joint UNECE/Eurostat work session on statistical data confidentiality; Helsinki, Finland, 5-7 October 2015

synthpop: Bespoke creation of synthetic data in R

Raab, Nowok, Dibben

2016

Available as a package vignette on the CRAN website

Practical data synthesis for large samples

Raab, Nowok, Dibben

2017

Journal of Privacy and Confidentiality, 7(3):67-97

Inference from fitted models in synthpop

Raab, Nowok

2017

Preprint currently available as a package vignette on the CRAN website

Providing bespoke synthetic data for the UK Longitudinal Studies and other sensitive data with the synthpop package for R

Nowok, Raab, Dibben

2017

Statistical Journal of the IAOS, 33(3):785-796; DOI: 10.3233/SJI-150153

General and specific utility measures for synthetic data

Snoke, Raab, Nowok, Dibben, Slavkovic

2018

Journal of the Royal Statistical Society: Series A; DOI: 10.1111/rssa.12358

Guidelines for producing useful synthetic data

Raab, Nowok, Dibben

2016

Earlier version presented at the Privacy in Statistical Databases Conference 2016; Dubrovnik, Croatia, 14-16 September 2016.

Final report on the disclosure risk associated with the synthetic data produced by the SYLLS team

Elliot

2014

Report 2015-2, Cathie Marsh Centre for Census and Survey Research (CCSR)

Introduction to synthetic data produced with the synthpop package

Raab

2018

Internal report

Utility of synthetic microdata generated using tree-based methods

Nowok

2015

Paper presented at the Privacy in Statistical Databases Conference 2016; Dubrovnik, Croatia, 14-16 September 2016

Recognising real people in synthetic microdata: risk mitigation and impact on utility

Nowok

2017

Paper presented at the Joint UNECE/Eurostat work session on statistical data confidentiality; Skopje, North Macedonia, 20-22 September 2017

Putting synthetic people in place: creating synthetic data for spatial analysis at the individual level

Nowok, Dibben

2017

Technical report for the QCumber-EnvHealth project

Assessing, visualizing and improving the utility of synthetic data

Raab, Nowok, Dibben

2021

Paper presented at the Joint UNECE/Eurostat expert meeting on statistical data confidentiality; Poznań, Poland, 1-3 December 2021. Available as a package vignette on the CRAN website

Presentations (selected)

If you are looking for a specific presentation which is not listed below, please get in touch and we will send it to you.

Presentation title

Presenter

Date

Event

Synthetic data in Scotland and beyond: lessons learned and future directions

Nowok

2018/06/12

Cathie Marsh Institute for Social Research (CMI) Afternoon Seminar, Manchester

Facilitating access to administrative records with synthetic data

Raab

2017/11/22

Dealing with Data 2017 Conference, University of Edinburgh

Synthetic data in practice: software, applications and challenges

Nowok

2017/09/07

Royal Statistical Society (RSS) 2017 Conference, Glasgow

Courses

Course details

Files

Learning to create useful synthetic data

Date: 6 September 2022

Place: Edinburgh, workshop at the IDPLN conference

Presenters: G Raab & B Nowok, supported by L Adair

Generating synthetic data with the synthpop package for R

Date: 20 June 2018

Place: Belfast, International Conference for Administrative Data Research

Presenters: Nowok & Raab

Session 1: Introducing data synthesis and synthpop

A brief overview of the history of proposals for synthetic data generation and how they have been used in practice, in particular how synthetic data sets are being made available to users of the Scottish Longitudinal Study. This is followed by a brief introduction to synthpop and a simple example of data synthesis.

Session 2: Using synthpop

Details of the main functions of the synthpop package for R. Real data examples show how to run default and customised synthesis and how to evaluate the quality of synthetic data through visualisation, formal utility measures, and comparison of analysis results from the original observed data with those from their synthesised version. The session also offers practical advice on synthesising problematic variables.
