What are synthetic data?
Synthetic data are artificial individual-level data generated to reduce disclosure risk. Their values are sampled from probability distributions that are specified from the original confidential microdata, so that the synthetic data reproduce the structure of the original data and as many of their statistical properties as possible.
Can you use synthetic data instead of the original data?
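As an illustration of the idea, here is a minimal sketch in Python. It is not how synthpop is implemented: it assumes a toy data set with two made-up variables (age and income), a normal distribution for age and a linear model for income given age. The function name synthesise and all values are hypothetical.

```python
import random
import statistics


def synthesise(original, n, seed=0):
    """Draw n synthetic (age, income) records from models fitted to `original`."""
    rng = random.Random(seed)
    ages = [age for age, _ in original]
    incomes = [income for _, income in original]

    # Step 1: describe the first variable by a fitted normal distribution.
    mean_age = statistics.mean(ages)
    sd_age = statistics.stdev(ages)

    # Step 2: describe income conditional on age by a least-squares line.
    mean_income = statistics.mean(incomes)
    cov = sum((a - mean_age) * (i - mean_income) for a, i in original) / (len(original) - 1)
    slope = cov / statistics.variance(ages)
    intercept = mean_income - slope * mean_age
    residuals = [i - (intercept + slope * a) for a, i in original]
    sd_resid = statistics.stdev(residuals)

    # Step 3: sample each synthetic record from the fitted distributions,
    # so no original record is copied.
    synthetic = []
    for _ in range(n):
        age = rng.gauss(mean_age, sd_age)
        income = intercept + slope * age + rng.gauss(0, sd_resid)
        synthetic.append((age, income))
    return synthetic


# Hypothetical "confidential" records, invented for this example.
original = [(23, 21000.0), (35, 34000.0), (47, 41000.0),
            (52, 45500.0), (61, 39000.0), (29, 27000.0)]
synthetic = synthesise(original, 500, seed=1)
```

The synthetic records follow the fitted relationship between age and income but are new draws, which is the sense in which disclosure risk is reduced.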
Synthetic data are only as good as the models that were used to generate them. Since preserving all possible relationships between variables in a complex data set can be very challenging, we strongly encourage validation of results obtained from the synthetic data on the original confidential version of these data.
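A validation step of this kind can be as simple as computing the same estimate on both versions of the data and looking at the gap. The sketch below is in Python and is only an illustration: the helper compare_estimate and the income values are hypothetical and not part of synthpop.

```python
import statistics


def compare_estimate(original, synthetic, estimator):
    """Apply the same estimator to both data sets and report the gap."""
    est_original = estimator(original)
    est_synthetic = estimator(synthetic)
    return {
        "original": est_original,
        "synthetic": est_synthetic,
        "abs_diff": abs(est_synthetic - est_original),
    }


# Made-up values standing in for one variable in each version of the data.
original_income = [21000.0, 34000.0, 41000.0, 45500.0, 39000.0, 27000.0]
synthetic_income = [23500.0, 31000.0, 43000.0, 44000.0, 36500.0, 29000.0]

report = compare_estimate(original_income, synthetic_income, statistics.mean)
print(report)
```

Any estimator that takes a list of values can be passed in, for example statistics.median, so the same check can be repeated for each result of interest.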
Can you synthesise any type of data using 'synthpop'?
Data sets with a complex structure, e.g. hierarchical data or multiple-event data, cannot be easily synthesised in 'synthpop' at the moment. You can still attempt to synthesise such data, but some pre-processing will be required.
Is there a maximum number of variables you can synthesise?
There is no definite maximum number of variables you can synthesise. It depends on the types of variables in your data set. Factors with a large number of levels may cause problems even for data sets with a small number of variables.
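Before synthesising, it can help to check how many distinct levels each categorical variable has. The following Python sketch assumes records stored as dictionaries with string-valued categorical variables. The function names, the example records and the threshold are all hypothetical: the text above does not state a specific limit.

```python
def level_counts(rows):
    """Count distinct values for every string-valued (categorical) variable."""
    levels = {}
    for row in rows:
        for name, value in row.items():
            if isinstance(value, str):
                levels.setdefault(name, set()).add(value)
    return {name: len(values) for name, values in levels.items()}


def flag_high_cardinality(rows, max_levels):
    """Return the names of categorical variables with more than max_levels levels."""
    counts = level_counts(rows)
    return sorted(name for name, n in counts.items() if n > max_levels)


# Made-up records: 'postcode' has one level per record, 'region' only two.
rows = [
    {"region": "north", "postcode": "AB1", "age": 23},
    {"region": "south", "postcode": "CD2", "age": 35},
    {"region": "north", "postcode": "EF3", "age": 47},
    {"region": "south", "postcode": "GH4", "age": 52},
]

# max_levels=3 is an arbitrary threshold chosen for this toy example.
flagged = flag_high_cardinality(rows, max_levels=3)
print(flagged)
```

Variables flagged this way are candidates for pre-processing, such as grouping levels, before synthesis is attempted.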
Why is the package called 'synthpop'?
The package name synthpop stands for synthetic population. It doesn't have much to do with pop music. Kraftwerk was a great trendsetter, but its synthesizers and models are of a different kind. If you were hoping to hear some data, go here.
What is the meaning of the 'synthpop' logo?
It's a tree enclosed in a circle but not limited by it. The circle is open and the tree can grow. The tree, like the data sets generated by synthpop, is artificial. It also resembles the visual representation of a classification and regression tree, which is the default method used in synthpop. The circle is open because synthpop is freely available and we hope that synthetic data can open up access to confidential data. The tree's potential to grow represents possible further developments of synthpop.