Basic Data Preparation

For building, using and testing GDPR Metanodes, you will need data that actually exhibits the required conditions. In our case, we are interested in personal data, discriminatory fields and pseudo-discriminatory fields. To create such data, we use the classic adults.csv dataset. Each row represents an anonymous individual. We first add a unique identifier. The dataset already contains one "special category" field: race. We pick one of the race values and artificially create a high correlation between that value and a particular zip code, thereby introducing possible pseudo-discrimination. In addition, we randomly add another "special category" field for trade union membership. Alternatively, the Model Data Generation nodes can be used to create such data. Further examples of data creation are available on the KNIME Public Example Server.
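
For readers who want to see the logic of these steps outside of KNIME, here is a minimal sketch in plain Python. It is not the workflow itself, only an illustration of the three steps described above (unique identifier, correlated zip code, random trade union flag). The function name `prepare_rows`, the column names `id`, `zip_code` and `trade_union_member`, the zip code values, and the `correlation` and `union_rate` parameters are all assumptions made for this example; the workflow may use different names and rates.

```python
import random

def prepare_rows(rows, target_race, target_zip, other_zips,
                 correlation=0.9, union_rate=0.1, seed=42):
    """Return copies of the rows with an id, a zip code and a union flag.

    All column names and default rates are hypothetical choices for this
    sketch, not values taken from the KNIME workflow.
    """
    rng = random.Random(seed)  # fixed seed so the test data is reproducible
    prepared = []
    for i, row in enumerate(rows, start=1):
        new_row = dict(row)  # copy, so the input rows stay untouched
        new_row["id"] = i    # step 1: unique identifier

        # Step 2: tie one race value to one zip code with high probability.
        # Zip code then acts as a proxy for race (pseudo-discrimination).
        if new_row["race"] == target_race and rng.random() < correlation:
            new_row["zip_code"] = target_zip
        else:
            new_row["zip_code"] = rng.choice(other_zips)

        # Step 3: a second "special category" field, assigned at random.
        new_row["trade_union_member"] = rng.random() < union_rate
        prepared.append(new_row)
    return prepared

# Tiny stand-in for adults.csv; only the "race" column matters here.
sample = [
    {"age": 39, "race": "White"},
    {"age": 50, "race": "Black"},
    {"age": 38, "race": "Asian-Pac-Islander"},
]
result = prepare_rows(sample, target_race="Black", target_zip="12345",
                      other_zips=["11111", "22222", "33333"])
```

Keeping `target_zip` out of `other_zips` means the chosen zip code appears only for the selected race value, which makes the artificial correlation easy to detect when testing the metanodes.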


This is a companion discussion topic for the original entry at https://kni.me/w/f1MkLhcCeRmhXkxA