Generally speaking, you can segment your workflow into three sections:
Data input
Data processing
Training/prediction
You mention the use of multiple files, and that can complicate your initial step, but for pure threat prediction I would say you only need a handful of them, such as logon, file, email, http, and LDAP: basically anything involving user activity, since that is where you want to look for outliers. You can use the CSV Reader node to read each of these.
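If you ever want to prototype that input step outside KNIME (for example in a Python Script node), a minimal pandas sketch could look like the following; the file names are assumptions, so adjust them to match your actual CSVs:

```python
import pandas as pd

# File names are assumptions -- rename to match your actual CSV files.
user_activity_files = ["logon", "file", "email", "http", "LDAP"]

# Read each log into its own DataFrame, keyed by log name.
tables = {name: pd.read_csv(f"{name}.csv") for name in user_activity_files}

# Quick sanity check of what was loaded.
for name, df in tables.items():
    print(name, df.shape, list(df.columns)[:5])
```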
You would then want to use the Joiner node and join on ‘id’, as that seems to be the field linking the tables together. From there you can aggregate different features by grouping on id and date, for example (see the pandas sketch after this list):
count(logon) – number of logins per day
sum(email.size) – total size of the emails the user sends in a day
count(email) – number of emails the user sends in a day
etc.
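As a concrete illustration of that join-and-aggregate step, here is a rough pandas equivalent. The column names ('id', 'date', 'size') are taken from the examples above and may be named differently in your files:

```python
import pandas as pd

# 'id', 'date' and 'size' follow the examples above; adjust to your actual columns.
logon = pd.read_csv("logon.csv", parse_dates=["date"])
email = pd.read_csv("email.csv", parse_dates=["date"])

# Logins per user per day.
logins_per_day = (
    logon.groupby(["id", logon["date"].dt.date])
         .size()
         .rename("logon_count")
)

# Email count and total email size per user per day.
email_per_day = (
    email.groupby(["id", email["date"].dt.date])
         .agg(email_count=("size", "count"), email_size_sum=("size", "sum"))
)

# Combine the aggregates on (id, day) -- the same linking the Joiner node does on 'id'.
features = logins_per_day.to_frame().join(email_per_day, how="outer").fillna(0)
print(features.head())
```

Each row of `features` then describes one user on one day, which is exactly the shape you want for spotting unusual behaviour.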
These are the kinds of features you want to be looking at, and you can feed them into a couple of different models to test which performs well. If you are not sure which to pick, I would point you to the AutoML component.
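If you do have labels to train against, one quick way to compare a couple of candidates outside the AutoML component is plain cross-validation; the `features.csv` table and the `is_threat` label below are hypothetical placeholders for your aggregated table and whatever ground truth you have:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# 'features.csv' and 'is_threat' are hypothetical placeholders for the
# aggregated user/day table and whatever ground truth labels you have.
data = pd.read_csv("features.csv")
X = data.drop(columns=["is_threat"])
y = data["is_threat"]

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# 5-fold cross-validation; ROC AUC copes better with rare threat labels than accuracy.
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean ROC AUC = {scores.mean():.3f}")
```

If you have no labels at all, an unsupervised outlier detector such as scikit-learn's IsolationForest run on the same feature table is an alternative way to surface unusual user/day combinations.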