The question is what you want the GROUP BY to do. I see two paths:
- if you only want to remove duplicates, consider using DISTINCT (you can explore the topic of duplicates and SQL further here: School of duplicates - and how to deal with them)
- otherwise, you could extract the structure of each table, construct a SQL string from it and insert that via a flow variable, so each table gets its own GROUP BY that adapts to its data structure
You could have a ‘standard’ part, for example if every table holds a no_of_purchases column that you would always aggregate as:
SUM(no_of_purchases) as sum_no_of_purchases
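A minimal sketch of the second path in Python, assuming the column names and types have already been extracted from the table. The function name `build_group_by_sql`, the table name `purchases`, the column `customer_id` and the rule "numeric columns are summed, everything else is a grouping key" are all illustrative assumptions, not something KNIME provides. The resulting string is what you would hand over as a flow variable.

```python
def build_group_by_sql(table, columns):
    """Build a GROUP BY statement from a mapping of column name -> type.

    Assumption: numeric columns are aggregated with SUM, every other
    column is used as a grouping key. Identifiers are not quoted or
    escaped here, so only use this with trusted column names.
    """
    numeric_types = {"INTEGER", "REAL", "NUMERIC"}
    keys = [name for name, ctype in columns.items()
            if ctype.upper() not in numeric_types]
    sums = [name for name, ctype in columns.items()
            if ctype.upper() in numeric_types]
    if not keys or not sums:
        raise ValueError("need at least one key column and one numeric column")

    select_parts = keys + [f"SUM({name}) AS sum_{name}" for name in sums]
    return (f"SELECT {', '.join(select_parts)} "
            f"FROM {table} GROUP BY {', '.join(keys)}")


# Hypothetical table structure, as it might come out of a metadata query.
columns = {"customer_id": "TEXT", "no_of_purchases": "INTEGER"}
sql = build_group_by_sql("purchases", columns)
print(sql)
```

Because the statement is derived from the structure, a table with additional numeric columns would simply get additional SUM(...) entries without any change to the workflow.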
The question is whether this would work together with streaming, because a GROUP BY needs to see the whole table in order to do its thing (otherwise it might just aggregate the streamed portion, which would not be good).
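To illustrate the concern, here is a small Python sketch with made-up data. It does not reproduce how KNIME streaming works internally; it only shows that grouping each chunk on its own gives partial sums, which have to be merged in a second step before they match the result over the whole table.

```python
from collections import defaultdict


def sum_by_key(rows):
    """Sum the value column per key for the rows given."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


# Made-up rows: (customer, no_of_purchases)
rows = [("A", 1), ("B", 2), ("A", 3), ("B", 4)]
chunks = [rows[:2], rows[2:]]  # pretend the data arrives in two chunks

full = sum_by_key(rows)                              # grouping over everything
per_chunk = [sum_by_key(chunk) for chunk in chunks]  # grouping per chunk

# The partial results only become correct after a second merge step.
merged = defaultdict(int)
for partial in per_chunk:
    for key, value in partial.items():
        merged[key] += value
merged = dict(merged)

print(full)       # result over the whole table
print(per_chunk)  # partial results, one per chunk
print(merged)     # partial results combined again
```

Merging like this works for sums and counts, but not for aggregations such as a median or a distinct count, which need all rows of a group at once.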
Depending on what you want to do, it might make sense to do the transformations (grouping) in your original database and then just stream the result, since this topic was (originally) about transfer performance.
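As a rough sketch of that idea, the following Python example uses an in-memory SQLite database as a stand-in for the original database (the table `purchases`, its contents and the helper `stream` are assumptions for illustration). The database does the grouping, and only the already aggregated rows are fetched in batches.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the source database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (customer_id TEXT, no_of_purchases INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("A", 1), ("B", 2), ("A", 3), ("B", 4)],
)

# The grouping happens inside the database, over the whole table.
cursor = conn.execute(
    "SELECT customer_id, SUM(no_of_purchases) AS sum_no_of_purchases "
    "FROM purchases GROUP BY customer_id ORDER BY customer_id"
)


def stream(cursor, batch_size=1):
    """Yield the query result in batches instead of loading it all at once."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield batch


batches = list(stream(cursor))
conn.close()
print(batches)
```

The transfer then only carries one row per group, which is usually much less than the raw table.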