Filtering in/out every second, third ect. Row

 

Hi


I have a Tabel whit a row for every Item that was  bought. I need a to make it a  Tabel whit a row for every customer that bought something and a column for every Item he/she bought.(with empty cells in the last columns if the customer is not the one buying the most items)

I already have a Loop doing that, but that is highly inefficient. I thought that in filter out all coustomers that bought 2 items. fiter out every uneven row and every even row and join them agian using the coustemer ID.
And if there is a way to filter out every third row I can use the same method again. For all custemers that bought more then 8 Items I can probably use the Loop I already have.

Is there a way to fiter every nth row?  

thx
suppe

If I understand your problem correctly:

  • your table has a row for each item bought by a given customer ?
  • your desired output is to have a row per customer with each distinct item in its own column having e.g. the purchased quantity as value ?

If yes to both questions, then the Pivot node will save your day.

No, I want in the row for the customer that bought 7 Items to have the coustomer Id in the first column and in the following columns the item id for the Items he/she bought.  And the same thing for the coustomer that bought 5 items but whit missing values in the last tow columns.

I will ignore the quantity of bought items.

 

Why not use the "group by" node based on Customer ID and then both concatenate and count the number of items that were bought. You can subsequently use the cell splitter node to split out the individual items to separate columns.

 

 

@suppe
An example for the input and desired output would be helpful… When you need groupby, chances are you could also use (unpivot-apply)-pivot instead.

here is the exampel:

 

Input:

Customer Id        Item id
121                  |           687
121                  |           548
121                  |           102
407                  |           608
407                  |           609
809                  |         1017
809                  |           307
809                  |           704
809                  |           596
809                  |           481
809                  |           210


Putput
Customer Id    item id 1  item id 2  item id 3  item id 4  item id 5  item id 6
121               |       687       |    548      |  102        |    ?       |        ?        |           ?
407               |       608       |    609      |     ?          |    ?       |        ?        |           ?
809               |     1017       |    307      |   704       |  596    |       481    |        210

Here is a workflow suggestion with pivot.

Some explanations:

- group loop partitions the table by customer id and allows you to allocate a variable name (called "item name" in this workflow) to each item, the variable name depending on the row number within the group ;

- pivot then uses this variable name column to rotate the item rows per customer id into columns.

If you ask me, you only need 3 nodes, see attached image. 

Here is a shorter pivot solution (3 nodes + renaming):

  • RowID: append row ID column as new column (e.g. rowid);
  • Rank: rank attribute = rowid, group attribute = customer id, name of rank attribute = rank and tick Retain row order;
  • Pivot: group = customer id, pivot = rank, manual aggregation = first(item id);
  • (if needed) Rename (regex): Search string = ([0-9])+(\+)(item id) , Replacement: $3 $1

If row ID contains anything else than the default Row0 Row1, etc., then it is safer to use Math in the first step with the simple formula ROWINDEX.

2 Likes

I finely got around to check you solutions. nbrooijmans you are right it is quite simpel.

Thx Guys

Geo thanks, this is very helpful.