Check that item data is consistently categorised

Hi all

I have a data set comprising a couple of hundred thousand lines of picking transactions. Each of the items picked is identified by a distinct, individual SKU number and is categorised according to its characteristics - meat, fish, produce, homeware etc. There may be several hundred transactions picking the same SKU.

I have noticed that a few transactions have an anomalous category allocation - e.g. an item that is usually categorised as, say, meat comes up as fish. I’ve spotted a couple in the first couple of hundred lines, but obviously can’t go through 200k by eye.

Is there a way of comparing the SKU column with the category column and returning a count of the anomalies and/or the row numbers where the category does not match the mode (most frequent category) for that SKU?

Cheers
A

Hello @Andy_D ,
Welcome to KNIME Community.

To solve this problem you can use a combination of the GroupBy, Joiner, Rule Engine and Row Filter nodes.

I have uploaded a demo workflow for better understanding

QnA7.knwf (80.9 KB)
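
For readers who cannot open the workflow, the same idea can be sketched outside KNIME in Python with pandas. This is only an illustration under assumptions, not the contents of QnA7.knwf: the column names `SKU` and `Category`, the sample rows, and the mapping of each step to a node are all hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the column names "SKU" and "Category" are assumptions.
df = pd.DataFrame({
    "SKU": ["A1", "A1", "A1", "B2", "B2", "B2", "B2", "C3"],
    "Category": ["meat", "meat", "fish", "produce", "produce",
                 "produce", "homeware", "fish"],
})

# GroupBy-like step: the most frequent (mode) category for each SKU.
# If two categories tie, mode() returns them sorted and we take the first.
mode_per_sku = (
    df.groupby("SKU")["Category"]
      .agg(lambda s: s.mode().iloc[0])
      .rename("ModeCategory")
      .reset_index()
)

# Joiner-like step: attach the mode category to every transaction row.
joined = df.merge(mode_per_sku, on="SKU", how="left")

# Rule Engine-like step: flag rows whose category differs from the mode.
joined["Anomaly"] = joined["Category"] != joined["ModeCategory"]

# Row Filter-like step: keep only the flagged rows.
anomalies = joined[joined["Anomaly"]]
anomaly_count = len(anomalies)
anomaly_rows = anomalies.index.tolist()  # zero-based row positions

print(anomaly_count, anomaly_rows)
```

Note that a SKU with an even split between two categories has no meaningful mode, so such ties are worth reviewing by hand.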

Regards,
Yogesh

Don’t think there is anything that does that job out of the box.

The way that I’d start tackling this is by checking the “unique” SKU => Category pairs in your data. The simplest way to do that is to use a GroupBy node and select the SKU and Category columns as group columns. It is worthwhile adding any column as an aggregation column under the Manual Aggregation tab with the aggregation method Count, so that you get the number of transactions per pair.

Then add a Sorter node and sort by the count column, ascending. That way the SKU => Category pairs with the lowest count come out at the top, and those are the candidates for a wrong allocation…
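
As a rough illustration of that GroupBy plus Sorter idea outside KNIME, here is a minimal pandas sketch. The column names `SKU` and `Category` and the sample rows are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
df = pd.DataFrame({
    "SKU": ["A1", "A1", "A1", "B2", "B2", "B2", "B2", "C3"],
    "Category": ["meat", "meat", "fish", "produce", "produce",
                 "produce", "homeware", "fish"],
})

# Count how often each distinct SKU => Category pair occurs,
# then sort ascending so the rarest pairs come first.
pair_counts = (
    df.groupby(["SKU", "Category"])
      .size()
      .reset_index(name="Count")
      .sort_values("Count", ascending=True, kind="stable")
      .reset_index(drop=True)
)

print(pair_counts)
```

A low count only marks a candidate: in this sample the pair C3 => fish also has a count of 1, simply because that SKU was picked once, so it helps to compare each pair's count with the other pairs for the same SKU.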

If you can get your hands on “Master Data” that maps any SKU to a valid Category that’d be even better and give more options.
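
If such master data exists, the check becomes a plain lookup rather than a guess based on frequency. A minimal pandas sketch, assuming a hypothetical master table with the columns `SKU` and `ValidCategory` (both names and all rows are invented for the example):

```python
import pandas as pd

# Hypothetical transactions; the column names are assumptions.
transactions = pd.DataFrame({
    "SKU": ["A1", "A1", "A1", "B2", "B2", "B2", "B2", "C3"],
    "Category": ["meat", "meat", "fish", "produce", "produce",
                 "produce", "homeware", "fish"],
})

# Hypothetical master data: exactly one valid category per SKU.
master = pd.DataFrame({
    "SKU": ["A1", "B2", "C3"],
    "ValidCategory": ["meat", "produce", "fish"],
})

# Look up the valid category for each transaction.
# validate="many_to_one" raises an error if a SKU appears twice in master.
checked = transactions.merge(master, on="SKU", how="left",
                             validate="many_to_one")

# Rows whose recorded category disagrees with the master data.
# SKUs missing from master get NaN here and are therefore flagged too.
mismatches = checked[checked["Category"] != checked["ValidCategory"]]

print(mismatches)
```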

Thanks so much Yogesh!

I’ve downloaded your workflow, but can’t seem to open it - I can build the workflow from your diagram, did the demo contain suggested configuration?

Thank you Martin, I really appreciate that, I’ll give it a go!

Hi @Andy_D,

Are you using the latest version of KNIME Analytics Platform?
You just have to go to Space Explorer -> click on the three vertical dots -> Import workflow,
and then open it.

Yes, the workflow contains the same configuration as the image.

Ah, I can’t see that option on the three dots.

Looks like v5 of the platform

It’s here

Eh, THAT three dots :sweat_smile:

Thanks so much for your patience Yogesh - I’m relatively new to KNIME, and this is the first workflow that I’m using in the field

I’ve managed to import now - that’s pretty much working as I was hoping

Thanks so much again Yogesh!

Hi @Andy_D ,

Happy to help.
If you are satisfied with the solution, you can mark the reply as the solution; this helps people with the same or a similar question get directly to the answer.

Regards,
Yogesh
