ChatGPT/LLM guidance

Feynix · November 12, 2024, 7:13pm

Hello,
I currently have a workflow where I have a data set with a large number of operating system names in a specific column. The names are inconsistent and often do not follow any specific schema. Currently, my process is to do a GroupBy on the operating system column and then write the results to an Excel file.

I then copy and paste the Excel text into GPT and ask it to return a table with the OS names normalized, along with the end-of-support date for each OS.

Often it works great. But there are limitations. For larger data sets I need to break it into chunks so GPT doesn’t skip items.
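
A minimal Python sketch of that chunking step, assuming the grouped names are available as a plain list. The sample values and the `chunked` helper are illustrative, not taken from the attached file:

```python
from typing import Iterator


def chunked(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical raw values; real data would come from the OS column.
raw_names = ["Win 10 Pro", "windows 10 pro", "RHEL 7.9", "Ubuntu 20.04 LTS", "Win 10 Pro"]

# Deduplicate while keeping first-seen order, similar to the groupby step.
unique_names = list(dict.fromkeys(raw_names))

# Small batches make it easier to check that nothing was skipped.
batches = list(chunked(unique_names, 2))
```

The batch size of 2 is only for illustration; a workable size would depend on the model and prompt.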

The other challenge is the added labor of writing to a file, copying and pasting into GPT, copying and pasting the results back into the file, and then importing it back into KNIME as a mapping table.

Would someone be able to point me to how I could call GPT from within the workflow, feeding it my groupby of the operating systems and having it populate the empty Normalized name and End of Support columns? I have attached the mapping table that I create with the groupby, and have manually filled in a few examples.
OSmapping2.xlsx (9.6 KB)
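
For illustration, the last step (using the filled mapping table to populate the two columns) might look roughly like this in Python. The "Operating System" key, the `apply_mapping` helper, and all rows are assumptions made up for the sketch; only the "Normalized name" and "End of Support" column names come from the post:

```python
# Hypothetical mapping table; the dates are left empty as placeholders.
mapping = [
    {"Operating System": "Win 10 Pro", "Normalized name": "Windows 10 Pro", "End of Support": None},
    {"Operating System": "windows 10 pro", "Normalized name": "Windows 10 Pro", "End of Support": None},
]


def apply_mapping(records, mapping, key="Operating System"):
    """Return copies of `records` with the two mapping columns filled in."""
    lookup = {row[key]: row for row in mapping}
    enriched = []
    for record in records:
        # Names missing from the mapping get None in both columns.
        match = lookup.get(record[key], {})
        enriched.append({
            **record,
            "Normalized name": match.get("Normalized name"),
            "End of Support": match.get("End of Support"),
        })
    return enriched


# Hypothetical data rows; the second one has no entry in the mapping.
records = [
    {"host": "pc-01", "Operating System": "Win 10 Pro"},
    {"host": "pc-02", "Operating System": "Solaris ??"},
]
enriched = apply_mapping(records, mapping)
```

Rows whose name is not in the mapping keep empty values, which makes unmapped names easy to spot.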

mlauber71 · November 12, 2024, 8:22pm

@Feynix there was a somewhat similar task where you would feed raw strings in chunks to an LLM and ask for a structured return. Maybe you can adapt the approach. Key will be to engineer the prompt to tell the model what to do.

In this case it uses a local LLM (Llama 3.2), but it should also be possible to do this by connecting to ChatGPT via KNIME nodes.
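
A rough Python sketch of the prompt-plus-structured-return idea, not the linked workflow itself. The prompt wording, the JSON keys, and both helper names are assumptions, and the model call is left out; `fake_reply` stands in for whatever the model returns:

```python
import json


def build_prompt(os_names: list[str]) -> str:
    """Ask for one JSON object per input name so the reply can be parsed."""
    listing = "\n".join(f"- {name}" for name in os_names)
    return (
        "Normalize each operating system name below and give its end-of-support date.\n"
        "Return only a JSON array with one object per input, using the keys "
        '"raw", "normalized", and "end_of_support" (YYYY-MM-DD or null).\n'
        f"{listing}"
    )


def parse_reply(reply: str, os_names: list[str]) -> list[dict]:
    """Parse the JSON reply and fail loudly if the model skipped any input."""
    rows = json.loads(reply)
    returned = {row["raw"] for row in rows}
    missing = [name for name in os_names if name not in returned]
    if missing:
        raise ValueError(f"model skipped: {missing}")
    return rows


names = ["Win 10 Pro", "RHEL 7.9"]
prompt = build_prompt(names)

# Stand-in for a model reply; the dates are null placeholders, not real data.
fake_reply = json.dumps([
    {"raw": "Win 10 Pro", "normalized": "Windows 10 Pro", "end_of_support": None},
    {"raw": "RHEL 7.9", "normalized": "Red Hat Enterprise Linux 7.9", "end_of_support": None},
])
rows = parse_reply(fake_reply, names)
```

Checking the reply against the input list is one way to catch skipped items per chunk instead of noticing them later.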

Feynix · November 12, 2024, 8:47pm

Interesting. Thanks! I will take a look at that and play around with it.