I have an Excel data file with customer sales transactions.
I would like to build a workflow where I can upload the Excel file, or simply read it, and have it connected to OpenAI. Based on the uploaded file, my team can give instructions such as “tell me what are the top 10 best selling products” or “tell me what are the top 10 customers by revenue”. The workflow would run and give me the results; a text form of the analysis will do. How can I achieve this?
So far I have only been able to set up the nodes up to this stage: “Credentials Configuration” > “OpenAI Authenticator” > “OpenAI Chat Model Connector”.
I dragged in the LLM Prompter node, but I am not sure how to read the Excel data, how to include additional details in the prompt such as “tell me what are the top 10 customers by revenue”, and how to display the results. Can anyone provide some advice? I am really new.
In general, with regard to setting up a connection to an LLM, you seem to be on the right path. With regard to your use case, there are some things to consider:
LLMs thus far are not very good at handling tabular data. In addition, I have found that most companies are reluctant to send their “real data” to the OpenAI API, for data security and privacy reasons as well as due to concerns that their data may be used for training.
Approaches that I have seen use the LLM to create e.g. an SQL query based on the metadata of a database table, and then run the LLM output (the query) downstream and display the result. This way you don’t need to share the real data, only the table metadata (column names, data types etc.). Especially with the new Structured Output API that was released last week, this might be a solid way to go. I’m currently working on a video related to Structured Output and KNIME.
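To make the metadata-only idea concrete, here is a minimal sketch in plain Python rather than KNIME nodes. It assumes the Excel sheet has already been loaded into a SQLite table; the table name `sales`, its columns, and the sample rows are made up for illustration. `fake_llm` is a hypothetical stand-in that returns a canned query, and in a real setup it would be replaced by the actual model call. Only column names and types go into the prompt, and the query runs locally.

```python
import sqlite3

def describe_table(conn, table):
    """Return column names and declared types only; no row values are read."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, default, pk).
    return ", ".join(f"{row[1]} {row[2]}" for row in rows)

def build_prompt(schema, table, question):
    """Build a prompt that contains metadata and the question, not the data."""
    return (
        f"Table {table} has columns: {schema}.\n"
        f"Write one SQLite SELECT statement that answers: {question}\n"
        "Return only the SQL."
    )

def fake_llm(prompt):
    # Hypothetical stand-in for the real model call (assumption):
    # it ignores the prompt and returns a fixed query.
    return (
        "SELECT customer, SUM(quantity * unit_price) AS revenue "
        "FROM sales GROUP BY customer ORDER BY revenue DESC LIMIT 10"
    )

def run_query(conn, sql):
    """Run the generated SQL locally, refusing anything that is not a SELECT."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()

# Illustrative data; in practice this would come from the Excel file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, product TEXT, "
    "quantity INTEGER, unit_price REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Acme", "Widget", 10, 2.5),
        ("Acme", "Gadget", 1, 100.0),
        ("Bolt", "Widget", 4, 2.5),
    ],
)
conn.commit()

schema = describe_table(conn, "sales")
prompt = build_prompt(schema, "sales", "top 10 customers by revenue")
sql = fake_llm(prompt)
result = run_query(conn, sql)
for customer, revenue in result:
    print(customer, revenue)
```

The SELECT-only check is a deliberately crude guard; generated SQL should be reviewed or sandboxed before it runs against anything important.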
There are great resources provided by the KNIME team in the Gen AI section on the KNIME Hub. To get started and better understand how to build e.g. a chatbot, I highly recommend you download the examples and work through them. There will for sure be parts that you can reuse via copy and paste: KNIME for Generative AI – knime – KNIME Community Hub
@bluecar007 you can load CSVs into a Vector Store and then ask questions, but your answers would be limited to the number of documents you can retrieve, and there is no guarantee that you will catch all the information. If you want aggregations like a SUM, then it is nearly impossible for the LLM to get this right.
What can work is asking about specific customer numbers or single pieces of information.
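A toy illustration of why retrieval breaks aggregations but still works for single lookups. This is not a real vector store: `retrieve` is a hypothetical stand-in that simply returns at most `k` matching rows, and the rows and amounts are invented.

```python
# Each "document" is one sales row, as if every CSV line were embedded separately.
rows = [{"customer": "Acme", "amount": 100.0} for _ in range(8)]
rows += [{"customer": "Bolt", "amount": 50.0} for _ in range(2)]

def retrieve(rows, customer, k):
    """Return at most k rows for the customer, mimicking a top-k retriever."""
    matches = [r for r in rows if r["customer"] == customer]
    return matches[:k]

k = 3
seen = retrieve(rows, "Acme", k)

# The model only ever sees the retrieved rows, so a SUM over them is partial.
partial_total = sum(r["amount"] for r in seen)
true_total = sum(r["amount"] for r in rows if r["customer"] == "Acme")

print("rows retrieved:", len(seen))
print("total from retrieved rows:", partial_total)
print("total from all rows:", true_total)
```

With `k = 3` the model would see three of the eight Acme rows, so any total it reports is based on a fraction of the data, while a question about one specific row can still be answered from what was retrieved.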
As @MartinDDDD has mentioned, there are several approaches out there where you give a database structure to an LLM and then let it write SQL code and come back with an answer.
I have a project that would use concepts like CrewAI to do such a thing, but the configuration is challenging and it needs some power to realize a simple task, which will become an issue for LLMs in general.
For some tasks it might still be best to teach people some SQL.