Hello KNIME Community,
I hope you’re all doing well!
I’m currently working on an ambitious project and would greatly appreciate your help, guidance, or suggestions. The goal is to automate a solution capable of making phone calls and interacting with the person on the other end.
I know there are paid tools available that offer this functionality, but I’m curious to know if it’s possible to achieve something similar using open-source libraries, Python scripts, AI models, and, ideally, KNIME as the central workflow manager.
I understand that this is a highly complex task, especially since I am neither a developer nor an AI expert. However, I’m eager to learn and explore what’s feasible in this domain.
If anyone has experience with similar projects, ideas on how to start, or suggestions for tools and approaches, your input would be invaluable. Whether it’s about text-to-speech, speech recognition, or integrating KNIME with external systems, I’m open to all insights!
Thank you so much in advance for your time and support.
Hello @Stephane73,
I am not too familiar with this area, but I would say your best bet would be to use Python as you mention. I did a little bit of research on it, and there are some different tools out there to handle it.
Twilio offers an API for handling calls, which you can call from within KNIME (it says you can try it out for free).
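As a rough idea of what the Twilio route could look like from a Python Script node, here is a minimal sketch using Twilio's Python helper library (`pip install twilio`). The function names `build_twiml` and `place_call` are just made up for illustration, and the credentials and phone numbers would come from your own Twilio account. I have not run this against a live account, so treat it as a starting point only.

```python
from xml.sax.saxutils import escape


def build_twiml(message):
    """Wrap the text in TwiML so Twilio reads it aloud with its <Say> verb."""
    # escape() protects characters such as & and < that would break the XML.
    return f"<Response><Say>{escape(message)}</Say></Response>"


def place_call(account_sid, auth_token, to_number, from_number, message):
    """Start an outbound call and return the SID Twilio assigns to it.

    All arguments are placeholders: the SID, token and from_number come
    from your own Twilio account.
    """
    # Imported here so the TwiML helper above works without twilio installed.
    from twilio.rest import Client

    client = Client(account_sid, auth_token)
    call = client.calls.create(
        to=to_number,
        from_=from_number,
        twiml=build_twiml(message),
    )
    return call.sid
```

The message text could be the value you pass in from KNIME, so the workflow decides what is said and the script only places the call.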
An open-source library you can use in Python is pycall:
https://pycall.readthedocs.io/en/latest/usage.html
To start, I can see it having a Table Creator node that holds the initial text you want spoken (TTS), which you can pass via a flow variable into your Python script. For pycall, it looks like you will need some initial setup, as its documentation says:
- A working Asterisk server.
- Some sort of PSTN (public switched telephone network) connectivity. Regardless of what sort of PSTN connection you have (SIP / DAHDI / ZAPTEL / ISDN / etc.), as long as you can make calls, you’re fine.
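Once those two prerequisites are in place, a call with pycall could look roughly like the sketch below, which follows the example on the usage page linked above. The helper names `sip_channel` and `spool_call`, the trunk name and the sound file are placeholders I picked. I am also assuming the script runs on the Asterisk machine with permission to write to its spool directory, and that your pycall version works with your Python version, which I have not checked.

```python
def sip_channel(trunk, number):
    """Build an Asterisk SIP channel string, e.g. 'SIP/flowroute/89257651079'."""
    # Placeholder validation: digits only, so '+' prefixes are rejected here.
    if not number.isdigit():
        raise ValueError("number must contain digits only")
    return f"SIP/{trunk}/{number}"


def spool_call(trunk, number, sound="hello-world"):
    """Ask Asterisk to dial the number and play a sound file when answered."""
    # Imported here so sip_channel() can be used without pycall installed.
    from pycall import Application, Call, CallFile

    call = Call(sip_channel(trunk, number))
    action = Application("Playback", sound)  # Asterisk's Playback application
    CallFile(call, action).spool()  # writes a call file for Asterisk to pick up
```

Note that `Playback` plays an existing sound file, so spoken text would first have to be turned into audio by a separate text-to-speech step.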
Beyond the initial stage of initiating the call, you will need some way of capturing the responses, which could be done in Python. There is a Python library called SpeechRecognition (available on PyPI) that can handle this.
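For the response side, a minimal sketch with SpeechRecognition (`pip install SpeechRecognition`) might look like this. It assumes the caller's reply has already been saved as a WAV file, and it uses the library's `recognize_google` method, which sends the audio to Google's web speech service and so needs an internet connection. The function name `transcribe_wav` is made up for the example.

```python
def transcribe_wav(path, language="en-US"):
    """Return the recognized text from a WAV file, or None if unintelligible."""
    # Imported here so the module loads even where the library is missing.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # Raised when the speech could not be understood.
        return None
```

Network or service failures raise `sr.RequestError`, which this sketch deliberately lets through so the workflow can see that the step failed.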
Hopefully this can give you some ideas on getting started.
TL
Hi @thor_landstrom
Thank you for your response and for pointing me in the right direction for my project. Your suggestions were very helpful, and I appreciate your support.
Best regards,
Very interesting project I think.
Maybe check out this video on YouTube to see what sort of tools and packages can be used for this:
This guy implemented something fairly similar in Python.
I’m currently experimenting a lot with bringing AI agents into KNIME “natively” (i.e. the low-code way) and am documenting this in a YouTube video series:
The second video will be up later today, where I also share some insights into how this works “under the hood”.
I am also happy to share some of the challenges I have encountered so far, if you are interested :-).