If it is using GPU resources even when set to CPU, that is odd, but does it still work for you?
I have to admit I am not very familiar with the GPT4All nodes, as I personally use Ollama (this also lets me use the LLMs I run locally in other capacities - e.g. as endpoints for coding assistants).
Ollama in general is very easy to install these days - the only limitation I can see is that using embeddings models is not that straightforward…
So while I cannot really help you with that question, you may want to consider Ollama as an alternative - here's a KNIME blog that explains the process:
If embeddings are indeed an issue, @roberto_cadili has an example workflow that shows how to use Ollama embeddings:
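
In case it helps to see what is going on under the hood, here is a minimal sketch of calling Ollama's local embeddings REST endpoint directly - this is just an illustration of the request, not how the KNIME nodes implement it, and it assumes Ollama is running on its default port 11434 and that an embeddings-capable model (here nomic-embed-text) has already been pulled with `ollama pull nomic-embed-text`:

```python
# Minimal sketch: request an embedding vector from a locally running Ollama server.
# Assumptions: default port 11434, model "nomic-embed-text" already pulled.
import requests

OLLAMA_URL = "http://localhost:11434/api/embeddings"

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Return the embedding vector Ollama computes for the given text."""
    response = requests.post(OLLAMA_URL, json={"model": model, "prompt": text})
    response.raise_for_status()
    return response.json()["embedding"]

if __name__ == "__main__":
    vector = embed("KNIME workflows with local LLMs")
    print(f"Embedding length: {len(vector)}")
```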