HttpRetriever / "Illegal character in path at index 86": how can I solve this problem?

umutcankurt · December 27, 2018, 10:57am

Hi,
How can I solve the problem in this workflow?
Thanks

KNIME_path index 86 error.knwf (78.1 KB)

armingrudd · December 27, 2018, 2:53pm

Hi,

To be honest and clear, the workflow you provided doesn’t make sense to me.
What are you trying to do? Maybe I can help you much better if I know this.

Best,
Armin

umutcankurt · December 28, 2018, 7:05am

Hello Armin,
This is what I’m trying to do:

1- Collect the sub-page URL links from the main URL.
2- Divide the collected sub-page links into chunks.
3- Collect the information by visiting each sub-page URL link.

The problem is that I receive an error while visiting the sub-page URL links.
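
For readers who want to reason about these three steps outside KNIME, here is a minimal Python sketch of the first two: collecting links from a page and splitting them into batches. It is only an illustration under stated assumptions, not the workflow itself. The sample HTML, the example.com base URL, and the helper names `LinkCollector`, `collect_links`, and `chunks` are all hypothetical. The third step would request each URL in each batch and is left out so the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def collect_links(html, base_url):
    """Step 1: return absolute URLs for all links found in the HTML."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def chunks(items, size):
    """Step 2: split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Made-up sample page standing in for the main URL's HTML.
SAMPLE_HTML = (
    '<a href="/items/a.html">A</a>'
    '<a href="/items/b.html">B</a>'
    '<a href="/items/c.html">C</a>'
)

links = collect_links(SAMPLE_HTML, "https://example.com/")
batches = chunks(links, 2)
print(batches)
```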

armingrudd · December 29, 2018, 9:35am

OK,

There were a few problems with your workflow that needed fixing:

1- The GET Request node was unnecessary, so I removed it.
2- The HTMLParser node was reading the initial URL, not the result from HttpRetriever, so I changed that.
3- When you get the href attribute with the XPath node, the URLs may contain spaces, and you have to convert them to “%20” before you can use the URLs. So I added a String Manipulation node after the XPath node.

The rest of your workflow now works fine.
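
To make point 3 concrete outside KNIME, here is a small Python sketch of the same idea: a literal space is not allowed in a URL path, so it has to be written as “%20”. This is an illustration only, not the String Manipulation expression used in the workflow. The function names `encode_spaces` and `encode_url` and the example.com URL are hypothetical. The second function is a broader variant that percent-encodes other characters outside a chosen safe set too, which goes beyond what the workflow fix does.

```python
from urllib.parse import quote


def encode_spaces(url: str) -> str:
    """Replace each literal space with %20, mirroring point 3 above."""
    return url.replace(" ", "%20")


def encode_url(url: str) -> str:
    """Percent-encode characters outside the safe set.

    '%' is kept in the safe set so existing %XX escapes are not encoded twice.
    """
    return quote(url, safe=":/?#[]@!$&'()*+,;=%")


# Made-up URL with spaces in the path.
raw = "https://example.com/files/annual report 2018.html"

print(encode_spaces(raw))
print(encode_url(raw))
```

For a URL whose only problem is spaces, both functions give the same result.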

path error.knwf (75.7 KB)

Best,
Armin

mlauber71 · December 29, 2018, 9:50am

That is a great job. Now it might be possible to extract some more information. Though from a first look, @umutcankurt might still have some work to do.

umutcankurt · December 29, 2018, 12:55pm

Hi Armin,

Thanks a lot. Thanks to people like you, it is very nice to learn and solve problems. Thank you again for your hard work, @armingrudd and @mlauber71.

system · June 24, 2019, 10:41pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.