Splitting text from a date column

I’m having trouble removing everything but the date from the date column in this one. There are extraneous words in there, such as Опубликовал (“published”), the names of authors (Karimova, Администратор_сайта [site administrator]), or Наиболее важные (“the most important”). Every time I try, it puts the words in another column.

I also need to take the time out of every date. I have tried removing the terms I don’t want first and then splitting the column on the Cyrillic “в” (which means “in/at”), but I can’t seem to get it to work.

The data set is here: resultsto_0430.json (Google Drive), and the workflow is attached. Thanks in advance!
UnionofKRiZ.knwf (82.1 KB)

I’m not completely clear on what you want to do. Like your last post, do you want to isolate the times and dates? If so, I think the best approach is to split the date column on the Cyrillic в and then keep the dates and times with regex. The problem is that the cell splitter by default doesn’t accept Cyrillic characters. Maybe someone else can weigh in on a way around this.
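One possible way around the splitter limitation is to do the split in a short script rather than in the Cell Splitter node, for example in a Python Script node if the Python integration is installed. The sketch below is only an illustration of the split-then-regex idea: the helper name `split_date_time` is hypothetical, the sample cell is the one quoted elsewhere in this thread, and it assumes the в always appears as a standalone word between the date and the time.

```python
import re

# Assumed formats: dates look like DD.MM.YYYY and times like HH:MM.
DATE_RE = re.compile(r"\d{2}\.\d{2}\.\d{4}")
TIME_RE = re.compile(r"\d{1,2}:\d{2}")

def split_date_time(cell):
    """Split on the standalone Cyrillic 'в', then keep only the date and the time.

    Hypothetical helper, not part of KNIME or of the attached workflow.
    """
    # partition() splits on the first occurrence only; sep is "" if no 'в' was found.
    left, sep, right = cell.partition(" в ")
    date_match = DATE_RE.search(left)
    time_match = TIME_RE.search(right) if sep else None
    return (
        date_match.group() if date_match else None,
        time_match.group() if time_match else None,
    )

print(split_date_time("Опубликовал Kalimova 28.04.2024 в 15:39 Дела текущие"))
# ('28.04.2024', '15:39')
```

Because the regex does the actual filtering, the words before the date and after the time are dropped without having to list them one by one.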

I think I’ve come up with a solution. It’s pretty brute force. Let me know what you think.


Yes… and no. I’m sorry if I wasn’t clear, but the problem with this file is that there are a lot of extraneous words in the date column, which I just want to eliminate so that ONLY the date itself remains. For example, the second article has “Опубликовал Kalimova 28.04.2024 в 15:39 Дела текущие” and I need to eliminate everything but 28.04.2024. I could split it on the в as you suggest, but that wouldn’t take care of the first half of the cell. Alternatively, I could ask it to extract any sequence of the form NN.NN.NNNN and put it in a separate column, but I don’t know how one would go about doing that. I hope that explains what I am trying to do a little better.

(the first article stands out because it is listed as “today” [Сегодня] which means the day I scraped it, April 30th).

Best,

Rich
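The NN.NN.NNNN idea described above can be sketched with a regular expression. This is a minimal illustration in Python, not the workflow posted in this thread: the function name `extract_date` and the `SCRAPE_DAY` constant are hypothetical, the year 2024 for the scrape day is an assumption based on the sample date, and the “Сегодня” test cell is made up to show the fallback.

```python
import re
from datetime import date

# NN.NN.NNNN, i.e. two digits, two digits, four digits separated by dots.
DATE_RE = re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b")

# Assumption: the data was scraped on April 30th; the year is inferred, not stated.
SCRAPE_DAY = date(2024, 4, 30)

def extract_date(cell, scrape_day):
    """Return the first DD.MM.YYYY date in the cell, or the scrape day for 'Сегодня'.

    Hypothetical helper; returns None when no date can be recovered.
    """
    match = DATE_RE.search(cell)
    if match:
        return match.group()
    if "сегодня" in cell.lower():  # "today" means the day the page was scraped
        return scrape_day.strftime("%d.%m.%Y")
    return None

print(extract_date("Опубликовал Kalimova 28.04.2024 в 15:39 Дела текущие", SCRAPE_DAY))
# 28.04.2024
print(extract_date("Сегодня в 10:15", SCRAPE_DAY))  # made-up cell for illustration
# 30.04.2024
```

This keeps only the date and ignores the author names, the time, and the category words, so nothing has to be removed term by term first.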

I’m confused. Have you run my workflow? I think it does what you ask. If you don’t know how to download from the Hub let me know and I’ll show you how.

No, I have not run your workflow. I have not downloaded it from the Hub, and I would appreciate it if you could show me how to do that.

What version of KAP (KNIME Analytics Platform) are you running?

It is version 5; specifically, it says 5.2.3.

Here is a series of screenshots which should get you there. What you’re doing is downloading a knwf file, which is a zipped form of the workflow. The “import workflow” step below turns it into an executable workflow. It will create the workflow in the same folder you saved the knwf file in, so that folder must be somewhere in your KNIME workspace.

