Complex query

I am fairly new to KNIME and working with a adataset that I need to filter records from. I want to retrieve results from the whole data set by putting conditions on two columns. One column has number value and the other, a string. The query should be to select, lets say X from String column and collect all instances of the number value against X. The number value may appear again in the column, and may not have X in the adjacent cell, I still want to select all instances of it. I hope I am making sense. Please advice.

Hi @MariaAslam and welcome to the KNIME community forum,

Would you please explain your case by providing some examples? An input table and your desired output.

:blush:

Hi, thanks. My DB is about an ID which is not unique and may occur multiple times. I want to gather data, where against the ID, the type instance is X. But, not just that particular row. Let me explain it this way. The ID is actually identifying a series of events. X may occur during that ID is active. Where ever the X event occurs, I want to select the whole series of events to calculate how and when X occured against that ID, or what preceded X and what came after. The ID will be unique for the whole series of event. I am now trying to experiment with JAVA to find a solution.

Filter the rows with the Row Filter node on X value. Then use the Reference Row Filter node on the ID column.

Here is an example:

ref_row_filter_example.knwf (24.2 KB)

:blush:

Hi, and thanks so much. It worked :smiley:

1 Like

Hi, your recommendations were quite helpful the last time. I have another question and I’d really appreciate advice on. I have another dataset that has eventID multiple times with several date records of events taking place in the yymmdd format. Now, against same events, I want to calculate duration of that event and append to a new column. Is there a way of doing so in Knime? Thanks…

If your field recognized by KNIME as Date you can use


For a case look at nodes containing word duration like

If your date field recognized as a string then convert it using

Hi, thanks for these. I have started with using String to date &time. My data starts in 1980s, however, the output date is 2080. Is there a node to tell the specific date? I am trying time shift and date modifiers. But they dont seem to be working.

My dates are in the yymmdd format, so it is in the format 950227. When updated, this date changes to 20950227, instead of 19950227.

Use String To Date/Time (Legacy). It works correctly.

Hi, thanks… it doesnt seem to be able to parse date data in the saved format.

Hi,

Please take a look at this workflow:

event_date.knwf (32.8 KB)

I added “19” to years between 80 and 99 (since you said your “My data starts in 1980s”) and “20” for the rest, then I converted these string values to date and took min and max to calculate the difference.

:blush:

Did you change format to yyMMdd for legacy String to Date/Time?

Wonderful, this seems to have worked wonderfully, thanks. :smiley:

1 Like

I did, it just wouldnt change it to the format I was looking for. Its same with time too. My time input is in the HHmm format, but the built in nodes cater to the hhmmss format, and that makes it harder to use them as is.

Hi again, another question Im afraid :unamused:!!
My data size is pretty big and I am wanting to leash it a little. I talked about the X event occuring in a series of same event ID. These events are timed. Can I use a node that returns an hour before and an hour after X event takes place? Please advice.

1 Like

Hi @MariaAslam,

I really like using dictionary nodes and here is an example to use the Rule-based Row Filter (Dictionary) node to filter records within 1 hour before and after each X (error in my example):

rule_filter_dictionary.knwf (42.0 KB)

The expression in the String Manipulation node may look confusing at first. That’s the expression to build rules based on the values in each row. Feel free to ask further questions.

:blush:

1 Like

works great for my query. Thanks

1 Like

On the same note,I want to perform some data analytics exercises and eventually visualise he data. My data is in two seperate files. I want to keep them seperate. Does KNIME support analytics and visualizations when perofrmed by retreiving data from two different datasets?

What visualization do you want to perform?

:blush:

My dataset has two branches. The first branch includes an event A that may have occured over a number of days, and each new observation is recorded by a number of observers with the time of observation. Now this has an event ID and two sets of classifications (with sub classes) that occur whenever an observation for event A is made.
I want to visualise the co relation between the two classifications to see whether one impacts ocuurence of the other.
My second branch is of an event of significance that might occur during event A. However, the event of significancee have classes and most significant event is X, which is recorded by the date it occured, along with start date, end date and date it reached its maximum threshold. For this I want to see where event X is likely to occur during event A, and whether there is a relation between two classifications occuring during event A or not, towards occurence of event X.
I hope I am amking sense.