Hi everyone,
I have some experience in KNIME however I one can always learn more by asking.
One problem I consider as basic as general but I never found a simple solution for in KNIME is the following one.
Imagine a hotel with guests staying and leaving the rooms. You have a table with guest names, associated with a time and date for arrival and a time and date for leaving.
How do I calculate the number of rooms that is occupied at any given point of time?
And how do I control granularity of this?
Examples:
I’d like to plot how many people are living in the Hotel for every given Day in a year. How would I calculate and plot this best?
How would I vary the granularity of the above mentioned calculation of room occupation? E.g. do the same calculation/plot of Q1-Q4 of a year or for a really spooky hotel owner: calculate occupation on a one second timely resolution…
My guess is that this problem is a rather general case, in the end it refers to any capacity that is used within (possibly) overlapping time frames.
To ease the discussion I generated a small random data set and attached it. The data is a little artificial since it is a fully random data set, however I guess the idea is clear :).
It's simple for one specific date. If you want to get graphs over time I think you then would need to loop the solution for 1 date. Bit more complex but bigger question is efficency / speed especially with more than 50 rows.
Find attached a solution for 1 specific date with a time resolution of 1 day. took about 10 minutes to do this. The hard part with looping and charting I leave up to you,
EDIT:
This obviously assumes 1 guest = 1 room
And granularity can be controlled in the time difference node. I assume this means the hotel thing is not the real problem because there daily resolution is the normal thing to do.
thank you for the example data this is always really helpful :-)
I attached an examplary workflow.
Task 1 is actually easier. Here I generate a +/- 1 table based on if someone is arriving or departing. And finally calculated a cumulative sum of this column. This again can be ploted in a line chart.
Task 2 is more complex. Here we first need to get the mean occupation per day. And therefore we need a list of all days in the time range. Afterwards I join this to the first table, resolve missing days occupation with a missing value node and previous. And finally you can calculate the averages with a groupby node.