Add Year to Monthly data

Hello friends,

I have a column with prices for future months without the year. Here is how the original data looks like -

image

I need the current year added for everything till Dec and next year added to Jan and beyond. Right now, I am replacing (#1) with “2024” and then adding “2023” to the rest. But, I have to manually change for the rest of the 2024 months. Any help to make this better?

Hi @N904808 , I must confess that when I looked at this problem originally, it was one of those times when I thought that the best thing to do was “start from somewhere else” :rofl:

Now your data is problematic in that it is inconsistent with mix of months where months in subsequent years only contain (#1) if the month name has already been used, and additionally contains two entries for the same “month name”, which use different abbreviations “Sep” and “Sept”.

So, I thought that the best thing to do initially is to “normalise” the names. Since the (#1) and the additional “t” in September don’t actually provide any useful information, let’s start by turning all months into their 3 letter abbreviations. We can do this using a String Manipulation:

substr($Month$,0,3)

This then gives us the following:

image

Now we have something consistent to work with. Ok, let’s see if we can derive a month number for each month:

We can do this using String Manipulation too. If we find the Month in the following String:

"JanFebMarAprMayJunJulAugSepOctNovDec"

using the indexOf function, which gives us a value in the range 0 to 33, then add 3 to that thereby giving a value in the range 3 to 36, we can then divide that result by 3 to give us a value in the required range of 1 to 12:

toInt((indexOf("JanFebMarAprMayJunJulAugSepOctNovDec",$Month$)+3)/3)

image

At this point, I was wondering where to go from here, so I wondered if we could look at the current
row index +1 (so we have “proper” row numbers beginning at 1) and see if that could tell us anything:

image
Ah… look at what happens when we get past December. The month number for Jan is 1 again, and the PositionIndex is 8. So what? Well, what if the current year’s months always have a monthnumber that is equal to or greater than the PositionIndex, but next year’s months don’t?

Let’s test that hypothesis. What if our data began at January instead of June?

image
So far so good. At the new year, the month number for Jan in Row12 is 1 when the position index is 13

What if our current year starts at December?

image

The theory still holds. As the Jan in the second row has a month number of 1 which is lower than the position index.

So, we can now determine the year based on whether the month number is lower than the position index.

We could use a rule engine to determine the year based on a simple rule:
image

We could then use a String Manipulation to join the Month and Year columns
image

and after some tidy up work to remove unwanted columns, get the required result:


image

Alternatively, all of that logic can be done in a single String Manipulation call:


image

and then
image

becomes
image

Month names and adding year information.knwf (195.4 KB)

6 Likes

It worked flawlessly. Amazing logic. Thanks for your help.

3 Likes

Hi @N904808 you’re welcome. I’m glad it helped and thank you for marking it as the solution.

1 Like

I have done this in the past based off of the current date, then split the date into multiple columns via sting to date time, then column expression. That really only works for current / forecast data though.

Nice 1 node solution.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.