fill down dates and fields based on previous value or change in value

ebarr · November 3, 2023, 5:30pm

i included the table at the bottom - but i have a list of values where column A have dates joined together with dates in column B (two different tables). I want to show reports in two ways - tied to the first date (grouped by) and by the second date (see the ‘waterfall’ of activity over time after original date).

This is for media campaigns where I run a campaign on DAY 1, and receive leads on DAY1, 2, 3…n.

The yellow fields are ‘filled down’ based on the previous row when i have a date populated in date 1 AND new campaign in column c. Each time the campaign changes AND there’s a date in date 1, i want to ‘fill down’ or repeat the date from above, then start over at next campaign.

IF there’s a change to the campaign, but there isn’t a date in date 1, then I want to return date 2. This means there’s something wrong with the data, or there was a campaign that ran outside of the date range that i don’t have the original date for the media, therefore, i want to return date 2 and treat it like a pure ‘waterfall’ - these dates are in red. you’ll note that the yellow are fill down from previous row based on change in campaign, and red is return corresponding date 2 every time there’s a change in campaign without a populated date 1.

I’ve done this in alteryx fairly simply with a multi row formula tool where i can look at row -1 and set rules based on that (row -1 campaign = row 0 campaign AND DATE in row -1 is populated and row 0 date is null, then return date from row -1 else return date 2 from row 0.

Date	Date 2	column g
2022-09-08	2022-09-08	name a
2022-09-09	2022-09-09	name a
2022-09-10	2022-09-10	name a
2022-09-11	2022-09-11	name a
2022-09-12	2022-09-12	name a
2022-09-13	2022-09-13	name a
2022-09-13	2022-09-14	name a
2022-09-13	2022-09-15	name a
2022-09-13	2022-09-16	name a
2022-09-13	2022-09-17	name a
2022-09-13	2022-09-18	name a
2022-09-13	2022-09-19	name a
2022-09-20	2022-09-20	name a
2022-09-20	2022-09-21	name a
2022-09-20	2022-09-22	name a
2022-09-20	2022-09-23	name a
2022-09-20	2022-09-24	name a
2022-09-20	2022-09-25	name a
2022-09-20	2022-09-26	name a
2022-09-27	2022-09-27	name a
2022-09-27	2022-09-28	name a
2022-09-27	2022-09-29	name a
2022-09-27	2022-09-30	name a
2022-09-27	2022-10-01	name a
2022-09-27	2022-10-02	name a
2022-09-27	2022-10-03	name a
2022-10-04	2022-10-04	name a
2022-10-04	2022-10-05	name a
2022-10-04	2022-10-06	name a
2022-10-04	2022-10-07	name a
2022-10-04	2022-10-08	name a
2022-10-04	2022-10-09	name a
2022-10-04	2022-10-10	name a
2022-10-11	2022-10-11	name a
2022-10-11	2022-10-12	name a
2022-10-11	2022-10-13	name a
2022-10-11	2022-10-14	name a
2022-10-11	2022-10-15	name a
2022-10-11	2022-10-16	name a
2022-10-11	2022-10-17	name a
2022-05-01	2022-05-01	name b
2022-05-06	2022-05-06	name b
2022-08-09	2022-08-09	name b
2022-08-10	2022-08-10	name b
2022-09-18	2022-09-18	name b
2021-01-11	2021-01-11	name c
2021-01-12	2021-01-12	name c
2021-01-23	2021-01-23	name c
2021-01-31	2021-01-31	name c
2021-04-06	2021-04-06	name c
2021-10-25	2021-10-25	name c
2022-07-26	2022-07-26	name c
2022-10-13	2022-10-13	name c
2022-04-27	2022-04-27	name d
2020-11-28	2020-11-28	name e
2021-04-22	2021-04-22	name e
2021-11-03	2021-11-03	name e
2021-03-03	2021-03-03	name f
2023-03-11	2023-03-11	name f
2021-06-14	2021-06-14	name g
2021-08-11	2021-08-11	name g
2023-03-25	2023-03-25	name g

hmfa · November 3, 2023, 10:43pm

Hi @ebarr.

Take a look at this workflow.

Maybe it can help you.
Br

ebarr · November 15, 2023, 5:02am

Thank you for the multi row - this looks like it works for a ‘fixed’ number of blank rows. When I have a value then a blank, then a value, the blank works as intended. When there are multiple blank values that need to iterate based on the new value above, I would expect that the value will ‘fill down’ as long as a value populates the row above.

I wrote this code for the column expression:
// Check if this is the first row, as there is no previous row to compare with
if (column(“Counter”) === 0) {
column(“Date Sourced”); // For the first row, use the ‘date sourced’ from the current row
} else {
// Retrieve current and previous rows’ data
var currentCampaign = column(“Campaign name”);
var previousCampaign = column(“Campaign name”, -1);
var currentDateSourced = column(“Date Sourced”);
var previousDateSourced = column(“Date Sourced”, -1);
var currentDateUnsourced = column(“Date Unsourced”);

// Logic for date comparison
if (currentCampaign === previousCampaign && previousDateSourced !== null) {
    previousDateSourced;
} else if (currentCampaign !== previousCampaign && previousDateSourced === null) {
    currentDateUnsourced;
} else if (currentCampaign !== previousCampaign && previousDateSourced !== null) {
    currentDateUnsourced;
} else if (currentDateSourced !== null) {
    currentDateSourced;
} else {
    // Default case, if needed
    null;
}

}

I created an “index” called counter using the count generator to check for the very first row to be blank.

I copied the ‘date sourced’ into new column so I could compare to see if values were overwritten. We can ignore that column:

IN the image above, we can see that date source is 9/5/23, then repeats again in the next line (as expected), but the next two date sources are blank. The code says that if the current date sourced is blank, and the previous campaign is the same, then return the previous date sourced if populated. So, given that the second 9/5/23 was blank, then populated when the rule iterated, I would want the next value to populate given the formerly blank, now populated with 9/5/23 to ‘fill down’ to the next level. I opened the window to ‘3’ in the advanced tab, I thought that would help? I tried integrating a loop, but that went over my head.

Also, 9/12/23 is duplicated, so the rule worked with that row since there was only one blank date sourced after 9/12, then 9/19 was populated with two blank - one filled, the next remained blank, and 9/26 replicated (as expected). Make sense?

takbb · November 15, 2023, 7:38am

HI @ebarr , you are right that the Column Expressions node is deficient in this respect compared with the Alteryx equivalent, in that whilst it can act on the input source from previous rows it does not “remember” changes that it has made to the outputs for previous rows. This means that is limited in what it can repeatedly “fill down” or in its ability to perform cumulative computations, as you have discovered.

For repeatedly filling down, the most suitable node generally is the Missing Value node, which can be set to fill down the previous values. This works cumulatively on any number of missing rows.

The problem that you of course face is that you only want to fill down where your campaign has not changed, and in some cases wish to “fill across” rather than “fill down”, if no value to fill down is available.

So I see two options (there may be others!). The first is the “simple” (no code option) which I would suggest for you to try.

This would be that you use a Group Loop, which iterates the table in groupings of “Campaign Name”. Within the loop perform a Missing Value node to fill down the dates based on previous value

After the loop perform either a Rule Engine to set Date Sourced to Date Unsourced, where Date Sourced is missing, or alternatively a Column Merger node can perform this operation.

i.e. it would look like this:

The alternative to this would be a java snippet which can be written to do what you require, but see if the above can work for you. I could provide a java snippet, which would be a very simple piece of java code similar to the Column Expressions code above, but you would need to upload a file containing some sample data.

See how you get on with the loop idea first though, as this is a general solution to the “fill-down within groups” problem.

ebarr · November 15, 2023, 4:25pm

thank you! I would appreciate the java snippet.
sample data:

Date Sourced	Date Unsourced	Campaign name
	9/1/22	name 1
	9/2/22	name 1
	7/14/22	name 2
9/8/22	9/8/22	name 3
	9/9/22	name 3
	9/10/22	name 3
	9/11/22	name 3
	9/12/22	name 3
9/13/22	9/13/22	name 3
	9/14/22	name 3
	9/15/22	name 3
	9/16/22	name 3
	9/17/22	name 3
	9/18/22	name 3
	9/19/22	name 3
9/20/22	9/20/22	name 3
	9/21/22	name 3
	9/22/22	name 3
	9/23/22	name 3
	9/24/22	name 3
	9/25/22	name 3
	9/26/22	name 3
9/27/22	9/27/22	name 3
	9/28/22	name 3
9/29/22	9/29/22	name 3
	9/30/22	name 3
	10/1/22	name 3
5/1/22	5/1/22	name 4
5/6/22	5/6/22	name 4
8/9/22	8/9/22	name 4
	8/10/22	name 4
	9/18/22	name 4
1/11/21	1/11/21	name 5
	1/12/21	name 5
1/23/21	1/23/21	name 5
	1/31/21	name 5
	4/6/21	name 5
	10/25/21	name 5
	7/26/22	name 5
10/13/22	10/13/22	name 5
4/27/22	4/27/22	name 6
11/28/20	11/28/20	name 7
4/22/21	4/22/21	name 7
	11/3/21	name 7
3/3/21	3/3/21	name 8
	3/11/23	name 8
6/14/21	6/14/21	name 9
	8/11/21	name 9
3/25/23	3/25/23	name 9

the first part is to ensure that the list is properly sorted, so this should be sorted by campaign name, date unsourced, date sourced. Now that it is sorted, the rule should see if there is a date above. IN this case, there is not a date sourced in the first position, so the date unsourced should be returned. This should occur for the first three fields because we can assume that these records originated from a ‘date sourced’ outside of the filter set, so we can’t put these responses back to the original source.

The fourth field has a date (9/9/22). The fifth record is blank. Since the row above is the same campaign (campaign name 3) and the previous row date sourced is populated, then the row should be returned with that value (9/8/22) as well as the next blank fields. The date changes to 9/13 so since it is populated, this fields remain 9/13/22, then the following record is blank, with the previous campaign name is the same. IF the previous campaign is different and the CURRENT date sourced is blank, then the date unsourced should be returned, not the previous date.

thank you.

takbb · November 15, 2023, 4:32pm

Hi @ebarr, for some reason the formatting seems out on your sample data so copy/paste causes the columns to merge. Would you be able to upload it as an xlsx, or similar. thanks

[Edit: don’t worry - I’ve got it now]

takbb · November 15, 2023, 5:18pm

@ebarr , hopefully I’ve transposed the sample data correctly into the attached.

Java snippet - fill down with rules 20231115.knwf (10.7 KB)

The java snippet processes one row at a time from top to bottom of the table.

The variables defined in the “your custom variables” section are remembered from one row to the next, so we can use it to keep a copy of the output from the previous row.

Assuming no errors in my coding, or understanding, the if statements should represent the rules you gave. At the very end, you can see that it stores the current output “Date Sourced” value and the current campaign name.

At the beginning, it checks if the campaign name for the current row differs from the campaign name stored away. Note that java only has an “equals” method, but it is made “not equals” by the presence of the exclamation ! at the beginning of the condition, so not immediately obvious that this is a “not equals” test

So if the campaign name changes, it resets the “last date” to null.

If the DateSourced is missing (null), it checks if last date is null also. If it is, it uses the DateUnsourced, otherwise it uses the last date.

and… for the second time today , I’ll reference the following post which gives further insights into this feature of java snippet

ebarr · December 1, 2023, 5:57pm

thank you! grateful for all of the help i receive in this forum!

ebarr · December 7, 2023, 11:23pm

this is amazing and works perfectly. I greatly appreciate it!!

takbb · December 8, 2023, 12:21am

That’s great to hear @ebarr. You’re welcome and thanks for marking the solution

system · December 15, 2023, 12:22am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.