I have a dataframe with hundreds of columns. Just for example purposes, here is a toy dataframe:

TPT_A_2 | TPT_B_2 | TPT_C_2 | TPT_A_4 | TPT_B_4 | TPT_C_4 | TPT_A_6 | TPT_B_6 | TPT_C_6 |
100     | 100     | 100     | 200     | 200     | 200     | 400     | 400     | 400     |

I want to compute the mean of the variables whose names share the same initial substring (TPT_A, TPT_B…) and end with 2 or 4. So I would get something like:

TPT_A_mean | TPT_B_mean | TPT_C_mean | TPT_A_6 | TPT_B_6 | TPT_C_6 |
150        | 150        | 150        | 400     | 400     | 400     |

In R, you can create this data like this:

    row1 <- c("TPT_A_2", "TPT_B_2", "TPT_C_2", "TPT_A_4", "TPT_B_4", "TPT_C_4", "TPT_A_6", "TPT_B_6", "TPT_C_6")
    row2 <- c(100, 100, 100, 200, 200, 200, 400, 400, 400)
    data <- as.data.frame(rbind(row1, row2))
    colnames(data) <- as.character(data[1,])
    data <- data[-1,]
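To make the target concrete, here is a minimal sketch of the computation in base R. It is only one possible approach and rests on a few assumptions: the columns are numeric (the `rbind` construction above yields character columns, so the toy data is rebuilt directly here using the values from the table), every column name follows the pattern prefix, underscore, number, and the names `avg_cols`, `prefixes`, `means` and `result` are made up for illustration.

```r
# Toy data from the table in the question, built as numeric columns
# (assumption: the real data is numeric too)
data <- data.frame(
  TPT_A_2 = 100, TPT_B_2 = 100, TPT_C_2 = 100,
  TPT_A_4 = 200, TPT_B_4 = 200, TPT_C_4 = 200,
  TPT_A_6 = 400, TPT_B_6 = 400, TPT_C_6 = 400
)

# Columns whose names end in _2 or _4 are the ones to average
avg_cols <- grep("_(2|4)$", names(data), value = TRUE)

# Group prefix for each of those columns, e.g. "TPT_A_2" -> "TPT_A"
prefixes <- sub("_[0-9]+$", "", avg_cols)

# For each prefix, take the row-wise mean over its _2 and _4 columns
means <- lapply(split(avg_cols, prefixes), function(cols) rowMeans(data[cols]))
means <- as.data.frame(means)
names(means) <- paste0(names(means), "_mean")

# Keep the untouched columns (here the _6 ones) next to the new means
result <- cbind(means, data[setdiff(names(data), avg_cols)])
print(result)
```

With the toy data, each `_mean` column comes out as 150 and the `_6` columns stay at 400. Because the grouping is driven by regular expressions on the column names, the same lines should carry over to hundreds of columns as long as the naming pattern holds.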

Compute for column values based on conditional substrings in the column names

Thyme January 6, 2022, 12:00pm 5

For future readers who want to know what my “Column Group Loop Start” looks like, I actually had to implement one in another question:
