Reformatting tabular data, is it unpivoting? Pivoting, ungrouping, some sort of loop processing?

Hi there, hope everyone is safe and well. I am trying to reformat a data set from JSON Path and I absolutely cannot figure out how. In Excel, since each data type (or data point) has a fixed number of set (each gets 10 placeholders whether data exists or not), I have a formula and references hardcoded to transform the data set. I’ve been trying to move that part of the process to KNIME to ETL into database, but I am not able to figure out how and it’s still on Google Sheets with a bunch of fixed references and formula sets. Google Sheets is slow so wanting to process it in KNIME.

Here is the sample data set.

Original

Want to change it to this format

Here is the Excel with sample data set.
sample data.xlsx (19.7 KB)

This API returns diagnostic data about single URL and after parsing the JSON, I have single row representing a URL with all the metrics as cols (600+). About 300 those come in as set of 10 whether the data exists or not. I want to reformat those that come in as set of 10, moving cols into rows with 1 URL row now becoming 10 URL rows. Hope the sample data set makes sense. I’m totally willing to spend the time to learn and work on it, but at this point after some chunk of hours/days I am not clear on what nodes are appropriate for this task…piviot/unpivot, group/ungroup, some sort of loop processing… If someone could kindly point me to the right direction or share some node ideas, I’d like to follow your guidance and try out creating a workflow.

I appreciate in advance for your time and kind support. Hope you have a great day. Thank you!!

Just on a whim, I used the data you provided and created collection columns for JSpath, JSpathSavings, CSSpath and CSSpathSavings, and then did an ungroup.

The fact that the columns in the Excel file that you provided that you’d like to group do not have the same name is an issue. This approach worked for this small dataset, but if you have hundreds of metrics, then this isn’t really practical.

What does the JSON file look like inside? It would be easier if the JSON to Table process resulted in collection columns.

4 Likes

Hi @elsamuel
Thank you so much for your help and reply. I appreciate you educating me with a new trick. I have a feeling you are probably correct about working on the original json being easier (I couldn’t figure out all the options to make it work). Attached is the json from the API and my JSON Path node. I have other processes that I join later but this is the portion I’m trying to rework. Thank you very much and I’ll take a look at your workflow and Create Collection Column nodes and how you are making it all work, and your point about the col name being possible issue.

Thank you again!!

KNIME_project test 08022021.knwf (15.7 KB)
json 08022021.json (974.1 KB)

That JSON file is quite daunting. I’m not 100% sure that this will help, but I envisioned the JSON Path node being configured like this:

For example, instead of using separate queries
$['lighthouseResult']['audits']['unused-javascript']['details']['items'][0]['url']
$['lighthouseResult']['audits']['unused-javascript']['details']['items'][1]['url']
$['lighthouseResult']['audits']['unused-javascript']['details']['items'][2]['url']
$['lighthouseResult']['audits']['unused-javascript']['details']['items'][3]['url']
etc

try using a single collection query:
$['lighthouseResult']['audits']['unused-javascript']['details']['items'][*]['url']

Make sure that the “List” checkbox is ticked.

You’ll get a single row with the query result as a List Column:

and ungrouping will give you rows for each value:

If you use collection queries wherever possible for your fields of interest, you can save yourself some effort and then the ungrouping will give you columns that seem to be what you’re asking for.

Let me know if I’m misunderstanding what you’re trying to do.

4 Likes

Hello @elsamuel ,
Thank you again for your guidance and support. I’m going to replicate your suggestion and compare the results against the other Excel based process I have. I see the use of * wildcard (again appreciate for a new trick) and the use of List. That’s certainly much cleaner and more efficient way to accomplishing this. I’ll work on this and QA it. Thank you again so much for taking the time to help me and the community. Much appreciated. Have a wonderful day, @elsamuel.

1 Like

Hello @elsamuel
I QA’ed the data and indeed this is exactly what I need. Thank you so much for your clear and efficient guidance. And I’m also going to update some of legacy stuff with List + Ungroup. There are several nested JSON arrays in this json file and I’m going to see if I can borrow your technique to be able to parse this info (still don’t know if it’s applicable or nested requires a different technique). I wasn’t able to parse this before (the nested object was too complex for me) but this gives me hope. :slight_smile: Thank you and have a wonderful day, @elsamuel .

“third-party-summary” : {
“id” : “third-party-summary”,
“title” : “Reduce the impact of third-party code”,
“description” : “Third-party code can significantly impact load performance. Limit the number of redundant third-party providers and try to load third-party code after your page has primarily finished loading. Learn more.”,
“score” : 0,
“scoreDisplayMode” : “binary”,
“displayValue” : “Third-party code blocked the main thread for 1,140 ms”,
“details” : {
“type” : “table”,
“summary” : {
“wastedMs” : 1137.18,
“wastedBytes” : 820302
},
“headings” : [ {
“subItemsHeading” : {
“itemType” : “url”,
“key” : “url”
},
“text” : “Third-Party”,
“key” : “entity”,
“itemType” : “link”
}, {
“granularity” : 1,
“text” : “Transfer Size”,
“itemType” : “bytes”,
“key” : “transferSize”,
“subItemsHeading” : {
“key” : “transferSize”
}
}, {
“granularity” : 1,
“key” : “blockingTime”,
“itemType” : “ms”,
“subItemsHeading” : {
“key” : “blockingTime”
},
“text” : “Main-Thread Blocking Time”
} ],
“items” : [ {
“blockingTime” : 743.3879999999999,
“mainThreadTime” : 1425.9920000000004,
“transferSize” : 141018,
“subItems” : {
“items” : [ {
“transferSize” : 17236,
“url” : “https://cdnssl.clicktale.net/www/latest-WR110.js”,
“blockingTime” : 353.944,
“mainThreadTime” : 606.2320000000002
}, {
“transferSize” : 51417,
“url” : “https://cdnssl.clicktale.net/www03/ptc/16f34ea7-0b5d-4dcd-a80d-679f43742af7.js”,
“mainThreadTime” : 392.83600000000035,
“blockingTime” : 271.456
}, {
“mainThreadTime” : 211.08800000000002,
“transferSize” : 32750,
“blockingTime” : 59.608000000000004,
“url” : “https://cdnssl.clicktale.net/www/WR1113b.js
}, {
“blockingTime” : 58.379999999999995,
“mainThreadTime” : 173.96000000000004,
“url” : “https://cdnssl.clicktale.net/pcc/16f34ea7-0b5d-4dcd-a80d-679f43742af7.js?DeploymentConfigName=Release_20210618&Version=1”,
“transferSize” : 27560
}, {
“mainThreadTime” : 41.87599999999998,
“blockingTime” : 0,
“url” : “https://cdnssl.clicktale.net/ptc/16f34ea7-0b5d-4dcd-a80d-679f43742af7.js”,
“transferSize” : 8960
} ],
“type” : “subitems”
},
“entity” : {
“type” : “link”,
“text” : “Clicktale”
}
}, {
“subItems” : {
“items” : [ {
“blockingTime” : 144.872,
“mainThreadTime” : 231.71600000000004,
“url” : “https://www.google-analytics.com/analytics.js”,
“transferSize” : 20324
} ],
“type” : “subitems”
},
“transferSize” : 22215,
“mainThreadTime” : 231.71600000000004,
“entity” : {
“type” : “link”,
“url” : “Analytics Tools & Solutions for Your Business - Google Analytics”,
“text” : “Google Analytics”
},
“blockingTime” : 144.872
}, {
“mainThreadTime” : 672.4959999999995,
“subItems” : {
“type” : “subitems”,
“items” : [ {
“transferSize” : 97675,
“mainThreadTime” : 631.0039999999996,
“url” : “https://www.googletagmanager.com/gtm.js?id=GTM-N9JQM5L”,
“blockingTime” : 135.656
}, {
“url” : “https://www.googletagmanager.com/gtag/js?id=DC-8163242”,
“blockingTime” : 0,
“transferSize” : 38144,
“mainThreadTime” : 41.492000000000004
} ]
},
“entity” : {
“text” : “Google Tag Manager”,
“type” : “link”,
“url” : “Website Tag Management Tools & Solutions - Google Tag Manager
},
“transferSize” : 135819,
“blockingTime” : 135.656
}, {
“mainThreadTime” : 237.244,
“entity” : {
“type” : “link”,
“url” : “https://www.facebook.com”,
“text” : “Facebook”
},
“transferSize” : 103693,
“blockingTime” : 61.16,
“subItems” : {
“type” : “subitems”,
“items” : [ {
“mainThreadTime” : 181.22,
“transferSize” : 74897,
“url” : “https://connect.facebook.net/signals/config/1707870039515690?v=2.9.44&r=stable”,
“blockingTime” : 61.16
}, {
“transferSize” : 27362,
“mainThreadTime” : 56.023999999999994,
“url” : “https://connect.facebook.net/en_US/fbevents.js”,
“blockingTime” : 0
} ]
}
}, {
“entity” : {
“text” : “Lytics”,
“type” : “link”
},
“mainThreadTime” : 832.1920000000017,
“subItems” : {
“items” : [ {
“url” : “https://c.lytics.io/api/tag/26fa268766c6754e16738d81423acd94/latest.min.js”,
“blockingTime” : 22.895999999999994,
“mainThreadTime” : 529.7280000000015,
“transferSize” : 19985
}, {
“blockingTime” : 6.840000000000003,
“url” : “https://c.lytics.io/static/pathfora.min.js”,
“mainThreadTime” : 116.72800000000001,
“transferSize” : 22487
}, {
“mainThreadTime” : 8.156,
“transferSize” : 4293,
“url” : “https://c.lytics.io/static/pathfora.min.css”,
“blockingTime” : 0
}, {
“url” : “Other resources”,
“blockingTime” : 0,
“transferSize” : 8115
} ],
“type” : “subitems”
},
“blockingTime” : 29.735999999999997,
“transferSize” : 54880
}, {
“transferSize” : 53075,
“blockingTime” : 22.367999999999995,
“subItems” : {
“type” : “subitems”,
“items” : [ {
“url” : “https://s.go-mpulse.net/boomerang/RNZHL-7D9KZ-P5XZT-ZFA6A-QAEPA”,
“mainThreadTime” : 314.1360000000005,
“transferSize” : 50666,
“blockingTime” : 22.367999999999995
} ]
},
“entity” : {
“text” : “mPulse”,
“url” : “Akamai mPulse: A Real User Monitoring Solution | Akamai Developer”,
“type” : “link”
},
“mainThreadTime” : 314.1360000000005
}, {
“transferSize” : 139555,
“entity” : {
“type” : “link”,
“url” : “https://fonts.adobe.com/”,
“text” : “Adobe TypeKit”
},
“subItems” : {
“items” : [ {
“transferSize” : 22870,
“blockingTime” : 0,
“mainThreadTime” : 0,
“url” : “https://use.typekit.net/af/abc1c3/00000000000000003b9b0ac9/27/l?primer=f4aa41106fb774442dab307ab1e84310ecf971b4559d840448c3364f93fddb68&fvd=n6&v=3
}, {
“url” : “https://use.typekit.net/af/19a2f0/00000000000000003b9b0ac7/27/l?primer=f4aa41106fb774442dab307ab1e84310ecf971b4559d840448c3364f93fddb68&fvd=n7&v=3”,
“transferSize” : 22690,
“blockingTime” : 0,
“mainThreadTime” : 0
}, {
“transferSize” : 22394,
“mainThreadTime” : 0,
“blockingTime” : 0,
“url” : “https://use.typekit.net/af/343335/00000000000000003b9b0ad0/27/l?primer=f4aa41106fb774442dab307ab1e84310ecf971b4559d840448c3364f93fddb68&fvd=n3&v=3
}, {
“transferSize” : 21734,
“url” : “https://use.typekit.net/af/4b34d2/00000000000000003b9b0acf/27/l?primer=f4aa41106fb774442dab307ab1e84310ecf971b4559d840448c3364f93fddb68&fvd=i4&v=3”,
“mainThreadTime” : 0,
“blockingTime” : 0
}, {
“blockingTime” : 0,
“url” : “https://use.typekit.net/af/7f09be/00000000000000003b9b0acb/27/l?primer=f4aa41106fb774442dab307ab1e84310ecf971b4559d840448c3364f93fddb68&fvd=n8&v=3”,
“mainThreadTime” : 0,
“transferSize” : 21562
}, {
“transferSize” : 28305,
“url” : “Other resources”,
“blockingTime” : 0
} ],
“type” : “subitems”
},
“blockingTime” : 0,
“mainThreadTime” : 13.772
}, {
“mainThreadTime” : 255.50400000000022,
“subItems” : {
“type” : “subitems”,
“items” : [ {
“url” : “https://siteintercept.qualtrics.com/dxjsmodule/CoreModule.js?Q_CLIENTVERSION=1.57.0&Q_CLIENTTYPE=web&Q_BRANDID=ehi”,
“mainThreadTime” : 27.956000000000003,
“blockingTime” : 0,
“transferSize” : 30053
}, {
“transferSize” : 16328,
“blockingTime” : 0,
“url” : “https://siteintercept.qualtrics.com/dxjsmodule/10.abbf6da5cd84d6c2bbaf.chunk.js?Q_CLIENTVERSION=1.57.0&Q_CLIENTTYPE=web&Q_BRANDID=www.enterprise.com”,
“mainThreadTime” : 170.85200000000023
}, {
“blockingTime” : 0,
“mainThreadTime” : 20.444,
“transferSize” : 11373,
“url” : “https://siteintercept.qualtrics.com/dxjsmodule/WebResponsiveDialogModule.js?Q_CLIENTVERSION=1.57.0&Q_CLIENTTYPE=web&Q_BRANDID=ehi
}, {
“transferSize” : 7055,
“blockingTime” : 0,
“mainThreadTime” : 11.287999999999998,
“url” : “https://siteintercept.qualtrics.com/dxjsmodule/1.a36539ec8a96ba07fecb.chunk.js?Q_CLIENTVERSION=1.57.0&Q_CLIENTTYPE=web&Q_BRANDID=ehi
}, {
“url” : “Other resources”,
“blockingTime” : 0,
“transferSize” : 14899
} ]
},
“entity” : {
“type” : “link”,
“text” : “Qualtrics”
},
“blockingTime” : 0,
“transferSize” : 79708
}, {
“mainThreadTime” : 156.61599999999999,
“blockingTime” : 0,
“entity” : {
“type” : “link”,
“url” : “https://www.doubleclickbygoogle.com/”,
“text” : “Google/Doubleclick Ads”
},
“transferSize” : 27306,
“subItems” : {
“type” : “subitems”,
“items” : [ {
“transferSize” : 14688,
“blockingTime” : 0,
“url” : “https://www.googleadservices.com/pagead/conversion_async.js”,
“mainThreadTime” : 54.81199999999998
}, {
“blockingTime” : 0,
“transferSize” : 12618,
“url” : “Other resources”
} ]
}
}, {
“transferSize” : 22030,
“mainThreadTime” : 77.40000000000002,
“entity” : {
“type” : “link”,
“text” : “Pinterest”,
“url” : “https://pinterest.com/
},
“subItems” : {
“items” : [ {
“url” : “https://s.pinimg.com/ct/lib/main.89cd5bf4.js”,
“blockingTime” : 0,
“transferSize” : 17830,
“mainThreadTime” : 72.63200000000002
}, {
“url” : “Other resources”,
“transferSize” : 4200,
“blockingTime” : 0
} ],
“type” : “subitems”
},
“blockingTime” : 0
}, {
“entity” : {
“url” : “https://www.thetradedesk.com/”,
“text” : “The Trade Desk”,
“type” : “link”
},
“mainThreadTime” : 181.50399999999996,
“blockingTime” : 0,
“transferSize” : 10557,
“subItems” : {
“items” : [ {
“blockingTime” : 0,
“mainThreadTime” : 4.704000000000001,
“transferSize” : 5094,
“url” : “https://js.adsrvr.org/up_loader.1.1.0.js
}, {
“transferSize” : 5463,
“url” : “Other resources”,
“blockingTime” : 0
} ],
“type” : “subitems”
}
}, {
“subItems” : {
“items” : [ {
“transferSize” : 9736,
“mainThreadTime” : 12.06,
“url” : “https://bat.bing.com/bat.js”,
“blockingTime” : 0
} ],
“type” : “subitems”
},
“blockingTime” : 0,
“transferSize” : 10317,
“mainThreadTime” : 13.352,
“entity” : {
“text” : “Bing Ads”,
“type” : “link”,
“url” : “https://bingads.microsoft.com
}
}, {
“entity” : {
“url” : “Google APIs Explorer  |  Google Developers”,
“text” : “Other Google APIs/SDKs”,
“type” : “link”
},
“transferSize” : 8577,
“mainThreadTime” : 5.1,
“subItems” : {
“type” : “subitems”,
“items” : [ {
“transferSize” : 5588,
“blockingTime” : 0,
“mainThreadTime” : 5.1,
“url” : “https://storage.googleapis.com/ehipersonalization/pf_overrides.min.css
} ]
},
“blockingTime” : 0
}, {
“mainThreadTime” : 19.199999999999996,
“transferSize” : 3450,
“entity” : {
“text” : “Twitter Online Conversion Tracking”,
“type” : “link”
},
“blockingTime” : 0,
“subItems” : {
“items” : ,
“type” : “subitems”
}
}, {
“entity” : {
“type” : “link”,
“url” : “Adobe Experience Platform”,
“text” : “Adobe Tag Manager”
},
“transferSize” : 2838,
“subItems” : {
“type” : “subitems”,
“items” :
},
“blockingTime” : 0,
“mainThreadTime” : 1.012
}, {
“entity” : {
“text” : “Adobe Test & Target”,
“type” : “link”
},
“blockingTime” : 0,
“mainThreadTime” : 0,
“transferSize” : 1839,
“subItems” : {
“items” : ,
“type” : “subitems”
}
}, {
“mainThreadTime” : 7.912,
“entity” : {
“text” : “TRUSTe”,
“type” : “link”
},
“subItems” : {
“type” : “subitems”,
“items” :
},
“blockingTime” : 0,
“transferSize” : 1318
}, {
“mainThreadTime” : 0,
“entity” : {
“type” : “link”,
“text” : “Akamai”,
“url” : “https://www.akamai.com/
},
“blockingTime” : 0,
“transferSize” : 1242,
“subItems” : {
“items” : ,
“type” : “subitems”
}
}, {
“entity” : {
“text” : “BlueKai”,
“type” : “link”
},
“blockingTime” : 0,
“transferSize” : 865,
“subItems” : {
“type” : “subitems”,
“items” :
},
“mainThreadTime” : 0
} ]
}
}

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.