Having trouble with categorizing by date, adding programmatically named columns


I hope this question can be answered without too much work on your part. I've been trying to solve this one for a while and I think I may simply be overthinking it.

I've got a table that reflects defects: [ID, date submitted, date closed, severity, etc]. It's only around 5000 rows. I've got another table that shows release dates for software packages [product, release date, release version]. My ultimate goal is to show defect density per release. However, it gets a little complicated because the same defect (any particular row) maybe present in different releases. Before I move on, here's an example:

Defect #555 was submitted on 1/1/2012 and fixed on 1/1/2013. Product A version 1 was release on 12/1/2012, version 2 on 8/8/2012 and version 3 on 2/1/2013.

Therefore, the defect calculation for Product A version 1, 2 and 3 would all include Defect #555, but any release after version 3 would not include #555.

Stated slighly more formally: If a defect is discovered within 6 months of a release, it "belongs" to that release and to all subsequent releases, until fixed.

I thought to do something like adding an integer feature column for each release, e.g. "present in version 3". Things would be easy to manage at that point. But I can't for the life of me figure out how to transform the release table into a bunch of columns. So that's my first question: How do I transform the rows of a release table (one row per release) into column headers of the defect table, where the column length will match the defect table? Can I perform a row operation at the same time--a Java node that calculates whether the defect is present in that release? (For the latter I'm using java date methods; it's as straightforward as it sounds).

My second question is, is there a better way to solve this? I solved the "if found in 6 months" problem quickly by binning, but that was easy because a defect initially belongs to only 1 release. I don't believe the extended problem can be solved by binning. But I'm new to all this so perhaps it can.

Thanks for your support. If I managed to make this completely unclear, please let me know and I'll work harder :)

I don't have an exact solution for you, but are you aware of the Time Difference, Cross Join and Rule Engine nodes?  It sounds like you could use these to check if a bug was fixed in a particular release window and then tally them accordingly using a groupby node. 

Does that help?




I appreciate the tip. I'll give those a whirl and see what I can mess up ^H^H^H^H achieve. Thanks

As soon as I used the cross join I saw how this could be solved. Many thanks!!