Solutions to “Just KNIME It!” Challenge 9 - Season 4

:sun_with_face: Hello hello, Just KNIME It! people! :nerd_face:

:boom: Today we’re back with a finance puzzle proposed by KNINJA @MartinDDDD. :moneybag: You are a data analyst supporting various finance sub-departments in a big company, and your main goal is to upgrade their processes with the latest tools and technology. It’s January 2nd and all of the previous quarter’s supplier invoices need to be processed. Can you assist the finance department and find ways to extract relevant data from XML invoices, create a management reporting package, among other tasks? :memo:

Here is the challenge. Let’s use this thread to post our solutions to it, which should be uploaded to your public KNIME Hub spaces with tag JKISeason4-9 .

:sos: Need help with tags? To add tag JKISeason4-9 to your workflow, go to the description panel in KNIME Analytics Platform, click the pencil to edit it, and you will see the option for adding tags right there. :blush: Let us know if you have any problems!

1 Like

I’m looking forward to this week’s challenge :star_struck:

sadly I’m not able to download the dataset files :sob:
image

does anyone facing same issue?

2 Likes

Too many downloads for what’s supposed to be just a Just KNIME It challenge…
At this point, are we still talking casual enthusiasts, or is this turning into a full-time job disguised as a community sprint? :racing_car:
Honestly, it’s starting to feel more like Formula 1 than data science!
Haha, I was about to share the same thing—but you beat me to it with a perfect undercut! :smile:

1 Like

Hi! Are you still facing this issue? Happy to help.

1 Like

Yes

1 Like

Important note: We wanted to preserve @MartinDDDD’s challenge as much as possible, so we kept the dataset he used. Please let us know if it is becoming a problem to download it and we will work on a sample that is easier to play with!

Thanks!

2 Likes

Thanks for the feedback, @PVergati. Working on it now.

2 Likes

Ok, folks! Here’s a better solution (a sample would change the proposed challenge more than what we’d like): we just zipped the invoices here.

I downloaded here and it worked fine! Please unzip it and go ahead. :smiley:

2 Likes

Really sorry about the issues… it’s just a 600kb .zip… I managed to download it just now… uploading here as an additional source…

Apologies for the inconvenience…

invoices.zip (689.9 KB)

3 Likes

Thank you for the quick support. I am able to download it now :grin:

1 Like

Thank You @MartinDDDD with your attachment I was able to download it as well! :slight_smile:

Just one question: In the third task where we can find the purchase order date? (or the invoice date, as it’s IssueDateTime in the xml). I just found this Date, and the third task asks to compare the invoice date to the purchase order date:
< ram:IssueDateTime>
<udt:DateTimeString format=“102”>20240203</udt:DateTimeString>
</ram:IssueDateTime>

If it’s the part of the challenge to find that date, then I will work on it! :smiley:

The third task:
“3. Assist with performing internal controls – i.e., taking into consideration two datasets provided by the Procurement department you should identify any invoices that (a) do not have a PO Number, (b) do not match a PO Number issued by the company, and (c) have an invoice date which is before the corresponding Purchase Order Date;”

2 Likes

Hi @berti093

for the invoices your are on the right path - for the POs - they should be in the PO Header or Item data set… which I could not find in the folder so attaching them here for reference…

@alinebessa: could you add them to the folder please? :slight_smile:
POs.zip (75.0 KB)

4 Likes

Thought it might be useful to provide some general context on this challenge:

  • Invoices are in XML format as since January 2025 it is mandatory to be able to send/receive e-invoices in a standardised format (with transitionary periods until 2027)
  • The XML format reflects a somewhat simplified standard e-invoice, but data included in them is synthetic (thanks gpt-4o-mini…)
  • In a corporate governance context, typically approval is required before spending company money, once spend has been approved a Purchase Order is issued to the supplier. Suppliers are required to reference Purchase Order number on their invoices
  • This allows for various checks to be performed between purchase order and invoice: Has the spent be approved before services were rendered or items delivered? Has the supplier invoiced what was ordered? Are there any differences between Purchase Order and Invoice (Prices, Quantities), is there a matching Purchase Order to the invoice?
  • Data regarding open purchase orders is also used in Financial Planning and Analysis: when will we spent how much money? Who are the top suppliers we are spending money with?

As you can imagine a lot of the above stuff, to this date, happens in Excel… so I thought it might be fun for everyone to change that :slight_smile:

Have fun and enjoy!

4 Likes

FYI the downloads from the Hub were an issue on our end. They should be working properly now. :+1:

(But if they’re not let us know!)

4 Likes

Hi all,

Interesting one, with some XML files.
My solution is here: JKISeason 4-9 - Processing Invoices at the End of a Quarter – KNIME Community Hub

I have the feeling that it might be a very big project as you can extract so many interesting insights from these datasets.
I just spent some time on the goals defined within the challenge - not much more.

But I enjoyed using the generic e-chart node (along with the K-AI to generate the code) to be able to show some “big number” in the report.

I know I could extract more information, but here what I show in that workflow:

  1. Number of invoices and total amount without PO number
  2. Number of PO and total amount not invoiced
  3. Number of invoices with a date before the PO’s date and total amount
  4. Graph showing the amount to be invoiced in the future
  5. Top 3 suppliers with remaining amoung
  6. 2 other tables with details on invoices with no PO and invoices with a date prior to PO

Note: many other graphs and numbers might be extracted. The graph could be summarized per week for example, to get more insights, instead of per single day.

Happy to discuss with you all.
Feedback welcome.


Regards
Jerome

5 Likes

This is sleak and Swift … Esp the BANs … text formatting been really a challenge esp with colors and highlights… For reports Its vital … thanks for adding the flavour.

1 Like

@trj Your financial controller loves this :heart:

2 Likes

Hi KNIME’ers! :wave: :slightly_smiling_face:

This week’s Just KNIME It! challenge hit me right in the accounting brain I don’t have :brain::money_with_wings: finance terms everywhere and me just trying to survive with inner peace and KNIME nodes :man_in_lotus_position::joy:

Even though I don’t have a strong finance background, I gave it my best shot and completed the challenge in under 24 hours after it was published! :stopwatch::gear:

It wasn’t easy (hello Level: Hard :face_with_spiral_eyes:), but learning never is, so let’s keep going, even when our brains scream “please no more XMLs!” :open_file_folder::jigsaw:

Keep learning, keep KNIME’ing, and don’t forget to laugh through the chaos! :robot::fire::muscle:



4 Likes

Phew, that was real real interesting. In my opinion the best challenge in this season yet. It felt like real life. Messy, you do not exactly know what you’re seeing, what to watch, what to do… I think the inspiration came from real events @MartinDDDD, if that’s the case, it can be felt :smiley:

So my steps was:

  1. Reading the xml and extracting the necessary information
  2. Internal controls
    2.1 Checked if PO is missing
    2.2 Checked if the two PO number is the same in the xml (all of them were the same)
    2.3 Checked if the Invoice date is earlier than the POs date
  3. Supporting Financial Planning & Analysis’ forecast
    3.1 Checked if the product is the same in the PO as in the invoice (all of them were the same)
    3.2 Checked if the total number in the PO and in the invoice for the line item are the same (there was a lot problem here :D)
    3.3 I shifted the dates with 14 months to invoice the leftover amounts (it’s suspicious how much greater the amounts in the forecast, maybe I used wrong field to calculate (Total - Charged), but this was what the data told me :smiley: )
  4. Visualized the results


My workflow:


My solution to the challange:

I’m really open to suggestions to improve my solution. I felt really involved with this challenge, and it was real great, thank you @MartinDDDD! :slight_smile: And if it was really inspired by real challenges I’m really curious how you solved it :smiley:

1 Like

find my submission : JKISeason4-9 – KNIME Community Hub