Mini Project/Final

Hello
I’m looking for ideas for a mini project/final for my course. My course is mainly an Introductory Course to Data Science.

Best
Malik

Hi Malik!

If I understood correctly, you are teaching the course and need some ideas for the final project? If you share the topics you are covering and how many hours a student is expected to work on the project, someone might come up with an idea :wink:

I guess you are using KNIME there?

Br,
Ivan

Hi Ivan
I’m covering mainly the following topics:

  1. ETL and Data Manipulation
  2. Data Visualization
  3. Predictive Analytics

I would expect each student to work about 80 hours on it (two weeks).
Yes, I’m using KNIME.

Best
Malik

Hi @malik,

Have you defined your project themes, or are you still open to ideas?

I don’t have a concrete idea, but what I think is generally missing in this kind of project is the data collection part. Students are usually given well-known data sets, and then the same or similar analyses are carried out. Data collection is the prerequisite for any other work with data, and doing it themselves would help students better understand the subsequent steps and topics of your course.

For example, they might collect data and find out that no analysis can be done on it. That would be a valuable lesson :smile:

Br,
Ivan


Hi @ipazin, hi @malik,

What a coincidence! I was pondering exactly the same idea just a couple of days ago. What got me started was a particular dataset on data.world, which they presented in their newsletter. Since I don’t know whether you can access the complete description of the dataset, let me quote the interesting part:

While helpful, the NBER’s website still leaves much to desire for those looking for a clean long-term longitudinal dataset. Earlier research suggested that others were looking for this formulation of the data unsuccessfully, so the purpose of this contribution is to consolidate the 13 years of NBER-processed data into one publicly-available dataset. It is presented here with the NBER’s column headers and “desc.txt” shows the NBER’s 2016 data dictionary.

So I started looking for the raw data and immediately saw the potential to have students start with the raw data to get to “something meaningful”. Here are my initial notes/thoughts on this:

Best,
Stefan


Hi,
I am rolling out a similar course at my institution. It has just completed one round with a batch of 46 groups of students (around the age of 18). Each group comprises about 20 students. The project is a 20% component. For this round, I used a dataset from Kaggle.

Would it be possible for KNIME to host a platform similar to Kaggle, where people submit their KNIME workflows and have their results ranked? If it is meant to support learning, though, the dataset has to be suitable for achieving the learning outcomes, and finding such a dataset is one big challenge I face. We only teach simple data mining techniques in our course: k-means clustering, classification using Decision Tree and Naive Bayes, and estimation using MLR. Any ideas or suggestions for datasets, or for running the project component well?


Dear @ctienche,

Congratulations! :tada:

We have parts of this proposition already implemented: Executing submitted workflows, scoring their results, and sending out notifications via email is all possible with one workflow that is regularly executed on KNIME Server. This implementation, however, is limited to tasks that can be scored automatically, e.g. classification, regression, …, against a ground truth.

When results are stored in a database, in addition to notifications, it is possible to implement a WebPortal workflow that reads the results, groups by participant, and shows the results in a table.
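
To make the scoring and ranking idea concrete, here is a minimal sketch in plain Python. It is not the KNIME Server workflow itself, only an illustration of the logic under some assumptions: the task is classification scored by accuracy, each participant may submit several times, and only the best score counts. The function names (`accuracy`, `leaderboard`), the participant names, and the data are all hypothetical.

```python
from collections import defaultdict


def accuracy(predicted, truth):
    """Fraction of predictions that match the ground truth labels."""
    if len(predicted) != len(truth):
        raise ValueError("prediction and ground truth lengths differ")
    if not truth:
        raise ValueError("ground truth is empty")
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth)


def leaderboard(submissions, truth):
    """Score every submission and keep each participant's best score.

    submissions: iterable of (participant, predictions) pairs.
    Returns a list of (participant, best_score), highest score first.
    """
    best = defaultdict(float)
    for participant, predicted in submissions:
        best[participant] = max(best[participant], accuracy(predicted, truth))
    # Sort by score descending, then by name so ties have a stable order.
    return sorted(best.items(), key=lambda row: (-row[1], row[0]))


# Hypothetical ground truth and submissions, for illustration only.
truth = ["yes", "no", "yes", "yes"]
submissions = [
    ("alice", ["yes", "no", "no", "yes"]),   # 3 of 4 correct
    ("bob", ["no", "no", "no", "no"]),       # 1 of 4 correct
    ("alice", ["yes", "no", "yes", "yes"]),  # 4 of 4 correct
]

for rank, (participant, score) in enumerate(leaderboard(submissions, truth), start=1):
    print(rank, participant, f"{score:.2f}")
```

In a real setup the submissions would come from the executed workflows and the scores would be written to a database rather than kept in memory. Whether to rank by best, latest, or average score is a choice each course can make for itself.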

I like the idea of integrating a competitive aspect into classes, but I think the ranking and implementation should be handled at the level of each course instead of building yet another Kaggle-like platform.

You are not alone in this: I have heard this feedback in a variety of discussions with other educators. We have started to talk about this in

So let’s continue the discussion in the other topic. Thanks!

Best,
Stefan


Can you link me to the resource where one can submit a KNIME workflow and have the results ranked? Does that mean I need to use KNIME Server to support such a feature?

Here is what I implemented for my group this semester:


I provided students with 4 datasets to choose from (mall customer dataset, wholesale segmentation dataset, Titanic dataset, banking dataset), which are from the UCI data repository.
Attached is the report template that students need to fill in and submit: Template.docx (146.7 KB)
The rubrics for marking each milestone are as follows:

