Assignment 3: Frequent Itemsets, Clustering, Advertising

Formative, Weight (15%), Learning objectives (1, 2, 3), Abstraction (4), Design (4), Communication (4), Data (5), Programming (5)

Due date: 11:59pm, 1 June 2019

1 Overview

Read the following carefully as it differs from the last assignment.

For students who are taking the course COMP SCI 3306 (i.e., undergraduate students), this assignment can be done in groups of two students. If you have problems finding a group partner, use the forum to search for one.

For other students who are taking the course COMP SCI 7306, this assignment should be done individually.

References to sections, examples, etc. refer to the book “Leskovec, Rajaraman and Ullman: Mining Massive Datasets (Second Edition)”.

2 Assignment

Exercise 1 Frequent Itemsets (15+15+10+10 points)

For this exercise, you have to read Section 6.4 up to 6.4.3.

1. Implement the simple, randomized algorithm given in Section 6.4.1.

2. Implement the algorithm of Savasere, Omiecinski, and Navathe (SON algorithm) in Section 6.4.3.

3. Compare the two algorithms on the datasets T10I4D100K, T40I10D100K, chess, connect, mushroom, pumsb, and pumsb_star provided at

     http://fimi.ua.ac.be/data/

and report the outcomes.

COMP SCI 3306, COMP SCI 7306 Mining Big Data Semester 1, 2020

4. Experiment with different sample sizes in the simple randomized algorithm, such as 1%, 2%, 5%, and 10%, and compare your results (including the result produced by the SON algorithm).
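The steps in the list above can be sketched as follows. This is only a minimal illustration under stated assumptions, not a reference solution: the function names (load_baskets, apriori, simple_randomized, son, compare) are hypothetical, the in-memory step is a plain Apriori written for readability rather than speed, the data files are assumed to hold one basket per line as space-separated integer item ids, and the support threshold is assumed to be an absolute count s. The sampled run lowers the threshold to p*s for a sample fraction p, and the SON run does the same per chunk before a second full pass removes false positives.

```python
import random
from collections import Counter
from itertools import combinations


def load_baskets(path):
    """Read baskets, ASSUMING one basket per line of space-separated integer ids."""
    with open(path) as f:
        return [frozenset(map(int, line.split())) for line in f if line.strip()]


def apriori(baskets, min_count):
    """Return {itemset: count} for every itemset with count >= min_count."""
    baskets = [frozenset(b) for b in baskets]
    singles = Counter(item for b in baskets for item in b)
    current = {frozenset([i]): n for i, n in singles.items() if n >= min_count}
    frequent = dict(current)
    k = 2
    while current:
        items = sorted({i for s in current for i in s})
        # Keep only k-sets whose (k-1)-subsets are all frequent (monotonicity).
        candidates = [frozenset(c) for c in combinations(items, k)
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))]
        counts = Counter(c for b in baskets for c in candidates if c <= b)
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent


def simple_randomized(baskets, support, p, rng=None):
    """Keep each basket with probability p, then mine the sample at threshold p*s.

    The result may contain false positives and miss truly frequent itemsets.
    """
    rng = rng or random.Random()
    sample = [b for b in baskets if rng.random() < p]
    return set(apriori(sample, max(1, p * support)))


def son(baskets, support, n_chunks):
    """Two-pass SON sketch: local candidates per chunk, then a full count."""
    baskets = [frozenset(b) for b in baskets]
    if not baskets:
        return set()
    size = -(-len(baskets) // n_chunks)  # ceiling division
    candidates = set()
    for start in range(0, len(baskets), size):
        chunk = baskets[start:start + size]
        p = len(chunk) / len(baskets)  # fraction of the data in this chunk
        candidates |= set(apriori(chunk, max(1, p * support)))
    # Second pass: count every candidate over the whole dataset.
    counts = Counter(c for b in baskets for c in candidates if c <= b)
    return {c for c in candidates if counts[c] >= support}


def compare(approx, exact):
    """Return (number of false negatives, number of false positives)."""
    return len(exact - approx), len(approx - exact)
```

For the sample-size experiment, one option is to call simple_randomized with p set to 0.01, 0.02, 0.05, and 0.10 on each dataset and pass each result, together with the output of son, to compare; repeating each run with several seeds gives a sense of how much the sampled results vary.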

Your approach should be as e
