Fall 2024: CS 4435/5435 and DASE 4435 Data Mining



Course Information


Instructor: Oluwatosin Oluwadare Ph.D.

Time and Location

Course Description

This is an introductory course in data mining. Data mining is the process of examining large data repositories, including databases, data warehouses, the Web, document collections, and data streams, to automatically discover patterns and knowledge in them. This course covers fundamental concepts, data warehousing, data pre-processing, association rules, cluster analysis, classification and prediction, frequent pattern mining, and advanced data mining applications in bioinformatics, computer vision, Web, text, big data, social networks, and computational journalism.

Student Learning Outcomes

Students will gain an understanding of the concepts, principles, and techniques of data mining. They will be able to independently analyze data, find patterns in it, and design, implement, and evaluate data mining software.


Logistics


Prerequisites

Textbook

Grades

Note: Final letter grades will be based on a curve of students' performance. Undergraduate and graduate students will be compared in two separate groups.

Assessment

Attendance

Attendance at every lecture is required and is essential to succeeding in this course.

Announcements

All announcements will be made on Canvas and also posted on the course website.

Assignments and Deadlines

Midterm Exam

Course Project

Reading Assignments

Students will complete a series of literature reviews on related topics and a research-oriented project. Examples of possible projects include social network mining and face and object data mining.

Regrading

Regrade requests must be made within 7 days after scores are posted on Canvas. The instructor or TA will handle regrade requests. If you are not satisfied with the regrading result, you have 7 days to submit a second request. The instructor will then regrade, and that decision is final.

Graduate Students

Graduate students enrolled in the graduate course (CS 5435) may have at least one additional challenging problem on each homework and exam.

Course Material References

Some of the course material is organized following the Data Mining classes taught by Prof. Chengkai Li at the University of Texas, Arlington and Prof. Jianlin Cheng at the University of Missouri, Columbia.


Syllabus and Course Schedule


As the instructor for this course, I reserve the right to adjust this schedule in any way that serves the educational needs of the students enrolled in this course. - Oluwatosin Oluwadare

Acronyms: Homework = HW, Reading Assignment = RA, Course Project = CP
Columns: Date, #, Lecture, Assignment (Out, Due), Lecture Notes, Extra Reading
Tues, 27 Aug 0 Course Overview [PPT] [Big Data] [Big Data: Astronomical or Genomical?] [Release 2.0: Issue 11–Big Data]
Thurs, 29 Aug 1 Introduction (Chapter 1) [PPT]
Know Your Data and Data Preprocessing (Chapters 2 & 3)
Tues, 03 Sep 2 Introduction (cont'd) / Know Your Data (Chapter 2) [PPT]
Thurs, 05 Sep 3 Know Your Data (cont'd) HW1
Tues, 10 Sep 4 Data Preprocessing: Data Cleaning, Data Integration, Data Reduction [PPT] [Weka Download] [UCI Datasets]
Thurs, 12 Sep 5 Data Preprocessing: Data Transformation and Discretization [Sample Datasets] [Ch-Square Table.pdf]
Tues, 17 Sep 6a Data Preprocessing: Data Transformation and Discretization (cont'd) RA1 HW1 [Data Preprocessing in Data Mining]
Thurs, 19 Sep 6b Data Preprocessing (cont'd) Using Weka: In-class demo.
Frequent Pattern and Association Rule Mining (Chapter 6)
Tues, 24 Sep 7 HW1 Review, Frequent Pattern Analysis / Frequent item mining methods [PPT] [Sample Dataset]
Thurs, 26 Sep 8 Association Rule Mining HW2
Tues, 01 Oct 9 Association Rule Mining (cont'd) RA1
Thurs, 03 Oct 10 RA Presentation / Association Rule Mining (cont'd) RA2
Classification and Prediction (Chapters 8 & 9)
Tues, 08 Oct 11 RA Presentation / Classification HW2 [PPT]
Thurs, 10 Oct 12 Classification (cont'd) [Mining data with random forests]
Tues, 15 Oct 13 Classification (cont'd) / Decision Trees
Thurs, 17 Oct 14 Decision Trees / Weka Demo. HW3 RA2
Tues, 22 Oct 15 Bayesian Classifiers (Naïve Bayes Classification) / RA Presentation
Thurs, 24 Oct 16 RA Presentation
Tues, 29 Oct 17 Midterm Exam
Text and Web Mining
Thurs, 31 Oct 18 Introduction to Information Retrieval and Vector Space Model CP1 [PPT] [TextBook Excerpt]
Tues, 05 Nov 19 Document Classification/Document Clustering/Naive Bayes HW4 HW3
Classification and Prediction (Chapters 8 & 9, cont'd)
Thurs, 07 Nov 20 Neural Networks Classifiers [PPT] [Generative VS Discriminative Classifier]
Tues, 12 Nov 21 Backpropagation / class review CP2
Thurs, 14 Nov 22 Classification Strategy (OVA) and Support Vector Machine CP1 [In Defense of One-Vs-All Classification]
Tues, 19 Nov 23 Other Classification Methods (Nearest Neighbor, Logistic Regression)
Thurs, 21 Nov 24 Improving Classification Accuracy HW4
Nov 25 - Dec 01 Fall Break (No class)
Clustering (Chapter 10)
Tues, 03 Dec 25 Overview of Clustering, Partitioning Approach e.g. k-means, k-medoids CP3 [PPT]
Thurs, 05 Dec 26 Hierarchical Clustering Methods
Tues, 10 Dec 27 Density Based Clustering Methods CP2
Thurs, 12 Dec 28 Grid Based Clustering Methods and Evaluation of Clusters [Davies-Bouldin index] [Silhouette Index] [Social media mining: Community Evaluation (Chapter 6.3)]
Tues, 17 Dec 29 Final Exam Fall 2023 Exam Schedule
Thurs, 19 Dec 30 Finals week (No class) CP3

Policies and Help Information


Course Policies

Academic Help

UCCS Policies