Master in Software and Systems

Data Mining Concepts, Practice and Research Challenges

Lecturer:
Ernestina Menasalvas
emenasalvas@fi.upm.es
Lecturer (Coordinator):
Javier Segovia
fsegovia@fi.upm.es

Semester

Second semester

Credits

4 ECTS

Outline

This subject details the techniques, development processes, models and challenges involved in data mining project development. Some 60% of business intelligence projects are abandoned or fail because of inadequate planning, incomplete tasks, missed deadlines, poor project management, undelivered business requirements or poor-quality deliverables.

Any business intelligence project involves the development of a data mining project designed to discover "business intelligence". Many process models for data mining project development have been proposed. Yet, despite all the research and projects conducted, developing a data mining project today is still more of an art than an engineering process. Data mining experts automatically translate business requirements into data mining goals and techniques. This means that projects are fully dependent on their developers: if the data mining expert leaves, the project will fail, because he or she will not have specified or documented the steps to be taken.

The first question is: what methodology should be followed to transform business goals into data mining goals? Unfortunately, there is no such methodology to date. To answer this question, we have to address issues such as: How are business goals stated? What is a data mining goal? What types of problems can data mining solve? What do all these problems have in common? What are the requirements for successfully solving a given problem? This subject will deal with the many approaches that try to solve these problems. Converting business intelligence project development from an art into a full-blown engineering discipline entails applying methodologies suited to this new type of project. Traditional development practices are inadequate because business intelligence is an evolving area in all organizations, subject to continual changes and improvements based on feedback from the business community.

Learning Goals

Syllabus

  1. DM process
    1. Understanding the business
    2. Understanding the data
    3. Preprocessing
    4. Modelling
    5. Evaluation
    6. Deployment
  2. Modelling in DM
    1. Association Rules
    2. Classification
    3. Clustering
  3. Case study
    1. Business objectives
    2. Business success criteria
    3. Data mining goals
    4. Data mining success criteria
    5. Data description report
    6. Data exploration
    7. Data selection
    8. Modelling technique description
    9. Test design
    10. Model building
    11. Model description, evaluation and assessment
    12. Plan deployment
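
As a small illustration of the association rules topic listed under "Modelling in DM", the sketch below computes the two standard measures for a rule, support and confidence. The transactions and the function names are hypothetical and chosen only for this example; it is not taken from the course material.

```python
# Toy transactions (hypothetical data, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by
    the support of the antecedent alone."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Measures for the rule {bread} -> {milk}.
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

In a real project the thresholds applied to these measures would follow from the data mining success criteria defined in the case study.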

Recommended reading:

Prerequisites

Assessment Method

This subject will be assessed on the basis of the case studies, the reports, and attendance and participation.

Tuition language

English

Subject-Specific Competences

This table shows the code, description and proficiency level for each subject-specific competence.

| Code | Competence | Proficiency Level |
| --- | --- | --- |
| SSC2 | Analysis and synthesis of solutions to problems requiring innovative approaches to the definition of the computational infrastructure, processing and analysis of heterogeneous data types | C |
| SSC7 | Evaluation and application of diverse mathematical and statistical theories, and available knowledge extraction and discovery processes, methods and techniques for large data volumes | C |

Learning Outcomes

This table shows the code, description, associated competences and proficiency level for each subject learning outcome.

| Code | Learning Outcome | Associated competences | Proficiency level |
| --- | --- | --- | --- |
| RA-APDI-1 | Ability to proficiently apply a standard data mining process, including the business knowledge, data knowledge, data exploration analysis, modelling, evaluation and exploitation phases | SSC2, SSC7 | C |
| RA-APDI-2 | Use software applications for data mining tasks | SSC2, SSC7 | C |
| RA-APDI-3 | Understand the foundations and apply a broad and wide-ranging repertory of clustering, estimation, prediction and classification algorithms | SSC2, SSC7 | C |
| RA-APDI-4 | Be familiar with examples of real applications and research trends and lines | SSC2, SSC7 | C |

Learning Guide

Subject learning guide for Data Mining Concepts, Practice and Research Challenges