Data Mining
- Lecturer (Coordinator): Javier Segovia, fsegovia@fi.upm.es
- Lecturer: Ernestina Menasalvas, emenasalvas@fi.upm.es
Semester
Second semester
Credits
4 ECTS
Outline
This subject covers the techniques, development processes, models and challenges of data mining project development. Some 60% of business intelligence projects are abandoned or fail because of inadequate planning, incomplete tasks, missed deadlines, poor project management, undelivered business requirements or poor-quality deliverables.
Any business intelligence project involves the development of a data mining project designed to discover "business intelligence". Many process models for data mining project development have been proposed. Despite all the research and projects conducted, however, data mining projects are still developed more as an art than as an engineering process. Data mining experts translate business requirements into data mining goals and techniques intuitively, without writing the translation down. Projects are therefore fully dependent on their developers: if the data mining expert leaves, the project will fail, because the steps to be taken were never specified or documented.
This course teaches a methodology for converting business objectives into data analysis objectives, together with the basic statistical and Artificial Intelligence techniques needed to achieve those objectives. Students practise with a professional data mining tool and real databases.
Learning Goals
- Know examples of real applications, current trends and lines of research
- Use software applications to perform data mining tasks
- Understand the fundamentals of, and apply, a wide and varied repertoire of clustering, estimation, prediction and classification algorithms
Syllabus
- Introduction to data engineering
- The tool: IBM SPSS Modeler
- Descriptive, Diagnostic, Predictive and Prescriptive Analysis
- RFM analysis
- Clustering
- Linear regression
- Logistic regression
- Nearest neighbour
- Decision trees
- Neural networks
- Ensemble methods
- Association rules
- Dealing with time
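As an illustration of one syllabus topic, RFM analysis summarises each customer by recency (days since the last purchase), frequency (number of purchases) and monetary value (total spent). The following is only a minimal sketch in plain Python, not the course's own material, which uses IBM SPSS Modeler; the transaction records, the `rfm` function name and the reference date are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount)
TRANSACTIONS = [
    ("c1", date(2024, 1, 5), 40.0),
    ("c1", date(2024, 3, 1), 60.0),
    ("c2", date(2023, 11, 20), 200.0),
    ("c3", date(2024, 2, 10), 15.0),
    ("c3", date(2024, 2, 25), 25.0),
    ("c3", date(2024, 3, 10), 10.0),
]


def rfm(transactions, reference_date):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        # Track the most recent purchase date per customer.
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((reference_date - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


# The reference date is an assumption; in practice it is the analysis date.
table = rfm(TRANSACTIONS, reference_date=date(2024, 3, 31))
for customer, (r, f, m) in sorted(table.items()):
    print(customer, r, f, m)
```

In a fuller analysis each of the three values would typically be binned into scores (for example, quintiles) before customers are segmented; that step is omitted here.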
Recommended reading
- D. Hand, H. Mannila, P. Smyth: "Principles of Data Mining (Adaptive Computation and Machine Learning)", MIT Press, 2001
- J. Han, M. Kamber: "Data Mining: Concepts and Techniques", Morgan Kaufmann, 2006
- M. J. A. Berry, G. Linoff: "Data Mining Techniques: for Marketing, Sales and Customer Support", John Wiley & Sons, 1997
- P. Tan, M. Steinbach, V. Kumar: "Introduction to Data Mining", Pearson Addison Wesley, 2005
- I. Witten, E. Frank, M. Hall: "Data Mining: Practical Machine Learning Tools and Techniques", Morgan Kaufmann, 2011
Prerequisites
- Artificial Intelligence
- Statistics
Tuition language
English
Subject-Specific Competences
Code | Competence | Proficiency Level |
---|---|---|
CEM7 | Evaluation and application of diverse mathematical and statistical theories, and available knowledge extraction and discovery processes, methods and techniques for large data volumes | P |
Learning Outcomes
Code | Learning Outcome | Associated competences | Proficiency level |
---|---|---|---|
RA-APDI-19 | Be able to carry out data mining following a standard process, demonstrating competence in the phases of business insight, data insight, exploratory data analysis, modelling, evaluation and exploitation | CEM7 | P |
RA-APDI-21 | Understand the foundations of, and apply, a broad repertoire of clustering, estimation, prediction and classification algorithms | CEM7 | P |