Data Mining/Data Analytics

The certificate in Data Mining/Data Analytics is designed to build knowledge of how to analyze large data sets from fields such as healthcare, banking, retail, government and manufacturing.

Individuals may customize a certificate by working with an advisor to develop a plan that meets their needs.

Each certificate requires completion of four classes, as noted in the following list.

  • Data Mining/Data Analytics I
  • Data Mining/Data Analytics II
  • Multivariate Statistics
  • Forecasting

The following additional introductory courses are available and may be necessary for individuals with limited background:

  • Basic Statistics
  • Introduction to R
  • Introduction to SAS

Each course consists of fifteen hours of online lectures, discussion boards and/or chat sessions, taken over a five-week period.

Each course may be taken individually and may qualify for continuing education units (CEUs).

For more information, contact Greg Evershed (gmecqa@rit.edu) at (585) 475-5442.

 

Data Mining/Predictive Analytics I

Brief Description: This course provides a hands-on, practical introduction to some of the modern methods of statistical data mining, with a concentration on applying such techniques to so-called big (massive) data. Participants will be presented with practical examples of big data from all walks of life, along with the statistical data mining methods most commonly used to extract meaningful patterns from them. Although we do touch on model selection for interpretation, our main focus in this course is optimal prediction. Our main computing language for this course is R.

Topics:

  • Presentation, through examples, of a Taxonomy for Big (Massive) Data, featuring Short Fat Data, Tall and Skinny Data, and Tall and Fat Data delivered in droves
  • Introduction to the DISCO approach to Data Mining
  • Introduction to Modern Methods of Unsupervised Learning featuring hands-on exploration of Cluster Analysis and Dimensionality Reduction
  • Practical Introduction to a Predictive Analytics Approach to Regression Analysis
  • Practical Data Mining Look at Logistic Regression
  • Practical Exploration of Discriminant Analysis Classification in the context of Big Data
  • Practical Introduction to Receiver Operating Characteristic (ROC) curves
  • Practical Introduction to Classification and Regression Trees (CART)
  • Practical Introduction to k-Nearest Neighbors approach to Classification and Regression
  • Practical Introduction to Resampling Techniques like Cross Validation, Random Subsampling and Bootstrap for Optimal Predictive Analytics

Target participants:  Analysts, managers, scientists, engineers and/or anyone else interested in gaining deeper insights into the most commonly used statistical data mining and machine learning methods

Prerequisites:  Participants should have a working knowledge of regression analysis and at least introductory-level competency in R or SAS.

Data Mining/Data Analytics II

Description coming soon.

 

Applied Multivariate Analysis

Brief Description: Modern statistics and data analysis continually have to deal with datasets in which the response variable, the explanatory variables, or both are multivariate in nature. Sometimes no response is available at all, so the statistical task is to uncover patterns underlying the observed explanatory vector. One pervading assumption here is that when the observed dimension is large, there must be a lower-dimensional latent space that explains the observed phenomenon. This course provides the student with practical, hands-on tools for exploring, visualizing and formally analyzing datasets that are multivariate in structure. We touch on the fundamentals of dimensionality reduction, multivariate analysis of variance and elements of cluster analysis, to name a few.

Topics:

  • Fundamental Tools for Representation and Analysis of Multivariate Data
  • Modern Descriptive Analysis Methods for Multivariate Data and Correlation Analysis
  • Fundamentals of Visualization of Multivariate Data
  • Case-study-based exploration of Multivariate Analysis of Variance (MANOVA)
  • Practical Hypothesis Testing for Multivariate Data
  • Practitioner’s Principal Component Analysis (PCA)
  • Singular Value Decomposition (SVD) for Data Compression and Beyond
  • Fundamentals of Cluster Analysis and a Practitioner’s Look at k-Means Clustering
  • Hierarchical Clustering and the Visual Appeal of Dendrograms
  • A hands-on introduction to Canonical Correlation Analysis
  • The Good, the Bad and the Ugly of Factor Analysis
  • A fun look at Multidimensional Scaling

Target participants: Analysts, managers, scientists, engineers and/or anyone else interested in gaining deeper insights into the most commonly used statistical methods for multivariate data

Prerequisites: Participants should have a working knowledge of regression analysis and at least introductory-level competency in R or SAS.

Forecasting Analysis

Brief Description:  Based on its past sales figures, a company might be interested in forecasting its future sales. A more sophisticated analyst might want to use covariates in a regression-like manner to forecast the price of a stock on the NYSE as accurately and precisely as possible. Forecasting is both philosophically and scientifically very hard, and sometimes not even well-posed. In this course, we provide an example-driven introduction to the fundamental statistical tools used in modern forecasting.

Topics:

  • Introduction to the basics of forecasting through various examples
  • Understanding the difference between forecasting and other forms of prediction
  • Brief Introduction to the importance of stationarity
  • Challenges and Difficulties of Long-Term Forecasting and Over-Extrapolation
  • Fundamental components of a typical time series: Trend, Seasonality, Cycles, Error
  • Hands-on introduction to the Box-Jenkins Decomposition
  • Practical Introduction to the Autoregressive Integrated Moving Average (ARIMA) model
  • Practical Forecasting using R

Target participants: Analysts, managers, scientists, engineers and/or anyone else interested in gaining deeper insights into the most commonly used statistical methods for forecasting

Prerequisites: Participants should have a working knowledge of regression analysis and at least introductory-level competency in R or SAS.

 

  Rochester Institute of Technology
One Lomb Memorial Drive,
Rochester, NY 14623-5603
Copyright © Rochester Institute of Technology, All Rights Reserved.