## S675: High-Dimensional Data

### Class Description

S675 explores a variety of methods for detecting structure in multivariate data sets. Major topics include dimension reduction (principal component analysis, multidimensional scaling, manifold learning), unsupervised learning (k-means clustering, spectral clustering), and supervised learning (linear discriminant analysis, support vector machines, nearest neighbor classification).

### Class Information

**School/Department:** Statistics

**Semester(s):** Fall

**Year(s) Offered:** 2018

**Class time:** Monday, Wednesday, Friday, 9:05 AM - 9:55 AM

**Instructor:** Michael Trosset - mtrosset@indiana.edu

### Other Details

**Prerequisites:** Stat S520 and Stat S640 or CSCI B555 or CSCI B565 or permission of instructor

**Algebra Required?:** Used extensively throughout the course, including proofs and homework assignments.

**Calculus Required?:** Used primarily for concepts and derivations.

**Substantive Orientation:** Any discipline that is concerned with high-dimensional data. Such data can arise in various ways, often as multiple measurements on each of several objects/subjects, as in text mining or microarray experiments, but also as measurements of pairwise proximities

**Software Used:** R

**How the software is used:** Students write programs that implement the methods studied in S675. They use these programs and/or programs written by others to analyze data.

**Problem Sets:** Weekly

**Data Analysis:** Yes, but primary emphasis is on understanding how the methods work. Small, synthetic data sets often serve this objective better than large, real data sets.

**Keywords:** machine learning, multivariate structure, dimension reduction, cluster analysis, classification