Purdue University
Computer Science
BS Degree in Data Science
CS Courses
- CS 18000: Problem Solving And Object-Oriented Programming (4 credits, intro)
Problem solving and algorithms, implementation of algorithms in a high level programming language, conditionals, the iterative approach and debugging, collections of data, searching and sorting, solving problems by decomposition, the object-oriented approach, subclasses of existing classes, handling exceptions that occur when the program is running, graphical user interfaces (GUIs), data stored in files, abstract data types, a glimpse at topics from other CS courses.
- CS 18200: Foundations Of Computer Science (3 credits, intro)
Logic and proofs; sets, functions, relations, sequences and summations; number representations; counting; fundamentals of the analysis of algorithms; graphs and trees; proof techniques; recursion; Boolean logic; finite state machines; pushdown automata; computability and undecidability.
- CS 25300: Data Structures And Algorithms For DS/AI (3 credits, algs)
This course gives a broad introduction to the most important data structures and algorithms in computer science. The emphasis is on data structures and their use in algorithms relevant for data science and AI and their applications. The course focuses on developing and comparing efficient implementations, assessing suitability of data structures for massive data sets, and understanding effective use, modifications, and extensions.
- CS 29100: Sophomore Development Seminar (1 credit, special), or not (it's just strongly recommended)
Presentations by corporate partners about careers in computer science. Presentations by faculty about careers in academia and research. Students learn about upper-division courses, tour research laboratories, and attend job fairs.
- pick 2
  - CS 30700: Software Engineering I (3 credits, softeng)
    An introduction to the methods and tools of software engineering; software life cycle; specification and design of software, software testing, cost and effort estimation; laboratory exercises with design, testing, and other tools.
  - CS 31100: Competitive Programming II (2 credits, algs)
    CP2 teaches experienced programmers additional techniques to solve interview and competitive programming problems and builds on material learned in CP1. This includes specific algorithmic techniques such as [shortest paths, topological sort, MST, union find, range queries], advanced algorithms surrounding trees and DAGs, advanced problem types in [dynamic programming, backtracking/simulation, mathematics, string processing], and more. It can be viewed as a programming complement to CS 38100, with some overlap in content.
  - CS 41100: Competitive Programming III (2 credits, algs)
    CP3 teaches experienced programmers additional techniques to solve competitive programming problems and builds on material learned in CP1 and CP2. This includes algorithmic techniques in topics such as [network flow, computational geometry, graph matching, NP-hard problems]. Primarily, CP3 prepares students to compete in programming contests, which means most class time is focused on simulating contest environments and teaching teamwork and communication alongside problem practice.
  - CS 34800: Information Systems (3 credits, sys)
    File organization and index structures; object-oriented database languages; the relational database model with introductions to SQL and DBMS; hierarchical models and network models with introductions to HDDL, HDML, and DBTG Codasyl; data mining; data warehousing; database connectivity; distributed databases; the client/server paradigm; middleware, including ODBC, JDBC, CORBA, and MOM.
  - CS 35500: Introduction To Cryptography (3 credits, math)
    An introduction to cryptography basics: Classic historical ciphers including Caesar, Vigenere, and Vernam ciphers; modern ciphers including DES, AES, Pohlig-Hellman, and RSA; signatures and digests; key exchange; simple protocols; block and stream ciphers; network-centric protocols.
  - CS 38100: Introduction To The Analysis Of Algorithms (3 credits, algs)
    Techniques for analyzing the time and space requirements of algorithms. Application of these techniques to sorting, searching, pattern-matching, graph problems, and other selected problems. Brief introduction to the intractable (NP-hard) problems.
  - CS 40800: Software Testing (3 credits, softeng)
    Preliminaries: errors and testing; software quality, requirements, behavior, and correctness; testing, debugging, verification; control flow graphs, dominators; types of testing; Test selection: from requirements, finite state models, and combinatorial designs; regression testing and test minimization; Test adequacy assessment: control and data flow; mutation based; testing tools.
  - CS 43900: Introduction To Data Visualization (3 credits, sys)
    The course offers an introduction to the fundamental principles, design strategies, and techniques needed to visually communicate, explore, and analyze data. The course focuses primarily on the visual representation of inherently non-spatial data (e.g., tables and spreadsheets, graphs and networks, trees, text, and time series), but also considers the visualization of maps and of data in geospatial context.
  - CS 44800: Introduction To Relational Database Systems (3 credits, sys)
    An in-depth examination of relational database systems including theory and concepts as well as practical issues in relational databases. Modern database technologies such as object-relational and Web-based access to relational databases. Conceptual design and entity relationship modeling, relational algebra and calculus, data definition and manipulation languages using SQL, schema and view management, query processing and optimization, transaction management, security, privacy, integrity management.
  - CS 45800: Introduction To Robotics (3 credits, ai)
    Any intelligent robot system interacting with our environment needs to have perception, planning, and control methods in its cognition process. The perception module outlines the robot’s procedures to gather and interpret sensory observations into world models. The underlying planning and control modules use those world models to plan robot behaviors and their interaction with our natural environments. Therefore, this course will cover the fundamental topics in robot perception, planning, and control to design general-purpose robot cognition algorithms. Overall, this course is divided into four modules. Robot perception: this covers fundamental techniques needed for robot localization and mapping from raw 3D sensory data. Robot planning: this module will discuss robot behavior planning techniques such as A*, RRT*, and trajectory optimization. Robot control: this introduces basic control techniques such as PID controllers to execute the robot’s planned behaviors in the real world. Robot learning: this part will briefly introduce machine learning techniques for robot decision-making and control.
  - CS 47100: Introduction To Artificial Intelligence (3 credits, ai)
    Students are expected to spend at least three hours per week gaining experience with artificial intelligence systems and developing software. Basic problem-solving strategies, heuristic search, problem reduction and AND/OR graphs, knowledge representation, expert systems, generating explanations, uncertainty reasoning, game playing, planning, machine learning, computer vision, and programming systems such as Lisp or Prolog.
  - CS 47300: Web Information Search And Management (3 credits, sys)
    This course teaches important concepts and knowledge of information retrieval for managing unstructured data such as text data on the Web or in emails. At the same time, students will be exposed to a large number of important applications. Students in the course will get hands-on experience from homework and a course project. The first part of the course focuses on general concepts/techniques such as stemming, indexing, vector space model, and feedback procedure. The second part of the course shows how to apply the set of techniques on different applications such as Web search, text categorization, and information recommendation.
  - CS 47500: Human-Computer Interaction (3 credits, humans)
    The goal of this course is to teach students how to design useful and usable interactive systems that address important needs of people. Students will experience the entire user-centered design life cycle, from need finding to usability evaluation. Topics covered in the course include user-centered design principles, usability heuristics, need-finding methods such as semi-structured interviews and contextual inquiry, quick prototyping techniques, usability evaluation methods such as hallway testing and human-subjects user studies, and theories about user interaction and decision making. As we are entering a new era of AI, the course will also include a brief introduction to applying HCI principles and techniques to AI-powered systems. This course is project-based. Students will form project teams among themselves to work on a semester-long project and apply the user-centered design principles, theories, and techniques that they have learned in class to build a useful and usable interactive system such as a mobile application. This course is also highly interactive, including a series of design studios and in-class activities that require active participation, communication, and discussion with other students.
  - CS 48300: Introduction To The Theory Of Computation (3 credits, theory)
    Turing machines and the Church-Turing thesis; decidability; halting problem; reducibility; undecidable problems; decidability of logical theories; Kolmogorov complexity; time classes; P, NP, NP-complete; space classes; Savitch’s theorem, PSPACE-completeness, NL-completeness; hierarchy theorems; approximation theorems; probabilistic algorithms; applications of complexity to parallel computation and cryptography. Typically offered Fall and Spring.
- CS 37300: Data Mining And Machine Learning (3 credits, ai)
This course will introduce students to the field of data mining and machine learning, which sits at the interface between statistics and computer science. Data mining and machine learning focuses on developing algorithms to automatically discover patterns and learn models of large datasets. This course introduces students to the process and main techniques in data mining and machine learning, including exploratory data analysis, predictive modeling, descriptive modeling, and evaluation.
- CS 38003: Python Programming (1 credit, intro)
This course teaches the Python programming language assuming that students have already taken a course in computer programming. This 5-week one-credit course teaches the Python language, the most common modules used in Python, as well as how to write Python web applications.
- CS 39100: Junior Resources Seminar (1 credit, impact), or not (it's just strongly recommended)
This seminar course engages a number of outside speakers who typically present information on the role of research in computer science, how the research components of computer science relate to each other, approaches to software development in industry, different types of application development paradigms, technological trends, and societal, ethical, and legal issues. The credit may be used only toward free electives.
- CS 44000: Large Scale Data Analytics (3 credits, ai)
This course provides an integrated view of the key concepts of modern algorithmic data analytics. It focuses on teaching principles and methods needed to analyze large datasets in order to extract novel, transformative insights for the underlying application. The course emphasizes the duality between formulating questions that can be answered by statistical data analysis tools (the statistical perspective) and the algorithmic challenge of actually extracting such answers using available parallel and distributed computational resources from massive datasets. The topics cover three areas: (1) algorithmic concepts necessary for big data analytics, (2) big data systems, including data management and programming, and (3) advanced analytic methods to address characteristics of real-world big data problems.
- CS 44100: Data Science Capstone (3 credits, capstone)
The Capstone course aims at providing students with an opportunity to integrate their accumulated knowledge and technical and social skills in order to identify and solve a realistic or real-world data science problem, with an emphasis on the application domain. Capstone projects are often sponsored by corporate partners or by academic or non-academic research groups. The Capstone course serves as preparation for students entering into the profession of Data Science. Students will conduct a team-based project through the entire data science pipeline, by following the six phases of the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology. Students get experience in working as teams, participating in project planning, writing reports, and giving presentations.
Math/Stat Courses
- or
  - MA 16100: Plane Analytic Geometry And Calculus I (5 credits, math)
    Introduction to differential and integral calculus of one variable, with applications. Some schools or departments may allow only 4 credit hours toward graduation for this course. Designed for students who have not had at least a one-semester calculus course in high school, with a grade of “A” or “B”. Not open to students with credit in MA 16500. Demonstrated competence in college algebra and trigonometry.
  - MA 16200: Plane Analytic Geometry And Calculus II (5 credits, math)
    Continuation of MA 16100. Vectors in two and three dimensions, techniques of integration, infinite series, conic sections, polar coordinates, surfaces in three dimensions. Some schools or departments may allow only 4 credit hours toward graduation for this course.
  - MA 16500: Analytic Geometry And Calculus I (4 credits, math)
    Introduction to differential and integral calculus of one variable, with applications. Conic sections. Designed for students who have had at least a one-semester calculus course in high school, with a grade of “A” or “B”, but are not qualified to enter MA 16200 or MA 16600, the advanced placement course MA 17300, or the honors calculus course MA 18100. Demonstrated competence in college algebra and trigonometry.
- STAT 24200: Introduction To Data Science (3 credits, math)
This course provides a broad introduction to the field of data science. The course focuses on using computational methods and statistical techniques to analyze massive amounts of data and to extract knowledge. It provides an overview of foundational computational and statistical tools for data acquisition and cleaning, data management and big data systems. The course surveys the complete data science process from data to knowledge and gives students hands-on experience with tools and methods. Basic knowledge of Python required.
- or
  - MA 27101: Honors Multivariate Calculus (5 credits, math)
    This course is the Honors version of MA 26100, Multivariate Calculus; it will also include a review of infinite series. The course is intended for first-year students who have credit for Calculus I and II. There will be a significant emphasis on conceptual explanation, but not on formal proof. Permission of department is required.
- STAT 35500: Statistics For Data Science (3 credits, math)
An introduction to methodologies for data analysis and simulation. Populations and sampling. Distributions and summaries of distributions. Algorithms for sampling and resampling. Foundational statistical concepts including confidence intervals, hypothesis testing, correlation. Introduction to classification and regression. Essential use is made of statistical software throughout.
- MA 41600: Probability (3 credits, math)
An introduction to mathematical probability suitable as a preparation for actuarial science, statistical theory, and mathematical modeling. General probability rules, conditional probability and Bayes theorem, discrete and continuous random variables, moments and moment generating functions, joint and conditional distributions, standard discrete and continuous distributions and their properties, law of large numbers and central limit theorem.
- STAT 41700: Statistical Theory (3 credits, math)
An introduction to the mathematical theory of statistical inference, emphasizing inference for standard parametric families of distributions. Properties of estimators. Maximum likelihood estimation. Sufficient statistics. Hypothesis tests and confidence intervals. Distribution theory for common statistics based on normal distributions, including linear regression. Bayesian statistics, including posterior inference, posterior mean, maximum a posteriori estimator, credible intervals, and Bayesian hypothesis testing.
- or
  - STAT 42000: Introduction To Time Series (3 credits, math)
    An introduction to time series analysis suitable for students of actuarial science, engineering, and the sciences. Model building and forecasting with ARMA and ARIMA models. Basic financial volatility models (ARCH and GARCH). Resampling methods for confidence intervals. Basics of spectral analysis, including spectral density estimation and periodograms.
  - MA 43200: Elementary Stochastic Processes (3 credits, math)
    An introduction to some classes of stochastic processes that arise in probabilistic models of time-dependent random processes. The main stochastic processes studied will be discrete time Markov chains and Poisson processes. Other possible topics covered may include continuous time Markov chains, renewal processes, queueing networks, and martingales.
  - STAT 50600: Statistical Programming And Data Management (3 credits, math)
    Use of the SAS software system for managing statistical data. How to write programs to access, explore, prepare, and analyze data. Using the DATA step and procedures to access, transform, and summarize data. Introduction to the SAS macro language. Prepares students for the base SAS certification exam.
  - STAT 51200: Applied Regression Analysis (3 credits, math)
    Inference in simple and multiple linear regression, residual analysis, transformations, polynomial regression, model building with real data, nonlinear regression. One-way and two-way analysis of variance, multiple comparisons, fixed and random factors, analysis of covariance. Use of existing statistical computer programs.
  - STAT 51300: Statistical Quality Control (3 credits, math)
    A strong background in control charts including adaptations, acceptance sampling for attributes and variables data, standard acceptance plans, sequential analysis, statistics of combinations, moments and probability distributions, applications.
  - STAT 51400: Design Of Experiments (3 credits, math)
    Fundamentals, completely randomized design; randomized complete blocks; latin square; multi-classification; factorial; nested factorial; incomplete block and fractional replications for 2^n, 3^n, 2^m x 3^n; confounding; lattice designs; general mixed factorials; split plot; analysis of variance in regression models; optimum design. Use of existing statistical programs.
  - STAT 52200: Sampling And Survey Techniques (3 credits, math)
    This course covers basic sampling design and analysis techniques. Sampling designs include: simple random, stratified, clustered, multi-staged, and systematic samples. Methods of estimation appropriate to design features and efficiency and costs related to sample design are covered.
  - STAT 52500: Intermediate Statistical Methodology (3 credits, math)
    Statistical methods for analyzing data based on general/generalized linear models, including linear regression, analysis of variance (ANOVA), analysis of covariance (ANCOVA), random and mixed effects models, and logistic/loglinear regression models. Application of these methods to real-world problems using SAS statistical software.