New College students who complete the second year of their studies in any area of concentration are encouraged to follow the accelerated 3+2 curriculum provided below if they are interested in completing both undergraduate and graduate programs in five years. Students who are interested in this option will be eligible only after entering the New College undergraduate program and showing strong academic performance. These applicants must satisfy the following minimum conditions before they can be admitted via the 3+2 pathway:

  • Complete 2 years of study with Satisfactory evaluations in all academic undertakings.
  • Complete the prerequisite courses (see below).
  • Be recommended by a faculty member for the 3+2 program.

Prerequisite Courses

The following courses must be completed during the first two years of undergraduate study:

  • MATH 2400 - Calculus I
  • CSCI 2200 - Introduction to Programming in Python
  • MATH 2200 - Probability I (Mod 1)
  • MATH 2320 - Linear Algebra
  • STAN 2700 - Dealing with Data I (the section that uses R)

The following courses are not required, but students are strongly encouraged to take them before the end of the third semester:

  • Object-Oriented Programming
  • MATH 4550 - Probability II (Mod 2)
  • MATH 3250 - Calculus II

IDC 5204 – Applied Statistics I: A statistics course focusing on descriptive and inferential statistics, with topics including linear regression, confidence intervals and hypothesis testing, as well as probability theory and modern methods such as resampling, with all methods illustrated in R and a focus on methods relevant for data science using industrial datasets.

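IDC 5110 – Data Munging and Exploratory Data Analysis: A course on practical approaches for reshaping, reorganizing, and summarizing relationships in data through exploratory analysis. Principles and methods of preprocessing, normalization, and data validation, with an emphasis on collaboration and reproducible research.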

IDC 5120 – Algorithms for Data Science: Fundamentals of algorithms and measures of performance. Taught in Python, the course includes an exploration of efficient algorithms for sorting and retrieving data, graph algorithms and combinatorial optimization, dynamic programming, randomized algorithms and approximation algorithms.

IDC 5130 – Databases for Data Science: Fundamentals of traditional database design and management. Types and comparison of various databases, including SQL databases (e.g., PostgreSQL, SQLite), NoSQL databases, column-oriented databases (e.g., HBase) and document-oriented databases (e.g., MongoDB). Consistency, availability, scalability, efficiency and performance in data retrieval and storage.

IDC 5296 – Industrial Seminar Series I: The first offering of a three-semester long seminar series which hosts professionals and executives as guest speakers from a variety of industrial domains. Each weekly or biweekly seminar covers topics and applications to diverse problems in business via applications of various data science techniques.

IDC 5295 – Industrial Workshops: This course offers content modules complementary to the regular coursework of the graduate program in applied data science. Examples include, but are not limited to, ethics, emerging or trending technologies in data science, domain-specific applications, industrial software platforms or tools, and professional certification modules and exams widely acknowledged in the industry.

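IDC 5205 – Applied Statistics II: A statistical modeling course covering multiple linear regression and logistic regression and, more broadly, generalized linear models. Emphasis is placed on model formulation, building, assumptions, interpretation, prediction and evaluation, with implementation carried out in R and a focus on methods and models relevant for data science using industrial datasets.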

IDC 5112 – Data Visualization: A project-centered introduction to the visual display of quantitative information for both knowledge discovery and the communication of results. Over the course of the semester, students develop a visual application in their area of interest with data collected from an industrial application or project.

IDC 5210 – Applied Machine Learning: Project-based course with a coverage of supervised and unsupervised learning and an emphasis on working with real industrial data. Bayesian analysis and other specific learning paradigms including regression, clustering, random forests, support vector machines, kernel methods, and neural networks.

IDC 5131 – Distributed Computing: Fundamentals concerning the design and maintenance of massively parallel data sets. Non-relational databases and their management. Algorithms for parallel architectures and associated software tools including the MapReduce/Hadoop framework and BigTable.

IDC 5297 – Industrial Seminar Series II: The second offering of a three-semester long seminar series that hosts professionals and executives as guest speakers from a variety of industrial domains. Each weekly or biweekly seminar covers topics and applications to diverse problems in business via applications of various data science techniques.

IDC 6293 – Industrial Practicum I: Intended as a summer internship or interterm applied project, this course is the first extensive real industry experience opportunity offered to students who would like to put their data science knowledge and skills to practical use. Must be completed with an industrial partner of the program or a company/organization the student chooses to work with, under the supervision of the Data Science faculty.

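IDC 6200 – Advanced Applied Statistics: A second statistical modeling course, with a mix of topics such as generalized additive models, longitudinal response models, time series models, survival analysis, statistical learning or Bayesian statistics, focusing on models relevant to data science. Taught with a project-based focus using real industrial data in an applied business context.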

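IDC 6215 – Advanced Applied Computing: Advanced topics in computing, such as image processing and object detection, text mining, natural language processing, recurrent neural networks, and reinforcement learning. Taught with a project-based focus using real industrial data in an applied business context.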

IDC 6250 – Practical Data Science: Analysis of data and creation of a data science pipeline and deliverable for industry. Working in small groups, students analyze an industry-submitted data set starting with exploratory analysis, followed by model building based on statistics or machine learning, and the construction and presentation of a data product to an industry partner.

IDC 6298 – Industrial Seminar Series III: The third and final offering of a three-semester long seminar series that hosts professionals and executives as guest speakers from a variety of industrial domains. Each weekly or biweekly seminar covers topics and applications to diverse problems in business via applications of various data science techniques.

IDC 6294 – Industrial Practicum II: A full semester working in industry as part of a data science team, under the weekly supervision of a Data Science faculty member, to whom the student submits reports. This is the second and final stage of the industrial practicum, in which the student works in an industrial partner company or organization or in a company of their choice. Performance is evaluated jointly by the faculty supervisor and the company supervisor.