
Overview

The Master of Science in Computational Data Science (MSc CODS) combines computer science algorithms and statistical techniques to analyze and understand the information hidden in big data, such as the data generated by financial and health services, in order to optimize and grow economic sectors, devise strategies, and enable leaders to make data-driven decisions. The MSc CODS program equips graduates with the knowledge and skills to join data- and technology-intensive industries, including but not limited to investments and finance, telecommunications, national security, healthcare, energy, manufacturing, and utilities.

Career Opportunities

A master’s degree in Computational Data Science from Khalifa University opens many career opportunities. The program combines computer science algorithms and statistical techniques to analyze and understand the information hidden in big data, such as the data generated by financial and health services, in order to optimize and grow economic sectors, devise strategies, and enable leaders to make data-driven decisions. The Computational Data Science program equips graduates with the knowledge and skills to join data- and technology-intensive industries, including but not limited to investments and finance, telecommunications, national security, healthcare, energy, manufacturing, and utilities. The MSc in CODS program at Khalifa University offers students an excellent opportunity for interdisciplinary education, which will help them fulfil the requirements of these career paths. Graduates also go through rigorous training and gain research experience that enables them to pursue their studies at PhD level.

Course Descriptions

CODS 608 – Distributed Systems and Cloud Computing (3-0-3)

This course teaches in-demand technologies for distributed and parallel computation, as well as for storing and processing large amounts of data using cloud computing. Underlying network and architecture issues are discussed to the extent needed for a basic understanding, but the particular focus is on the data science aspects of cloud computing and on cloud applications that complement other courses in Data Science and Artificial Intelligence. The course introduces general concepts, deploys state-of-the-art systems on public clouds, and also teaches how to use locally available clouds.

CODS 610 – Model Estimation (3-0-3)

This course provides a rigorous introduction to statistical modeling. The topics covered include classical regression, nonparametric regression, penalized estimation, covariance parameter estimation, the multivariate linear model, discrimination and allocation, and principal component analysis.

CODS 612 – Computational Methods and Optimization in Finance (3-0-3)

This course introduces the main classes of optimization problems (linear, quadratic, convex, integer, stochastic, and robust) and the algorithms to efficiently compute the optimum in each case. The methods will be applied to financial problems such as asset/liability management, option pricing and hedging, risk management, and portfolio optimization. The students will learn to use software related to each technique.

CODS 620 – Advanced Statistical Inference (3-0-3)

This course provides a rigorous introduction to classical statistical inference. Probabilistic concepts and tools are used to present inferential statistics methods, including sampling distributions, parametric point estimators and their properties, interval estimation, hypothesis testing and regression models. Students will study some elements of Bayesian statistics.

CODS 622 – Data Science with Machine Learning (3-0-3)

This graduate-level course on data science builds upon the undergraduate courses on “Data Analytics” and on “Introduction to Machine Learning”. The course starts by introducing data analytics tasks such as regression, classification, and forecasting from the data science perspective, and then shows how advanced ML techniques can be used to perform them. Topics include advanced clustering, time-series prediction, statistical learning theory, ensemble learning, probabilistic learning, dimension reduction, semi-supervised learning, and transfer learning, among others.

CODS 623 – Health Data Science (3-0-3)

This course provides an introduction to Health Data Science, with special emphasis on developing the knowledge and competencies necessary to understand the measurement and use of variables in health, their scales of measurement, and their use in biostatistics and spatial epidemiology.

CODS 624 – Space-Time Data Science (3-0-3)

Space-time data are becoming available in overwhelming volumes and diverse forms as a result of growing remote-sensing capabilities, ground-based sensor networks, crowdsourcing, citizen science data, climate models, and novel medical sensing technologies. Dealing with massive data sets that have complex structures poses a collection of conceptual, methodological, and technical challenges, which are exacerbated by the diversity of the data. Space-time statistical methods were not designed to deal with global, high-volume, hyper-dimensional, heterogeneous, and uncertain space-time data. In fact, the computational requirements of most available methods scale poorly with data size. Space-Time Data Science (STDS throughout) is based on the integration of Statistics, Computer Science, and Machine Learning as fundamental vertices in a graph structure, which is then synchronized with applied sciences such as geography, physics, soil science, neuroscience, and epidemiology. Hence, the key to the success of STDS is the ability to tailor interdisciplinary approaches to the analysis of diverse and big space-time data. This course will introduce the statistical and computational aspects of STDS.

CODS 626 – Financial Derivatives and Risk Management (3-0-3)

Financial derivatives, which include options, futures, and forwards, are crucial for risk management, speculation, and arbitrage. This course covers the foundational theory of derivatives valuation and risk management from the mathematical modelling point of view. It demonstrates the strengths and weaknesses of different models, and it illustrates how valuation models and risk measures are applied in the financial industry.

CODS 630 – Advanced Computer Networks (3-0-3)

Modern and popular computer network technologies, protocols, and services. Next Generation Networks, triple-play services, network management, firewalls and intrusion detection, wireless ad hoc networks. Performance analysis, modeling, and simulation of computer networks.

CODS 631 – Blockchain Fundamentals and Applications (3-0-3)

Introduction to cryptocurrencies, wallets, and Blockchain; Blockchain key features, benefits, and popular use cases; Blockchain fundamentals, protocols, algorithms, and underlying infrastructure; Building Ethereum and Hyperledger blockchains; Decentralized applications (DApps); Smart contracts; Trusted Oracles; Decentralized storage; Designing and architecting blockchain-enabled systems and solutions for applications in IoT, AI, Supply Chain Management and Logistics, Healthcare, Smart Grids, 5G networks, Telecommunication, etc.; Cost and Security Analysis; Limitations and open research challenges in Blockchain.

CODS 634 – Artificial Intelligence (3-0-3)

This course is a graduate-level introduction to the field of artificial intelligence (AI). It aims to give students a solid understanding of the main abstractions and reasoning techniques used in AI. Topics include: representation and inference in first-order logic; modern deterministic and decision-theoretic planning techniques; Bayesian network inference and (Deep) Reinforcement Learning.

CODS 635 – Deep Learning Systems Design (3-0-3)

High-level introduction to deep learning concepts and essential contexts, deep learning computational frameworks, system implementation practicalities, the machine learning workflow, practical classification problems for different data modalities, and state-of-the-art deep learning models.

CODS 636 – Introduction to High Performance Computing (3-0-3)

The goal of this graduate-level course is to provide students with the fundamentals of high-performance computing (HPC), its programming paradigms, and its applications to the data sciences and big data analytics. The course starts with motivational examples of the need for HPC in the data sciences. Next, the students are introduced to HPC programming methods for distributed and shared memory systems, including message passing (MPI), multithreading (Pthreads), and OpenMP. The course will also cover HPC methods based on data parallelism using single-instruction-multiple-data (SIMD) computing and Graphics Processing Unit (GPU) accelerators. The HPC opportunities of cloud computing services will also be addressed, especially in relation to business data analytics. Application examples from machine learning, business intelligence, econometrics, and finance will illustrate the wide applicability of HPC to problems of practical importance. The students will use the HPC resources provided by KU Research Computing for their assignments and projects.

CODS 637 – GPU Programming (3-0-3)

This course is a hands-on introduction to parallel computing for MSc students with emphasis on the most common and accessible parallel architecture, namely, the Graphics Processing Unit (GPU). The course will introduce students to modern GPU architectures and the fundamental concepts of parallel computing, including data parallelism, scalable execution, memory and data locality, multithreading, and synchronization. The course will also cover some of the most common parallel patterns such as convolution, prefix sum, graph search, and sparse matrix multiplications, along with their GPU implementations. The case study of deep convolutional neural networks will be covered in detail. NVIDIA’s CUDA programming environment will be used throughout the course for homework assignments and the course project.

CODS 640 – Financial Cyber Security (3-0-3)

The course examines techniques to achieve security of financial systems within companies, with special reference to banking and finance organizations. Students analyze breaches of financial systems and learn about common threats and frauds specifically related to such systems. Several methods of cyber security risk assessment are explored, as well as the design of risk alleviation strategies, including choosing and designing technical and process security controls for fintech. Students analyze financial services industry regulation and discuss banking and finance compliance requirements.

CODS 641 – Natural Language Processing and Information Retrieval (3)

The course will introduce the core concepts of Natural Language Processing, such as word tokenization, part-of-speech tagging, vector space representations, and text classification tasks using machine learning. The course is made practical through programming and project activities; in particular, the Python libraries NLTK, Gensim, and CAMeL will provide a tight connection between the concepts and their practical application.

CODS 642 – Database Systems Concepts and Design (3-0-3)

This course is on the design and implementation of database management systems. Topics include data models (relational, document, key/value), storage models (n-ary, decomposition), query languages (SQL, stored procedures), storage architectures (heaps, log-structured), indexing (order preserving trees, hash tables), transaction processing (ACID, concurrency control), recovery (logging, checkpoints), query processing (joins, sorting, aggregation, optimization), and parallel architectures (multi-core, distributed).

CODS 643 – Mobile and Pervasive Computing (3-0-3)

The course introduces mobile and pervasive computing and its enabling technologies. Topics include architecture and organization for mobile and real-time applications, the design of location- and context-aware adaptive applications, and mobility, data, and privacy management in mobile computing. Emerging technologies related to mobile and pervasive computing, such as crowdsourcing and IoT, will be discussed.

CODS 644 – Data Science for Business Applications (3-0-3)

This graduate-level course on data science builds upon the core course on “Data Science with Machine Learning”. The course will cover the applications of data science in business domains ranging from manufacturing to social media and social networks. The security, privacy, ethical, and legal issues of data science will also be covered.

CODS 645 – Financial Machine Learning (3-0-3)

Digitalization, automation, and machine learning have taken pivotal roles in data-intensive fields, and finance is no exception. In the era of the exabyte, financial operations such as high-frequency trading, risk monitoring, and stock option processing benefit greatly from rational decision making. This course aims to bridge the divide between academia and industry by thoroughly introducing machine learning applications to investments, in all their complexity, and implementing them.

CODS 650 – Data Processing and Visualization (3-0-3)

The course will cover upstream data processing tasks such as acquisition, warehousing and storage, clean-up, manipulation, feature engineering and selection, projection, and signal processing. It will also cover downstream data visualization and understanding tasks such as storytelling, presentation techniques, visualization concepts and tools, and explainable artificial intelligence.

CODS 694 – Selected Topics in Computational Data Science (3-0-3)

This course covers selected contemporary topics in Computational Data Science. The topics will vary from semester to semester depending on faculty availability and student interests. Proposed course descriptions are considered by the Electrical Engineering and Computer Science Department and the Mathematics Department on an ad hoc basis, and the course will be offered according to demand. The proposed course content will need to be approved by the Graduate Studies Committee. The course may be repeated once with a change of content, allowing the student to earn a maximum of 6 credit hours.

CODS 697 – CODS Graduate Project (3-0-3)

The project will provide students with an opportunity to apply knowledge gained from CODS program courses by working on a real-world computational data science project. Projects typically involve the analysis of various types of data, such as financial data, social media data, health data, and network data. Working on the project will enable students to gain valuable hands-on experience. They will also get first-hand experience in planning, implementing, documenting, and presenting their CODS work. Each group typically consists of three students.

CODS 699 – Master’s Thesis (9)

In the Master’s Thesis, the student is required to independently conduct original research-oriented work on important Computational Data Science problems under the direct supervision of a main advisor, who must be a full-time faculty member in either the Electrical Engineering and Computer Science Department or the Mathematics Department, and at least one other full-time faculty member who acts as co-advisor. The outcome of the research should demonstrate the synthesis of information into knowledge in a form that may be used by others, and it should lead to publications in suitable reputable journals or conferences. The student’s research findings must be documented in a formal thesis and defended through a viva voce examination. The student must register for a minimum of 9 credit hours of Master’s Thesis.

Program Structure and Requirements
Study Plan