Course Overview
Effective from AY2024/25 Trimester 2 (January 2025), the Postgraduate Certificate in Data Engineering and Smart Factory will be retired. For more details, please refer to the full notice here.
This module aims to equip you with skills ranging from data wrangling and big data processing to machine learning.
Upon completion of the course, you will be equipped with the necessary skills to excel in an entry-level position in data engineering.
You will be able to carry out exploratory data analysis using Python, design and create both SQL and NoSQL databases, build ETL (Extract, Transform, Load) data pipelines in Apache Spark, create supervised machine learning models in sklearn, and apply unsupervised techniques such as clustering.
With this comprehensive set of skills, you will be prepared to take on the challenges of the data engineering industry.
Who Should Attend
- Engineers
- Software Developers
- Professionals who have programming experience and are interested in finding out more about data engineering
What You Will Learn
Data Wrangling
- Participants will be able to apply data wrangling techniques, using libraries such as NumPy and pandas, to transform data from one form to another
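As a rough illustration of transforming data from one form to another, the sketch below reshapes a small table from long to wide form with pandas and derives a new column. The machine names, metrics and values are hypothetical and not taken from the course material.

```python
import pandas as pd

# Hypothetical sensor readings in "long" form: one row per (machine, metric).
readings = pd.DataFrame(
    {
        "machine": ["A", "A", "B", "B"],
        "metric": ["temp", "rpm", "temp", "rpm"],
        "value": [71.5, 1200.0, 68.0, 1150.0],
    }
)

# Reshape to "wide" form: one row per machine, one column per metric.
wide = readings.pivot(index="machine", columns="metric", values="value")

# Derive a new column with vectorised arithmetic (Celsius to Fahrenheit).
wide["temp_f"] = wide["temp"] * 9 / 5 + 32

print(wide)
```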
SQL
- Participants will be taught basic SQL and will be able to write and debug simple SQL queries on a database for CRUD (Create, Retrieve, Update, Delete) operations. Participants will also be taught the principles of database design (normal forms)
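The four CRUD operations can be sketched with Python's built-in `sqlite3` module, as below. This is a minimal example under assumed names: the `machines` table and its rows are hypothetical, and the course may use a different database engine.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table used only for illustration.
cur.execute(
    "CREATE TABLE machines ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, status TEXT NOT NULL)"
)

# Create: insert two rows using parameterised queries.
cur.executemany(
    "INSERT INTO machines (name, status) VALUES (?, ?)",
    [("press-1", "idle"), ("lathe-2", "running")],
)

# Retrieve: read the rows back in insertion order.
rows = cur.execute("SELECT name, status FROM machines ORDER BY id").fetchall()

# Update: change the status of one machine.
cur.execute("UPDATE machines SET status = ? WHERE name = ?", ("running", "press-1"))

# Delete: remove one machine.
cur.execute("DELETE FROM machines WHERE name = ?", ("lathe-2",))
conn.commit()

remaining = cur.execute("SELECT name, status FROM machines").fetchall()
conn.close()
```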
NoSQL
- Participants will be taught the difference between SQL and NoSQL databases and be able to contrast the situations where each should be used
- Participants will similarly be able to carry out CRUD operations on a NoSQL database such as MongoDB
Apache Spark
- Participants will be taught how to create a simple data pipeline consisting of data ingestion, data preparation and generating views / queries
Supervised Machine Learning
- Participants will be able to use tools such as sklearn to create machine learning models using a range of techniques such as decision trees or neural networks
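A minimal sketch of a supervised model in sklearn, here a decision tree classifier. The feature values and labels are made up for illustration and are not course data.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: [vibration, temperature] per machine,
# with label 1 meaning "needs maintenance" and 0 meaning "healthy".
X = [[0.2, 60], [0.3, 65], [0.9, 62], [1.1, 70]]
y = [0, 0, 1, 1]

# Fit a decision tree; random_state fixes tie-breaking for repeatability.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Predict labels for two unseen machines.
predictions = model.predict([[1.0, 61], [0.25, 68]])
print(predictions)
```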
Unsupervised Machine Learning
- Participants will be exposed to unsupervised learning techniques such as k-means and hierarchical clustering
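As a small illustration of k-means, the sketch below groups six hypothetical 2-D points into two clusters with sklearn. The points are invented for the example. The cluster numbering (0 or 1) is arbitrary, so only the grouping is meaningful.

```python
from sklearn.cluster import KMeans

# Hypothetical 2-D points forming two well-separated groups.
points = [
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.8, 8.1],
]

# Fit k-means with two clusters; n_init and random_state are set explicitly
# so that repeated runs give the same grouping.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

print(labels)
```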
Teaching Team
Soh Cheng Lock, Donny
Associate Professor / Prog Leader, Infocomm Technology, Singapore Institute of Technology
Vivek Balachandran
Associate Professor, Infocomm Technology, Singapore Institute of Technology
Zhang Wei
Associate Professor / Prog Leader, Infocomm Technology, Singapore Institute of Technology
Schedule
Lessons are held every Tuesday.
Day | Topics |
---|---|
Day 1 | Introduction to data programming and Python |
Day 2 | Overview of database systems |
Day 3 | Introduction to NoSQL, REST, and MongoDB CRUD |
Day 4 | Introduction to Big Data, Hadoop, Apache Spark, RDD, Functional Programming, and Data Pipelines |
Day 5 | Introduction to supervised machine learning algorithms and clustering through K Nearest Neighbour algorithm |
Day 6 | Exam (SIT@NYP) |
Certificate and Assessment
A Certificate of Participation will be issued to participants who
- Attend at least 75% of the module
- Undertake non-credit bearing assessment during the module
A Certificate of Attainment will be issued to participants who
- Attend at least 75% of the module
- Undertake and pass credit bearing assessment during the module
Fee Structure
The full fee for this course is S$5,886.00.
Category | After SF Funding |
---|---|
Singapore Citizen (Below 40) | S$1,765.80 |
Singapore Citizen (40 & Above) | S$685.80 |
Singapore PR / LTVP+ Holder | S$1,765.80 |
Non-Singapore Citizen | S$5,886.00 (No Funding) |
Note: All fees above include GST. GST applies to individuals and Singapore-registered companies.
Course Runs
Learning Pathway
Earn a Postgraduate Certificate