We offer multiple courses on Data Science: the 3-day Big Data/Data Science/ML Foundation course, the 3-day Data Cleaning advanced course, the 3-day Machine Learning advanced course and the 3-day Artificial Intelligence Neural Networks advanced course. The foundation course has no programming prerequisite. If you have no programming background in Python, start with the 3-day Big Data/Data Science/ML Foundation course, which teaches Python programming for analytics. All other data science courses require Python programming experience.
All our data science courses are taught by working practitioners, not academicians. The goal is to get you well versed in applying techniques to solve real-world problems efficiently.
Enhanced Funding Support for Professionals aged 40 and above and SMEs
Professionals aged 40 and above (i.e. self-sponsored individuals) and SMEs sponsoring their employees for training (i.e. organisation-sponsored trainees) are entitled to CITREP enhanced funding support of up to 90% of the nett payable course and certification fees. This applies to Singapore Citizens and Permanent Residents (PRs).
FY17 CITREP+ funding support details are as follows:
|Sponsorship||Trainee category||Course + exam||Exam only||Eligibility|
|Organisation-sponsored||Non SMEs||Up to 70% of the nett payable course and certification fees, capped at $3000 per trainee||Up to 70% of the nett payable certification fees, capped at $500 per trainee||Singapore Citizens and Permanent Residents (PRs)|
|Organisation-sponsored||SMEs||Up to 90% of the nett payable course and certification fees, capped at $3000 per trainee||Up to 70% of the nett payable certification fees, capped at $500 per trainee||Singapore Citizens and Permanent Residents (PRs)|
|Self-sponsored||Professionals (Citizens and PRs)||Up to 70% of the nett payable course and certification fees, capped at $3000 per trainee||Up to 70% of the nett payable certification fees, capped at $500 per trainee||Singapore Citizens and Permanent Residents (PRs)|
|Self-sponsored||Professionals (Citizens 40 years old and above)* as of 1 Jan of the current year||Up to 90% of the nett payable course and certification fees, capped at $3000 per trainee||Up to 70% of the nett payable certification fees, capped at $500 per trainee||Singapore Citizens and Permanent Residents (PRs)|
|Self-sponsored||Students (Citizens) and/or Full-Time National Service (NSF)||Up to 100% of the nett payable course and certification fees, capped at $2500 per trainee||Up to 100% of the nett payable certification fees, capped at $500 per trainee||Singapore Citizens and Permanent Residents (PRs)|
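Each funding tier above follows the same pattern: a percentage of the nett payable fee, limited by a per-trainee cap. The sketch below illustrates that arithmetic only. The function name and the fee amounts are hypothetical, and because the table says "up to", the result is an upper bound rather than a guaranteed payout.

```python
from decimal import Decimal


def max_funding(nett_fee, rate, cap):
    """Upper bound on funding: rate x nett payable fee, limited by the cap.

    All arguments are passed as strings so Decimal keeps exact cents.
    """
    return min(Decimal(rate) * Decimal(nett_fee), Decimal(cap))


# Hypothetical fees, chosen only to show both sides of the cap.
under_cap = max_funding("3000", "0.70", "3000")  # 70% of 3000 stays below the cap
at_cap = max_funding("4000", "0.90", "3000")     # 90% of 4000 exceeds the cap

print(under_cap, at_cap)
```

The first call stays under the $3000 cap, while the second is limited by it.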
Big Data/Data Science/ML Foundation Course
The Big Data/Data Science/ML Foundation course in Singapore teaches you the basic skills and expertise needed to dissect large volumes of data, detect patterns and enable intelligent decision-making. The foundation course is non-technical and is open to managers, professionals and decision makers.
On day 1, we cover the basics of Big Data; on day 2, data science; and on day 3, machine learning basics.
Participants will get practical knowledge of Data Acquisition, Data Cleaning, Data Analysis, Data Visualization and Machine Learning. The course covers business, computer science and math and provides insights into successful data science projects.
This course has been designed from the ground up for people with no prior coding experience. Participants will be introduced to Python programming through easy-to-grasp exercises. Our instructors will work with individuals one-to-one throughout the class to ensure each participant grasps the fundamentals.
After completing the 3-day classroom training, participants can take up our online tutorials, which cover advanced topics. There are no additional charges for the online tutorials. Each online tutorial involves watching a video and completing an exercise. Instructors then provide feedback on the completed exercises. Instructor support is available for 6 months after classroom training.
Big Data is a process to deliver decision-making insights. The process uses people and technology to quickly analyze large amounts of data of different types (traditional table structured data and unstructured data, such as pictures, video, email, transaction data, and social media interactions) from a variety of sources to produce a stream of actionable knowledge. Organizations increasingly need to analyze information to make decisions for achieving greater efficiency, profits, and productivity.
As relational databases have grown in size to satisfy these requirements, organizations have also looked at other technologies for storing vast amounts of information. These new systems are often referred to under the umbrella term “Big Data.” Gartner has identified three key characteristics of big data: Volume, Velocity, and Variety. Traditional structured systems are efficient at dealing with high volumes and velocity of data; however, they are not the most efficient solution for handling a variety of unstructured or semi-structured data sources.
Big Data solutions can enable the processing of many different types of formats beyond traditional transactional systems. Definitions for Volume, Velocity, and Variety vary, but most big data definitions are concerned with amounts of information that are too difficult for traditional systems to handle—either the volume is too much, the velocity is too fast, or the variety is too complex.
iKompass Big Data/Data Science/ML Course Sample Content
|Big Data/Data Science Foundation course. Funding is applicable only to Singapore Citizens and Permanent Residents (PRs)|
|SGD||Self Sponsored Below 40||Self Sponsored Above 40||Non SME Company Employee||Non SME Company Employee||SME Company Employee Above or Below 40|
|CITREP claim back||2303.7||2500||2303.7||2500||2500||2000|
|Total to pay iKompass||2986.37||2986.37||3521.37||3521.37||3521.37||2087.57|
|Nett including GST*||682.67||486.37||1217.67||1021.37||1021.37||0|
|*Nett investment is after funding. The full amount must be paid to the training provider. Participants claim funding after course completion.|
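The nett figure in the table is the total paid to iKompass minus the CITREP claim back. A minimal sketch of that subtraction, using the two self-sponsored columns from the table; the function name is hypothetical and chosen for illustration.

```python
from decimal import Decimal


def nett_after_funding(total_paid, claim_back):
    """Nett investment: the amount paid up front minus the funding claimed later."""
    return Decimal(total_paid) - Decimal(claim_back)


# Figures taken from the self-sponsored columns of the table above (SGD).
below_40 = nett_after_funding("2986.37", "2303.7")
above_40 = nett_after_funding("2986.37", "2500")

print(below_40, above_40)
```

Decimal is used instead of float so that the cents match the table exactly.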
Big Data/Data Science/ML Foundation
3 days Classroom Training
Our Big Data/Data Science/ML Foundation course is a good place to start if you have no experience with Big Data or data science. It covers best practices for devising a Big Data/data science/ML solution for your organization. The course teaches you the basic skills and expertise needed to dissect large volumes of data, leading to intelligent decision-making.
- 3 days classroom training
- Business and manager focused
- 6 months of online learning with weekly assignments and feedback
- Post-course video tutorials with support
Classroom Training Outline
Big Data/ Data Science Foundation Course Outline
|Time||Topic||Format||Description||Tool|
|Day 1|
|9:00 - 10:00||Machine Learning Lifecycle||Theory||Training and testing data. The machine learning life cycle is the cyclical process that data science projects follow. It defines each step that an organization needs to take in order to take advantage of machine learning and artificial intelligence (AI) to derive practical business value.|
|10:00 - 10:30||BI Versus Data Science||Theory||Business intelligence is the use of data to help make business decisions. Data analytics is a data science discipline. If business intelligence is the decision-making phase, then data analytics is the process of asking questions.|
|10:30 - 10:45 Tea break|
|10:45 - 12:00||Big Data Characteristics||Theory||Volume, Velocity, Variety. This section discusses the characteristics that define Big Data.|
|12:00 - 13:00 Lunch|
|13:00 - 14:00||Python Functional Programming||Practical||Lists, Dictionaries, Strings, Tuples, Functions. Python is a general-purpose programming language that is becoming more and more popular for doing data science. Companies worldwide are using Python to harvest insights from their data and get a competitive edge.||Python|
|14:00 - 14:45||Data Science tools||Practical||With over 6 million users, the open source Anaconda Distribution is the fastest and easiest way to do Python and R data science and machine learning on Linux, Windows, and Mac OS X. It's the industry standard for developing, testing, and training on a single machine.||Anaconda|
|14:45 - 15:00 Tea break|
|15:00 - 17:00||Python Data Structures||Practical||The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.||Jupyter Notebook|
|Day 2|
|9:00 - 10:00||Big Data Engineering||Theory||Clusters||Hadoop|
|10:00 - 10:30||Distributed Databases||Theory||NoSQL. NoSQL encompasses a wide variety of database technologies that were developed in response to the demands of building modern applications.||MongoDB|
|10:30 - 10:45 Tea break|
|10:45 - 11:15||Distributed Processing||Theory||Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.||Spark|
|11:15 - 12:00||Data Lakes||Theory||A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. While a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data.||Hadoop or S3|
|12:00 - 13:00 Lunch|
|13:00 - 14:00||NumPy||Practical||Mathematical Operations on matrices|
|14:00 - 14:45||Data Acquisition||Practical||Data Collection. API stands for Application Programming Interface. An API is a software intermediary that allows two applications to talk to each other. In other words, an API is the messenger that delivers your request to the provider you are requesting it from and then delivers the response back to you.||Beautiful Soup, APIs|
|14:00 - 14:45||Data Cleaning||Practical||Wrangling. Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.||Pandas|
|14:45 - 15:00 Tea break|
|15:00 - 17:00||Data Visualization||Practical||Charts. Data visualization is a general term that describes any effort to help people understand the significance of data by placing it in a visual context. Patterns, trends and correlations that might go undetected in text-based data can be exposed and recognized more easily with data visualization software.||Seaborn, Bokeh|
|Day 3|
|9:00 - 10:30||Machine Learning Algorithms||Practical||Supervised. Scikit-learn (formerly scikits.learn) is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting and k-means.||scikit-learn|
|10:30 - 10:45 Tea break|
|10:45 - 11:00||Linear Regression||Practical||In statistics, linear regression is a linear approach to modelling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables).||Regressor|
|11:00 - 11:30||K Nearest Neighbors||Practical||In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression.||Classifier|
|11:30 - 12:00||Naïve Bayes||Practical||In machine learning, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features.||Classifier|
|12:00 - 13:00 Lunch|
|13:00 - 17:00||Data Science project||Practical||In this project, we are going to use a very simple Titanic passenger survival dataset to show you how to start and finish a simple data science project using Python and Pandas; from exploratory data analysis, to feature selection and feature engineering, to model building and evaluation.||Pandas|
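To give a flavour of the Day 3 material, here is a minimal sketch of k-nearest neighbors classification as described in the outline: find the k closest training examples and take a majority vote. It is written in plain Python for illustration rather than with scikit-learn, which the course lists as its tool. The function name and the toy data are made up and are not course material.

```python
from collections import Counter
from math import dist  # Euclidean distance, available from Python 3.8


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank every training example by its distance to the query point.
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    # Keep the labels of the k closest examples and pick the most common one.
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up toy data: two well-separated groups of 2D points.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(points, labels, query=(2.0, 2.0), k=3)
print(prediction)
```

For regression, the same neighbor search would apply, with the vote replaced by an average of the neighbors' values.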