Data Science
About This Course
Find out the truth about what Data Science is. Hear from real practitioners telling real stories about what it means to work in data science. This course was formerly named Data Science 101.
Course Syllabus
Module 1 - Defining Data Science
- What is data science?
- There are many paths to data science
- Any advice for a new data scientist?
- What is the cloud?
- "Data Science: The Sexiest Job in the 21st Century"
Module 2 - What do data science people do?
- A day in the life of a data science person
- R versus Python?
- Data science tools and technology
- "Regression"
Module 3 - Data Science in Business
- How should companies get started in data science?
- Tips for recruiting data science people
- "The Final Deliverable"
Module 4 - Use Cases for Data Science
- Applications for data science
- "The Report Structure"
Module 5 - Data Science People
- Things data science people say
- "What Makes Someone a Data Scientist?"
Recommended skills prior to taking this course
- None
Requirements
- None
About This Course
This introduction to Python will kickstart your learning of Python for data science, as well as programming in general. This beginner-friendly Python course will take you from zero to programming in Python in a matter of hours. Upon its completion, you'll be able to write your own Python scripts and perform basic hands-on data analysis using our Jupyter-based lab environment. If you want to learn Python from scratch, this free course is for you. You can start creating your own data science projects and collaborating with other data scientists using IBM Watson Studio. When you sign up, you get free access to Watson Studio. Start now and take advantage of this platform.
Course Syllabus
Module 1 - Python Basics
- Your first program
- Types
- Expressions and Variables
- String Operations
Module 2 - Regression
- Linear Regression
- Non-linear Regression
- Model evaluation methods
Module 3 - Classification
- K-Nearest Neighbour
- Decision Trees
- Logistic Regression
- Support Vector Machines
- Model Evaluation
Module 4 - Unsupervised Learning
- K-Means Clustering
- Hierarchical Clustering
- Density-Based Clustering
Module 5 - Recommender Systems
- Content-based recommender systems
- Collaborative Filtering
Prerequisites for this course
- Python for data science
Recommended skills prior to taking this course
- This course includes a required hands-on lab. The tool that you use for the hands-on lab is called JupyterLab, and it is one of the most popular tools used by data scientists. If you are not familiar with JupyterLab, it is recommended that you take our free Data Science Hands-on with Open Source Tools course.
- The hands-on lab requires a working knowledge of the Python programming language as it applies to data analytics. To become proficient in data analysis with Python, it is recommended that you take the Data Analysis with Python course.
About This Course
Learn the basics of Apache Hadoop, a free, open source, Java-based programming framework. Why was it invented?
- Learn about Hadoop's architecture and core components, such as MapReduce and the Hadoop Distributed File System (HDFS).
- Learn how to add and remove nodes from Hadoop clusters, how to check available disk space on each node, and how to modify configuration parameters.
- Learn about other Apache projects that are part of the Hadoop ecosystem, including Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, and Flume, among others.
Course Syllabus
Module 1 - Introduction to Hadoop
- Understand what Hadoop is
- Understand what Big Data is
- Learn about other open source software related to Hadoop
- Understand how Big Data solutions can work on the Cloud
Module 2 - Hadoop Architecture
- Understand the main Hadoop components
- Learn how HDFS works
- List data access patterns for which HDFS is designed
- Describe how data is stored in an HDFS cluster
Module 3 - Hadoop Administration
- Add and remove nodes from a cluster
- Verify the health of a cluster
- Start and stop a cluster's components
- Modify Hadoop configuration parameters
- Set up a rack topology
Module 4 - Hadoop Components
- Describe the MapReduce philosophy
- Explain how Pig and Hive can be used in a Hadoop environment
- Describe how Flume and Sqoop can be used to move data into Hadoop
- Describe how Oozie is used to schedule and control Hadoop job execution
Recommended skills prior to taking this course
- Knowledge about Big Data Concepts
Requirements
- None
About this Data Visualization Course
"A picture is worth a thousand words." We are all familiar with this expression. It especially applies when trying to explain the insight obtained from the analysis of increasingly large datasets. Data visualization plays an essential role in the representation of both small and large-scale data. One of the key skills of a data scientist is the ability to tell a compelling story, visualizing data and findings in an approachable and stimulating way. Learning how to leverage a software tool to visualize data will also enable you to extract information, better understand the data, and make more effective decisions. The main goal of this Data Visualization with Python course is to teach you how to take data that at first glance has little meaning and present that data in a form that makes sense to people. Various techniques have been developed for presenting data visually, but in this course we will be using several data visualization libraries in Python, namely Matplotlib, Seaborn, and Folium. You can start creating your own data science projects and collaborating with other data scientists using IBM Watson Studio. When you sign up, you get free access to Watson Studio. Start now and take advantage of this platform.
Course Syllabus
Module 1 - Introduction to Visualization Tools
- Introduction to Data Visualization
- Introduction to Matplotlib
- Basic Plotting with Matplotlib
- Dataset on Immigration to Canada
- Line Plots
Module 2 - Basic Visualization Tools
- Area Plots
- Histograms
- Bar Charts
Module 3 - Specialized Visualization Tools
- Pie Charts
- Box Plots
- Scatter Plots
- Bubble Plots
Module 4 - Advanced Visualization Tools
- Waffle Charts
- Word Clouds
- Seaborn and Regression Plots
Module 5 - Creating Maps and Visualizing Geospatial Data
- Introduction to Folium
- Maps with Markers
- Choropleth Maps
Recommended skills prior to taking this course
- Knowledge about Big Data Concepts
Requirements
- Python 101
- Data Analysis with Python
About This Course
R is a powerful language for data analysis, data visualization, machine learning, and statistics. Originally developed for statistical programming, it is now one of the most popular languages in data science. In this course, you'll learn the basics of R, and you'll finish with the confidence to start writing your own R scripts. But this isn't your typical textbook introduction to R. You're not just learning R fundamentals; you'll be using R to solve problems related to movie data. Using a concrete example makes the learning painless. You will learn the fundamentals of R syntax, including assigning variables and doing simple operations with one of R's most important data structures: vectors! From vectors, you'll move on to lists, matrices, arrays, and data frames. Then you'll jump into conditional statements, functions, classes, and debugging. Once you've covered the basics, you'll learn about reading and writing data in R, whether it's in a table format (CSV, Excel) or a text file (.txt). Finally, you'll end with some important functions for character strings and dates in R.
Course Syllabus
Module 1 - R basics
- Math, Variables, and Strings
- Vectors and Factors
- Vector operations
Module 2 - Data structures in R
- Arrays & Matrices
- Lists
- Data frames
Module 3 - R programming fundamentals
- Conditions and loops
- Functions in R
- Objects and Classes
- Debugging
Module 4 - Working with data in R
- Reading CSV and Excel Files
- Reading text files
- Writing and saving data objects to file in R
Module 5 - Strings and Dates in R
- String operations in R
- Regular Expressions
- Dates in R
Recommended skills prior to taking this course
- None
Requirements
- None
