Build a Real-Time Dashboard on Amazon Web Services

October 21, 2020

This blog was posted by WeCloudData student Luis Vieira.

Figure: Kibana dashboard on Amazon Web Services showing a map and data.

  • Build a Real-time Stream Processing Pipeline with Apache Flink on Amazon Web Services;
  • Build and Run Streaming Applications with Amazon Kinesis and a Firehose Consumer.

In today’s business environments, data is generated in a continuous fashion by a steadily increasing number of diverse data sources.

Architecture

A reliable and scalable streaming architecture based on Flink, Kinesis Data Streams, and Kinesis Data Firehose

1. Service Level Agreement: Requirements
  • Consistency
  • Low latency
  • High availability
  • High throughput
  • Scale
  • Data durability
Figure: stream processing pipeline example.
Pipeline
Figure: process pipeline showing Amazon EMR.
Elements
Figure: process pipeline elements.
Implement Elements

2. Building elements

  • Create the streams to capture and store the taxi fleet records, adding or removing shards to scale throughput;
  • Provision and manage a cluster for your big data needs (EMR) running Apache Flink, which reads events from Kinesis and processes them by performing transformations;
  • Set up an Elasticsearch cluster to integrate with Kibana;
  • Lambda:

– CloudFormation/Flink (supports the stack and populates);

– Firehose (transforms records and loads them into Elasticsearch Service);

  • Set up Firehose to deliver stream data;
  • Inspect derived data with Kibana.

3. Let’s demo using CloudFormation and Apache Flink

Note: before you start, you will need to create an AWS account.

3.1. CloudFormation-Flink: Building the runtime artifacts and creating the infrastructure

  • Execute the first CloudFormation template to create an AWS CodePipeline pipeline, which builds the artifacts by means of AWS CodeBuild in a serverless fashion: Execute HERE. (check your region*)
  • When the first template is created and the runtime artifacts are built, execute the second CloudFormation template, which creates the resources of the reference architecture described earlier: Execute HERE. (check your region*)

3.2. Set Flink on EMR: Enable Web Connection

  • You will see this screen:
Figure: EMR web link connection.

Step 1: Open an SSH Tunnel to the Amazon EMR Master Node;

Download PuTTY: latest release (0.74)


www.chiark.greenend.org.uk
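On macOS or Linux the same tunnel can be opened with the OpenSSH client instead of PuTTY. The sketch below only assembles the command; it assumes dynamic port forwarding on local port 8157 (the port used in the AWS EMR documentation example) and the default `hadoop` user on the master node. The key path and host name are placeholders, and the helper name is hypothetical.

```python
import shlex
from typing import List


def emr_tunnel_command(key_path: str, master_dns: str, local_port: int = 8157) -> List[str]:
    """Build an ssh command that opens a SOCKS tunnel to the EMR master node."""
    return [
        "ssh",
        "-i", key_path,          # private key of the EC2 key pair used by the cluster
        "-N",                    # do not run a remote command, only forward ports
        "-D", str(local_port),   # dynamic (SOCKS) port forwarding on this local port
        f"hadoop@{master_dns}",  # default EMR login user, assumed here
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own key file and master public DNS.
    cmd = emr_tunnel_command("~/mykeypair.pem", "ec2-xx-xx-xx-xx.compute-1.amazonaws.com")
    print(shlex.join(cmd))
```

The printed command can be pasted into a terminal; it stays in the foreground for as long as the tunnel is open.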

Step 2: Configure a proxy management tool;

https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-connect-master-node-proxy.html

  • Step 1: EMR-EC2
  • Step 2: Flink Interface

Note: your region and flags/checkpoint will match the ones generated for your environment. The values above were generated when I was creating this post.

3.3. Ingesting trip events into the Amazon Kinesis stream

Note: your region and flags/checkpoint will match the ones generated for your environment. The values above were generated when I was creating this post.
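If you would rather ingest events from your own script, the sketch below shows one way to prepare trip events for the Kinesis PutRecords call. It is not the tool used in this demo. It relies on the PutRecords limit of 500 records per request; the `trip_id` field and the choice of partition key are hypothetical, and the sketch ignores the separate size limit per request.

```python
import json
from typing import Dict, Iterable, Iterator, List

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_entry(trip: Dict) -> Dict:
    """Turn one trip event into a PutRecords entry (Data plus PartitionKey)."""
    return {
        "Data": json.dumps(trip).encode("utf-8"),
        # Hypothetical choice: key by trip_id so events spread across shards.
        "PartitionKey": str(trip["trip_id"]),
    }


def batches(trips: Iterable[Dict], size: int = MAX_RECORDS_PER_CALL) -> Iterator[List[Dict]]:
    """Group entries into lists that fit into a single PutRecords request."""
    batch: List[Dict] = []
    for trip in trips:
        batch.append(to_entry(trip))
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


if __name__ == "__main__":
    events = [{"trip_id": i, "passenger_count": 1} for i in range(1200)]
    for batch in batches(events):
        # With boto3 installed and credentials configured, each batch could be sent with
        # boto3.client("kinesis").put_records(StreamName="<your stream>", Records=batch)
        print(len(batch))
```

A production producer would also inspect the FailedRecordCount in each response and retry the failed entries.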

3.4 Exploring Kibana

3.5 Scaling
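One detail to keep in mind when scaling a stream: the Kinesis UpdateShardCount operation does not allow scaling up to more than double, or down to less than half, of the current shard count in a single call. The sketch below plans the intermediate shard counts under that rule. The function name is hypothetical, and other limits (such as the number of scaling operations allowed per day) are not modelled.

```python
import math
from typing import List


def resharding_steps(current: int, target: int) -> List[int]:
    """Return the shard counts to request, one per UpdateShardCount call."""
    if current < 1 or target < 1:
        raise ValueError("shard counts must be at least 1")
    steps: List[int] = []
    while current != target:
        if target > current:
            # Scaling up: at most double the current count per call.
            current = min(target, current * 2)
        else:
            # Scaling down: no lower than half the current count per call.
            current = max(target, math.ceil(current / 2))
        steps.append(current)
    return steps


if __name__ == "__main__":
    print(resharding_steps(2, 9))
    print(resharding_steps(10, 2))
```

Each value in the result would correspond to one call such as `update_shard_count(StreamName=..., TargetShardCount=n, ScalingType="UNIFORM_SCALING")` in boto3, waiting for the stream to become active again between calls.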

4. Let’s demo using Kinesis Firehose

Building the runtime artifacts and creating the infrastructure. (Steps)

4.1 Create a Kinesis data stream.

4.2 Connect Kinesis consumers.

  • Deliver records with Firehose.

4.3 Create Firehose

  • Enable Lambda to transform source records (taxi fleet records);
  • Specify the index for ES (the Kibana index);
  • Select the Elasticsearch Service destination;
  • Set a bucket for backup.
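The Lambda transformation step follows a fixed contract: Firehose passes a batch of base64-encoded records, and the function returns every record with the same recordId, a result of Ok, Dropped, or ProcessingFailed, and base64-encoded data. Below is a minimal sketch of such a handler. The field names (`pickup_latitude`, `pickup_longitude`, `location`) are assumptions for illustration and not the exact transformation used in this project.

```python
import base64
import json


def _decode(data_b64: str) -> dict:
    """Decode one Firehose record payload into a dict, or raise ValueError."""
    trip = json.loads(base64.b64decode(data_b64))
    if not isinstance(trip, dict):
        raise ValueError("expected a JSON object")
    return trip


def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        try:
            trip = _decode(record["data"])
        except ValueError:
            # Unparseable input: hand the original data back and mark it as failed.
            output.append({"recordId": record["recordId"],
                           "result": "ProcessingFailed",
                           "data": record["data"]})
            continue
        lat = trip.get("pickup_latitude")
        lon = trip.get("pickup_longitude")
        if lat is None or lon is None:
            # Hypothetical rule: records without coordinates are not indexed.
            output.append({"recordId": record["recordId"],
                           "result": "Dropped",
                           "data": record["data"]})
            continue
        # Hypothetical enrichment: a lat/lon object that an Elasticsearch
        # geo_point mapping could use for a Kibana map.
        trip["location"] = {"lat": lat, "lon": lon}
        payload = json.dumps(trip).encode("utf-8")
        output.append({"recordId": record["recordId"],
                       "result": "Ok",
                       "data": base64.b64encode(payload).decode("utf-8")})
    return {"records": output}
```

Records marked ProcessingFailed are treated by Firehose as unsuccessfully processed, which is one reason a backup bucket is worth configuring.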

4.4 Inspect derived data with Kibana

Exploring Kibana

5. References

New York City Taxi and Limousine Commission (TLC) Trip Record Data

Data of Trips taken by taxis and for-hire vehicles in New York City.


I hope this blog helped clear up the concepts behind real-time stream data and dashboards on AWS. Don’t forget to like this blog if you genuinely liked it.

Follow for more awesome blogs!

https://aws.amazon.com/blogs/big-data/

To find out more about the courses our students have taken to complete these projects and what you can learn from WeCloudData, click here to see the learning path. To read more posts from Luis, check out his Medium posts here.
