A Certified Hadoop Developer Programme
Ms Lam, Tel: 2788 5800


This 3-day hands-on programme delivers the key concepts and expertise needed to use Hadoop and Kafka to develop high-performance parallel applications. Python, R and Kafka Streaming will be used to perform real-time processing on streaming data from a variety of sources. Developers will also practise writing applications using core Hadoop to perform ETL processing and iterative algorithms.

The programme will cover how to work with “big data” stored in a distributed file system and how to execute applications on a Hadoop cluster. After this programme, participants will be prepared to face real-world challenges and build applications that support faster and better decisions through interactive analysis, applicable to a wide variety of use cases, architectures, and industries.

Course Outline

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem. The topics covered are highlighted below:

Day 1

  • HDFS Administration
  • Hadoop Resource Management
  • Apache Hadoop End-to-end encryption
  • Apache Hadoop Node Management
  • Big Data for Machine Learning / AI
  • Advanced Hadoop KPI

Day 2

  • Kafka Integration in Hadoop
  • Real-time Data Streaming with Analytics
  • Writing Hadoop Programs in Python
  • Read/Write Sequence Files Directly to HDFS from Python
  • Working on clusters with Python and any Python libraries

Day 3

  • Using R for Data Mining
  • R Graphics (Base graphics, Lattice graphics)
  • Statistical Functions for Probability, Simulation and Data Analysis

** This programme is a preparatory course for the Cloudera CCA Data Analyst / Developer Certification. Upon completion, attendees are encouraged to continue their studies and register for the Cloudera certifications.

Award of Certificate of Attendance

Participants with full attendance will be awarded a Certificate of Attendance issued by the Hong Kong Productivity Council.


Basic knowledge of Linux and programming (such as C, C++, Java, Python or R)

Trainer Information

  • Patrick TSOI is an experienced trainer with hands-on experience in data science, learning management, instructional design and programming. He holds a Doctor of Education from Hong Kong Baptist University, a Master in IT Education from The University of Hong Kong and a B.Eng in System Engineering and Engineering Management from The Chinese University of Hong Kong.
  • Simon MOK has been a professional IT trainer for over 10 years, covering IoT, data analytics, AI, machine learning and programming. He has rich experience in leading development teams to deliver software solutions for clients. He holds an M.Phil from The University of Hong Kong and an MSc in Computer Science from The Chinese University of Hong Kong.


2, 9, 16 Dec 2020 (Wed)


09:30 – 17:00


Total 18 training hours


Cantonese, and supplemented with English for technical terms

Course Fee

HK$5,500 (subsidy of up to HK$3,666* may apply)

* Maximum saving, with the final grant subject to approval.
This course is an approved Reindustrialisation and Technology Training Programme (RTTP) with up to 2/3 course fee reimbursement upon successful application. For details: https://rttp.vtc.edu.hk.
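
For readers who want to check the figures, the short Python sketch below reproduces the fee arithmetic. It assumes the maximum subsidy is two-thirds of the course fee rounded down to a whole dollar, which matches the HK$3,666 quoted above; the function name `max_subsidy` is illustrative only, and the actual grant remains subject to RTTP approval.

```python
from fractions import Fraction


def max_subsidy(course_fee: int, rate: Fraction = Fraction(2, 3)) -> int:
    """Return the largest whole-dollar subsidy at the given rate.

    Assumption: the amount is rounded down to a whole dollar.
    Fraction is used so that 2/3 is exact rather than a float.
    """
    return int(course_fee * rate)  # int() truncates the fractional part


course_fee = 5500                  # HK$, as listed under Course Fee
subsidy = max_subsidy(course_fee)  # 5500 * 2/3 = 3666.67, rounded down
net_fee = course_fee - subsidy     # what the participant would pay

print(f"Maximum subsidy: HK${subsidy:,}")
print(f"Net fee after maximum subsidy: HK${net_fee:,}")
```

Under these assumptions the maximum subsidy is HK$3,666, leaving a net fee of HK$1,834.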

