Image by Author
People say you need to consider value for money when buying things. However, the best value for money is getting something good for free. But do such things exist? Supposedly not, if we go by the saying, “No such thing as a free lunch.”
I claim there is a free lunch, and I’m about to prove it! I dug out 10 educational ‘free lunches’: free data engineering courses that also provide quality knowledge. It’s true that there’s much more choice and variety if you can or want to pay tens, hundreds, sometimes even thousands of dollars.
Many such courses are counted as free on some other free course lists. Paying $90 one-off or $45/month is free to some people. But many people don’t have that money for a ‘free’ course, despite being very willing to learn data engineering. (Also, let’s get real! Free really means, well, free! Not ‘cheap’, not ‘very little money’, or ‘affordable’. Free!)
From what I researched, these courses really are free. Many are from edX. If you choose free access to a course, you have to complete it within a certain time, usually around six months. But that should be enough to complete every course comfortably. Also, free access means you don’t get lifetime access to all the materials (they’re deleted once you finish) and you don’t get a certificate. Despite this, you should be able to use these courses to learn data engineering.
Before I talk about the courses, let’s briefly review the data engineer’s role. That way, knowing what to look for in the courses will be easier.
Understanding the Role of a Data Engineer
Very simply, data engineers are in charge of making data available to data team members and other stakeholders. In doing so, they wrangle data and build and maintain data infrastructure, e.g., ETL processes, data pipelines, and data storage.
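To make the ETL part of that role concrete, here is a minimal sketch of an extract-transform-load pipeline using only Python's standard library. Everything in it is an assumption made for illustration: the sample CSV text, the `orders` table, and the function names `extract`, `transform`, `load`, and `run_pipeline` are hypothetical and do not come from any of the courses below.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this might come from a file, an API, or a source database.
RAW_CSV = """order_id,customer,amount
1, alice ,19.99
2,BOB,5.00
3,carol,not_a_number
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize customer names, cast amounts, drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run the three stages in order and return what ended up in storage."""
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute(
        "SELECT customer, amount FROM orders ORDER BY order_id"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(connection))
```

Real pipelines add scheduling, logging, and error handling on top of this shape, but the three stages stay recognizable.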
Naturally, the courses should cover all or some of these skills. Let’s take a closer look at the courses (pun intended) that will make up your educational free lunch.
Free Data Engineering Courses
1. Data Engineering by ASU
Platform and link to the course: edX
Duration: 5 weeks at 1-9 hours/week; learn at your own pace
Description: This introductory-level course by Arizona State University focuses on working with databases in data engineering and how to interact with them using SQL. You’ll learn about database structure, the star schema, and joining data from multiple tables. In the final stage, you’ll learn how to create reports with SQL and write scripts for data processing.
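For readers who have not met a star schema before, the sketch below shows the general idea: one fact table joined to small dimension tables to produce a report. It is an illustration under assumptions, not material from the ASU course. The table names (`fact_sales`, `dim_product`, `dim_date`), the sample rows, and the `sales_report` helper are all hypothetical, and it uses Python's built-in `sqlite3` module so that it runs without a database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table referencing two dimension tables.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_date VALUES (10, 2023), (11, 2024);
INSERT INTO fact_sales VALUES (1, 10, 20.0), (1, 11, 30.0), (2, 11, 50.0), (2, 11, 25.0);
""")

# Report: join the fact table to each dimension, then aggregate per category and year.
REPORT_SQL = """
SELECT p.category, d.year, SUM(f.revenue) AS total_revenue
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date AS d ON d.date_id = f.date_id
GROUP BY p.category, d.year
ORDER BY p.category, d.year
"""


def sales_report(connection):
    """Return (category, year, total_revenue) rows from the star schema."""
    return connection.execute(REPORT_SQL).fetchall()


for row in sales_report(conn):
    print(row)
```

The fact table holds the measurements (revenue) and only keys into the dimensions, which is what keeps reporting queries like this one short.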
2. Python and Pandas for Data Engineering by Pragmatic AI Labs
Platform and link to the course: edX
Duration: 4 weeks at 3-6 hours/week; learn at your own pace
Description: In yet another introductory edX course, you’ll learn Python and pandas for data engineering. The introduction to Python includes topics such as simple statements, if statements, while loops, and functions. Then, you’ll learn about data manipulation in pandas (particularly DataFrames) and its alternatives, such as NumPy, Spark, and PySpark. In the last module, you’ll learn about Python development environments and version control.
3. Scripting with Python and SQL for Data Engineering by Pragmatic AI Labs
Platform and link to the course: edX
Duration: 4 weeks at 3-6 hours/week; learn at your own pace
Description: If you want to learn SQL and Python for data engineering simultaneously, this is the course for you. You’ll use Python’s built-in data structures to manipulate data and write Python scripts for data task automation. The course also teaches you web scraping and using SQLite to store and query data in Python. Regarding SQL, you’ll learn how to import and export data from a MySQL database and how to execute MySQL queries in VSCode.
4. Cloud Data Engineering by Pragmatic AI Labs
Platform and link to the course: edX
Duration: 4 weeks at 3-6 hours/week; learn at your own pace
Description: This course will teach you data engineering in the cloud. You’ll learn about methodologies in data engineering, develop distributed systems, serverless data engineering systems, and cloud ETL pipelines, and learn about data governance. In the process, you’ll get in touch with technologies such as:
- CUDA
- Numba
- ASICs
- Colab Pro
- Colab API
- Google BigQuery
- AWS
- Databricks SQL
- Click
- Python
- Rust
This is also an introductory course with no prerequisites needed.
5. Building ETL and Data Pipelines with Bash, Airflow and Kafka by IBM
Platform and link to the course: edX
Duration: 5 weeks at 2-4 hours/week; learn at your own pace
Description: This data engineering course focuses on building ETL and data pipelines. During the course, you’ll learn what ETL and ELT processes are, create ETL pipelines using Bash shell scripts, use Apache Airflow to create batch data pipelines, and use Apache Kafka for streaming data pipelines.
This is an introductory course to these topics but requires experience working with relational databases, SQL, and Bash shell scripting.
6. Data Warehousing and BI Analytics by IBM
Platform and link to the course: edX
Duration: 6 weeks at 2-3 hours/week; learn at your own pace
Description: This intermediate course by IBM teaches you the essentials of data warehouses, data marts, and data lakes. You’ll learn how to design, model, and implement data warehouses. More specifically, you’ll use CUBEs, ROLLUPs, materialized views, and tables. You’ll also learn about data and dimensional modeling, data modeling with star and snowflake schemas, staging areas for data warehouses, data quality, and populating a data warehouse with data. In the third module, you’ll work on data warehouse analytics in Cognos Analytics.
The course requires experience with SQL and relational databases.
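If ROLLUP is new to you, the idea is that a single query returns the detailed groups plus subtotals and a grand total. The sketch below emulates that behavior in plain Python rather than SQL, purely to show what the result looks like. The `rollup` function, the `SALES` sample rows, and the use of `None` to mark a rolled-up column are assumptions made for this illustration and are not taken from the IBM course.

```python
from collections import defaultdict

# Hypothetical sales rows used only for this illustration.
SALES = [
    {"category": "books", "year": 2023, "revenue": 20.0},
    {"category": "books", "year": 2024, "revenue": 30.0},
    {"category": "games", "year": 2024, "revenue": 75.0},
]


def rollup(rows, keys, measure):
    """Sum `measure` for every prefix of `keys`, the way SQL's ROLLUP does.

    A rolled-up column is represented as None, so with keys (category, year)
    the result has (category, year) groups, (category, None) subtotals,
    and a (None, None) grand total.
    """
    totals = defaultdict(float)
    for row in rows:
        # depth = number of leading grouping columns kept for this level
        for depth in range(len(keys), -1, -1):
            group = tuple(row[k] for k in keys[:depth]) + (None,) * (len(keys) - depth)
            totals[group] += row[measure]
    return dict(totals)


for group, total in rollup(SALES, ("category", "year"), "revenue").items():
    print(group, total)
```

In a warehouse that supports it, the same result would typically come from a `GROUP BY ROLLUP (category, year)` clause. CUBE extends the idea to every combination of the grouping columns instead of only the prefixes.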
7. Apache Spark for Data Engineering and Machine Learning by IBM
Platform and link to the course: edX
Duration: 3 weeks at 2-3 hours/week; learn at your own pace
Description: Yet another intermediate course, this one focused on teaching Apache Spark. It’s an important tool in data engineering, so you’ll learn about Spark Structured Streaming, GraphFrames, ETL processes, and ML pipelines. In addition, you’ll learn ML fundamentals, such as regression, classification, and clustering.
The course requires foundational Apache Spark knowledge. It’s also suggested that you complete the Big Data, Hadoop, and Spark Basics course by IBM.
8. DE Zoomcamp
Platform and link to the course: DataTalks.Club
Duration: 10 weeks; learn at your own pace
Description: Finally, a course from a different platform! This online bootcamp will provide you with comprehensive data engineering knowledge. It’ll teach you containerization and infrastructure, workflow orchestration, data warehousing, analytics engineering, batch processing, and streaming. You’ll be introduced to technologies such as Google Cloud Platform, Terraform, Docker, SQL, Mage, dbt, Apache Spark, and Apache Kafka.
The prerequisite for this bootcamp is knowing the SQL basics. It’s also preferable that you have experience with Python or, if not, some other programming language.
9. DE End-to-End Projects
Platform and link to the course: DE Academy
Duration: No information.
Description: This is a project-based course in which you’ll learn how to use AWS, Snowflake, Python, Kafka, Azure, Databricks, Airflow, and Tableau. You’ll analyze and transform data, migrate it, and streamline workflows.
10. Scala Programming for Data Science
Platform and link to the course: Cognitive Class AI
Duration: 20 hours; learn at your own pace
Description: This learning path consists of three courses. The first is Scala 101, which will teach you the basics of object-oriented programming, case objects & classes, collections, and idiomatic Scala. In the second course, Spark Overview for Scala Analytics, you will be introduced to Apache Spark, RDDs, DataFrames for large-scale data science, and advanced Spark topics (e.g., Hive with Spark, Spark streaming). The third course is about Scala in data science, where you’ll learn basic statistics and data types, how to prepare data, engineer features, fit a model, build a pipeline, and perform grid search.
Conclusion
It’s no surprise that things are easier if you have money: you get access to more courses, and more diverse ones. Yeah, it sucks not having money! But this doesn’t mean you have to say goodbye to your dream of landing a data engineer role.
They’re much harder to find, but there are still some good free courses that can teach you basic and more advanced data engineering. I found ten of them. Other free resources, such as blogs or YouTube videos, can also help you reach the required level of knowledge.
If you’re industrious, dedicated, and persistent enough, I’m sure you can land a data engineering role for free.
Nate Rosidi is a data scientist and works in product strategy. He’s also an adjunct professor teaching analytics, and is the founder of StrataScratch, a platform helping data scientists prepare for their interviews with real interview questions from top companies. Nate writes on the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.