
Scaling Pandas and Python

Do these Pandas Alternatives actually work? In this video we benchmark several alternatives to the Python pandas library and compare their speed on a large dataset. We look at four libraries: Dask, Modin, Ray, and Vaex. Pandas is a very popular library among data scientists who code in Python, and other libraries claim to be faster than it. We put them to the test to see which is fastest!

The BEST library for building Data Pipelines... Building data pipelines with Python is an important skill for data engineers and data scientists. But what's the best library to use? In this video we look at three options: pandas, Polars, and Spark (PySpark).

Scaling Pandas: Comparing Dask, Ray, Modin, Vaex, and RAPIDS

modin

Alejandro Herrera - Supercharging your pandas workflows with Modin | PyData Global 2022
