September 15, 2024

I’ve been gradually improving my data wrangling tool, Easy Data Transform, putting out 70 public releases since 2019. While the product’s emphasis is on ease of use, rather than pure performance, I’ve been trying to make it fast as well, so it can cope with the multi-million row datasets customers like to throw at it. To see how I was doing, I did a simple benchmark of the latest version of Easy Data Transform (v1.37.0) against several other desktop data wrangling tools. The benchmark did a read, sort, join and write of a 1 million row CSV file. I did the benchmarking on my Windows development PC and my Mac M1 laptop.
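For readers who want to try something similar, here is a minimal sketch of how the four tasks (read, sort, join, write) might be timed separately in Python. It uses only the standard library and is not how any of the tools in this comparison were actually driven. The function names, the `id` key column and the second lookup file are all assumptions made for illustration.

```python
import csv
import time


def timed(func, *args):
    """Run func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def read_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def sort_rows(rows, key):
    """Sort rows by one column (compared as text)."""
    return sorted(rows, key=lambda row: row[key])


def join_rows(left, right, key):
    """Inner join on `key`; assumes `key` is unique in `right`."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


def write_rows(path, rows):
    """Write rows to a CSV file; assumes at least one row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def run_benchmark(main_path, lookup_path, out_path, key="id"):
    """Time each task and return a dict of seconds per task."""
    timings = {}
    # The read stage covers both input files (an assumption of this sketch).
    (main, lookup), timings["read"] = timed(
        lambda: (read_rows(main_path), read_rows(lookup_path))
    )
    main_sorted, timings["sort"] = timed(sort_rows, main, key)
    joined, timings["join"] = timed(join_rows, main_sorted, lookup, key)
    _, timings["write"] = timed(write_rows, out_path, joined)
    return timings
```

Calling `run_benchmark("main.csv", "lookup.csv", "out.csv")` (hypothetical file names) returns the elapsed seconds for each task, which can then be compared across machines. Timing each task separately, rather than the whole run, is what makes a per-task chart possible.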

Easy Data Transform screenshot

Here is an overview of the results:

Time by task (seconds), on Windows without Power Query (smaller is better):

data wrangling/ETL benchmark Windows

I’ve left Excel Power Query off this graph, as it is so slow you can hardly see the other bars when it’s included!

Time by task (seconds) on Mac (smaller is better):

data wrangling/ETL benchmark M1 Mac

Memory usage (MB), Windows vs Mac (smaller is better):

data wrangling/ETL benchmark memory Windows vs Mac

So Easy Data Transform is almost as fast as its nearest competitor, Knime, on Windows and a fair bit faster on an M1 Mac. It also uses a lot less memory than Knime. However, we have got some way to go to catch up with the Pandas library for Python and the data.table package for R, when it comes to raw performance. Hopefully I can get closer to their performance in time. I was forbidden from including benchmarks for Tableau Prep and Alteryx by their licensing terms, which seems unnecessarily restrictive.

Looking at just the Easy Data Transform results, it’s interesting to notice that a newish Macbook Air M1 laptop is significantly faster than an AMD Ryzen 7 desktop PC from a few years ago.

Windows vs Mac M1 benchmark

See the full comparison:

Comparison of data wrangling/ETL tools : R, Pandas, Knime, Power Query, Tableau Prep, Alteryx and Easy Data Transform, with benchmarks

Got some data to clean, merge, reshape or analyze? Why not download a free trial of Easy Data Transform? No sign up required.