I’ve been gradually improving my data wrangling tool, Easy Data Transform, putting out 70 public releases since 2019. While the product’s emphasis is on ease of use, rather than pure performance, I’ve been trying to make it fast as well, so it can cope with the multi-million row datasets customers like to throw at it. To see how I was doing, I did a simple benchmark of the latest version of Easy Data Transform (v1.37.0) against several other desktop data wrangling tools. The benchmark did a read, sort, join and write of a 1 million row CSV file. I did the benchmarking on my Windows development PC and my Mac M1 laptop.
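For anyone who wants to try a similar read, sort, join and write timing on their own machine, here is a rough sketch using only the Python standard library. It is not the benchmark used for the results in this post: the two-column file layout, the `id` join key and the helper names `make_csv` and `run_benchmark` are all assumptions made for illustration.

```python
import csv
import os
import random
import tempfile
import time


def make_csv(path, rows, seed=0):
    """Write a hypothetical test file with two columns: id, value."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "value"])
        for i in range(rows):
            writer.writerow([i, rng.randint(0, 1_000_000)])


def run_benchmark(left_path, right_path, out_path):
    """Time each task separately; return (timings, joined row count)."""
    timings = {}

    # Read: load both files fully into memory.
    start = time.perf_counter()
    with open(left_path, newline="") as f:
        left = list(csv.DictReader(f))
    with open(right_path, newline="") as f:
        right = list(csv.DictReader(f))
    timings["read"] = time.perf_counter() - start

    # Sort: order the left table by its numeric value column.
    start = time.perf_counter()
    left.sort(key=lambda row: int(row["value"]))
    timings["sort"] = time.perf_counter() - start

    # Join: inner join on id (assumed unique in the right table).
    start = time.perf_counter()
    lookup = {row["id"]: row["value"] for row in right}
    joined = [
        {"id": row["id"], "value": row["value"], "value_right": lookup[row["id"]]}
        for row in left
        if row["id"] in lookup
    ]
    timings["join"] = time.perf_counter() - start

    # Write: save the joined table as CSV.
    start = time.perf_counter()
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "value", "value_right"])
        writer.writeheader()
        writer.writerows(joined)
    timings["write"] = time.perf_counter() - start

    return timings, len(joined)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        left_file = os.path.join(tmp, "left.csv")
        right_file = os.path.join(tmp, "right.csv")
        out_file = os.path.join(tmp, "joined.csv")
        # 100,000 rows keeps the demo quick; raise it to 1,000,000 for a fuller test.
        make_csv(left_file, 100_000, seed=1)
        make_csv(right_file, 100_000, seed=2)
        timings, count = run_benchmark(left_file, right_file, out_file)
        for task, seconds in timings.items():
            print(f"{task}: {seconds:.3f}s")
```

Timing each task separately, rather than the whole run, is what lets you chart time by task. Plain Python like this will be much slower than Pandas or data.table, so treat the numbers as a baseline for your own hardware, not as a comparison with any of the tools above.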
Here is an overview of the results:
Time by task (seconds), on Windows without Power Query (smaller is better):
I’ve left Excel Power Query off this graph, as it is so slow you can hardly see the other bars when it is included!
Time by task (seconds) on Mac (smaller is better):
Memory usage (MB), Windows vs Mac (smaller is better):
So Easy Data Transform is nearly as fast as its nearest competitor, Knime, on Windows and a fair bit faster on an M1 Mac. It also uses a lot less memory than Knime. However, we have got some way to go to catch up with the Pandas library for Python and the data.table package for R, when it comes to raw performance. Hopefully I can get closer to their performance in time. I was forbidden from including benchmarks for Tableau Prep and Alteryx by their licensing terms, which seems unnecessarily restrictive.
Looking at just the Easy Data Transform results, it is interesting to notice that a newish Macbook Air M1 laptop is significantly faster than an AMD Ryzen 7 desktop PC from a few years ago.
See the full comparison:
Got some data to clean, merge, reshape or analyze? Why not download a free trial of Easy Data Transform? No sign up required.