Round 13 of the Yelp Dataset Challenge started in January 2019, giving students the opportunity to win awards and to conduct analysis or research for academic use.
I wrote an article, Convert Yelp Dataset to CSV, that walks step by step through loading the very large files of the Yelp dataset and converting them to more manageable CSV files. The most notable is review.json, which weighs in at 5.2 gigabytes and holds more than 6 million reviews, one per row. At that size, it can be troublesome to load inside a Jupyter Notebook. After successfully converting the dataset, check out my next post for exploratory data analysis with visualizations of the dataset!
The code used for this conversion is available in my GitHub repository.
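To give a rough idea of what such a conversion can look like, here is a minimal sketch. It is not the code from the repository. It assumes review.json stores one JSON object per line and that every record has the same keys as the first one; the function name `json_lines_to_csv` is made up for this illustration. Reading line by line means the whole file never has to fit in memory at once.

```python
import csv
import json


def json_lines_to_csv(json_path, csv_path):
    """Stream a JSON Lines file to CSV one record at a time.

    Hypothetical helper for illustration. Returns the number of rows written.
    """
    rows_written = 0
    with open(json_path, "r", encoding="utf-8") as src, \
            open(csv_path, "w", encoding="utf-8", newline="") as dst:
        writer = None
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if writer is None:
                # Assumption: the first record's keys define the CSV columns.
                # Keys that show up only in later records are ignored.
                writer = csv.DictWriter(
                    dst,
                    fieldnames=list(record.keys()),
                    extrasaction="ignore",
                )
                writer.writeheader()
            writer.writerow(record)
            rows_written += 1
    return rows_written
```

Calling `json_lines_to_csv("review.json", "review.csv")` would then produce a CSV that can be read back in smaller pieces, for example with the `chunksize` argument of pandas `read_csv`.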