Explore the fundamentals of simple statistical analysis in data analytics, covering classification of data with the aid of graphs.
If your OTT recommendation engine is not doing a good job of suggesting a movie for you, it's time to take matters into your own hands!
Let's go nerdy and analyse some data to identify a movie to watch over the weekend. As with any quantitative analysis, let's deduce things one by one and arrive at a few options to watch.
In this tutorial, we'll look at IMDB movie data from 1910 to 2024 and follow the steps below to identify a few movie options.
Please use the attached sample data set to follow along with this tutorial.
Fig. 1: Snapshot of the data in the .CSV file
Step 1:
Log in to your free Talktodata.AI account and upload the data set. The screenshot below is for reference.
Fig. 2: Steps to log in and upload the sample data file
Step 2:
Identify the data outliers and remove them from the analysis. Removing outliers is important so that the results are not skewed by the very best and the very worst ratings.
For this, I'm using the command:
"can you identify the outliers from the dataset and exclude them from further analysis?"
We identified and excluded the outliers from the dataset.
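If you are curious what an outlier filter might look like under the hood, here is a minimal sketch using the common interquartile range (IQR) rule. The tutorial does not state which rule the assistant applies, so treat IQR as an assumption. The `rating` field name and the toy rows are also made up for illustration and are not taken from the sample data set.

```python
from statistics import quantiles

# Toy rows for illustration only; "title" and "rating" are assumed field names.
movies = [
    {"title": "Movie A", "rating": 6.8},
    {"title": "Movie B", "rating": 7.0},
    {"title": "Movie C", "rating": 7.1},
    {"title": "Movie D", "rating": 7.2},
    {"title": "Movie E", "rating": 7.3},
    {"title": "Movie F", "rating": 7.4},
    {"title": "Movie G", "rating": 7.5},
    {"title": "Movie H", "rating": 7.6},
    {"title": "Movie I", "rating": 1.0},  # unusually low rating
]


def remove_outliers_iqr(rows, key="rating", k=1.5):
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [row[key] for row in rows]
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if low <= row[key] <= high]


cleaned = remove_outliers_iqr(movies)
print(len(movies), "rows before,", len(cleaned), "rows after")
```

A larger `k` keeps more of the extreme ratings, while a smaller `k` trims more aggressively.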
Fig. 3: Graph Output
Step 3:
Now let's analyse the data further and shortlist a few genres from the past two decades.
First, I'm using the prompt: "Can you give me a visualisation of genre wise average ratings for the past two decades using the cleaned dataset?"
Fig. 4: Graph Output
Fig. 5: Genres with More than 10 Movies
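The genre comparison above can be approximated by hand as well. The sketch below averages ratings per genre inside a year window and drops genres with too few movies. The field names (`genre`, `year`, `rating`), the 2005 to 2024 window, and the toy rows are assumptions for illustration. The real data set may label things differently.

```python
from collections import defaultdict
from statistics import mean


def genre_averages(rows, start_year, end_year, min_movies=10):
    """Average rating per genre within a year window.

    Only genres with MORE than `min_movies` movies in the window are kept,
    so that a single title cannot decide a genre's average.
    """
    ratings_by_genre = defaultdict(list)
    for row in rows:
        if start_year <= row["year"] <= end_year:
            ratings_by_genre[row["genre"]].append(row["rating"])
    averages = {
        genre: round(mean(ratings), 2)
        for genre, ratings in ratings_by_genre.items()
        if len(ratings) > min_movies
    }
    # Highest average first
    return dict(sorted(averages.items(), key=lambda item: item[1], reverse=True))


# Toy rows for illustration only.
rows = [
    {"genre": "Drama", "year": 2010, "rating": 8.0},
    {"genre": "Drama", "year": 2015, "rating": 7.0},
    {"genre": "Drama", "year": 1995, "rating": 9.0},   # outside the window
    {"genre": "Comedy", "year": 2012, "rating": 6.0},
    {"genre": "Comedy", "year": 2018, "rating": 6.5},
    {"genre": "Comedy", "year": 2020, "rating": 7.0},
    {"genre": "Horror", "year": 2019, "rating": 9.5},  # only one movie
]

# The toy data is tiny, so the threshold is lowered to 1 here.
result = genre_averages(rows, 2005, 2024, min_movies=1)
print(result)
```

With the full data set, you would leave `min_movies` at 10 to match the "more than 10 movies" cut-off shown in Fig. 5.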
Step 4:
Now, for the top genres, let's see the top movies, their directors and their average ratings.
First, I'm using the prompt: "can you give me highest rated movies from the top 5 genres where the director has more than 3 movies?"
By restricting the prompt to directors with more than 3 movies, we again make sure that each average rests on enough films to be meaningful, rather than on a single hit.
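As a rough illustration of that filter, the sketch below counts movies per director, keeps only directors above the threshold, and then ranks the remaining movies in the chosen genres. It assumes the director count is taken over the whole data set, which the prompt leaves open. The field names, director names and ratings are hypothetical.

```python
from collections import Counter


def top_movies(rows, top_genres, min_director_movies=3, limit=5):
    """Highest-rated movies in `top_genres` by directors with MORE than
    `min_director_movies` movies anywhere in the data set."""
    director_counts = Counter(row["director"] for row in rows)
    eligible = [
        row
        for row in rows
        if row["genre"] in top_genres
        and director_counts[row["director"]] > min_director_movies
    ]
    return sorted(eligible, key=lambda row: row["rating"], reverse=True)[:limit]


# Toy rows for illustration only.
rows = [
    {"title": "Movie 1", "director": "Director X", "genre": "Drama", "rating": 7.9},
    {"title": "Movie 2", "director": "Director X", "genre": "Drama", "rating": 8.4},
    {"title": "Movie 3", "director": "Director X", "genre": "Fantasy", "rating": 7.6},
    {"title": "Movie 4", "director": "Director X", "genre": "Comedy", "rating": 6.9},
    {"title": "Movie 5", "director": "Director Y", "genre": "Drama", "rating": 9.0},
    {"title": "Movie 6", "director": "Director Y", "genre": "Fantasy", "rating": 8.8},
]

picks = top_movies(rows, top_genres={"Drama", "Fantasy"})
for movie in picks:
    print(movie["title"], movie["director"], movie["rating"])
```

Note that Director Y's two movies are rated higher but are excluded, because two films are too few to trust the director's track record.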
Now that we have this list, let's compare the IMDB scores and Metascores to arrive at a conclusion.
For this, I'm using the command: "can you plot the IMDB score and metascore for these movies?"
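One thing to keep in mind when reading such a plot: IMDB ratings run up to 10, while Metascores run up to 100, so the two are only comparable once they share a scale. A minimal sketch of that rescaling, using made-up titles and scores rather than values from the sample data set:

```python
# Toy rows for illustration only; field names and scores are assumptions.
shortlist = [
    {"title": "Movie 1", "imdb": 7.6, "metascore": 64},
    {"title": "Movie 2", "imdb": 8.1, "metascore": 85},
]


def compare_scores(rows):
    """Put Metascore on the 0-10 IMDB scale and report the gap per movie."""
    compared = []
    for row in rows:
        meta_on_10 = row["metascore"] / 10
        compared.append(
            {
                "title": row["title"],
                "imdb": row["imdb"],
                "metascore_on_10": meta_on_10,
                # Positive gap: audiences rated it higher than critics did.
                "gap": round(row["imdb"] - meta_on_10, 1),
            }
        )
    return compared


comparison = compare_scores(shortlist)
for row in comparison:
    print(row)
```

A movie where both scores are high and the gap is small is a safer pick than one where audiences and critics disagree sharply.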
Looks like we have picked Harry Potter for this weekend!
This is one example of how you can analyse your data while making sure that outliers are taken care of and that each sub-class of data has enough entries to be considered sound.
But this is not the only way to do this. Here are a few thought starters for you to try:
To ensure accurate results, it's important for us to ask the right questions.
Here is a simple guide on how to ask the right questions.
You can further enhance this visualisation by asking the assistant to add data tags, change graph colours, add an average line, and so on. Now go ahead and start your AI-assisted data analytics journey.
Click here to download the sample data set.
For any queries or support, please visit https://talktodata.ai/