Research Article Open Access

A Machine Learning Approach to Predict Movie Revenue Based on Pre-Released Movie Metadata

Quazi Ishtiaque Mahmud¹, Nuren Zabin Shuchi¹, Fazle Mohammed Tawsif¹, Asif Mohaimen¹ and Ayesha Tasnim¹
  • ¹ Shahjalal University of Science and Technology, Bangladesh

Abstract

As the movie industry grows, it is becoming increasingly important for stakeholders to estimate the profit a movie is likely to make at the box office. Among movies produced in the United States between 2000 and 2010, only 36% had box office revenues higher than their production budgets, which underscores the importance of making the right investment decisions. To address this issue, this study uses several machine learning algorithms, namely Logistic Regression, Support Vector Machine (SVM) and Multi Layer Perceptron (MLP), to predict the box office return of a movie from data available before its release. The models take 35 movie parameters from 3200 movies as inputs to predict the profit made by a movie and to classify its success on a scale from “flop” to “blockbuster” based on the generated revenue. An analysis of different machine learning architectures is also presented. Finally, a system is proposed that produces results comparable to existing research in this field and can predict the profit generated by a movie with a “one class away” accuracy of 85.31% without using any sales information.
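
The “one class away” accuracy quoted in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of the metric: success classes are ordered integers (for example 0 for “flop” up to the highest class for “blockbuster”), and a prediction counts as correct when it lands on the true class or on a directly adjacent one. The function name, the integer encoding and the sample labels are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def one_class_away_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions within one ordinal class of the true class.

    Assumption: classes are encoded as consecutive integers ordered from
    lowest revenue ("flop") to highest revenue ("blockbuster").
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    # A hit is an exact match or a miss by exactly one class.
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= 1)
    return hits / len(y_true)


# Hypothetical labels for five movies (not data from the study).
y_true = [0, 1, 2, 3, 4]
y_pred = [0, 2, 4, 3, 3]
print(one_class_away_accuracy(y_true, y_pred))
```

On these hypothetical labels only two of five predictions match exactly, while four of five fall within one class, which shows why this metric is more lenient than exact-match accuracy.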

Journal of Computer Science
Volume 16 No. 6, 2020, 749-767

DOI: https://doi.org/10.3844/jcssp.2020.749.767

Submitted On: 17 March 2020
Published On: 10 June 2020

How to Cite: Mahmud, Q. I., Shuchi, N. Z., Tawsif, F. M., Mohaimen, A. & Tasnim, A. (2020). A Machine Learning Approach to Predict Movie Revenue Based on Pre-Released Movie Metadata. Journal of Computer Science, 16(6), 749-767. https://doi.org/10.3844/jcssp.2020.749.767

Keywords

  • Continuous-Valued Features
  • Binary Features
  • Logistic Regression
  • Support Vector Machine
  • Linear Kernel
  • KNN
  • Polynomial Kernel
  • RBF Kernel
  • Multi Layer Perceptron
  • Activation Functions