Objective of the project
In groups of three and over three weeks, we had to build a project that studies a large accidentology dataset: it corrects the aberrant data, then predicts clusters both from scratch and with artificial intelligence predictive models. In the third week, we concluded by building a website with HTML5, CSS, JavaScript, PHP and MySQL to present our work.

A Big Data part
The main goal of this first part of the project was to use the R language to identify and then delete or modify the outliers and missing values in the Excel file provided by the French government for the year 2009.
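The cleaning step described above was done in R. As a rough illustration only, here is a minimal Python sketch of one common approach: flag outliers with the interquartile range (IQR) rule and replace both outliers and missing values with the median of the remaining values. The function name `clean_column`, the threshold `k=1.5` and the sample ages are assumptions made for this example, not the project's actual code or data.

```python
from statistics import median, quantiles


def clean_column(values, k=1.5):
    """Replace missing values (None) and IQR outliers with the median of the inliers."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)          # quartiles of the observed values
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    inliers = [v for v in present if low <= v <= high]
    fill = median(inliers)                        # replacement value
    return [v if v is not None and low <= v <= high else fill for v in values]


# Made-up driver ages: one missing value and one impossible age.
ages = [23, 35, None, 41, 29, 230, 38, 52]
print(clean_column(ages))
```

Replacing a value with the median keeps the row in the dataset. Deleting the row instead is the other option mentioned above, and which one is appropriate depends on the column.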
We then had to use RStudio to plot graphs showing the main accidentology trends for that year (location, groups of drivers, age, safety systems, etc.).
Using these graphs, we could establish the equations of regression lines, see the relationships between quantitative variables, and draw our first maps.
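To make the idea of a line equation between two quantitative variables concrete, here is a minimal Python sketch of an ordinary least-squares fit. It assumes the lines were simple linear regressions, which the text does not state explicitly. The function name `fit_line` and the sample numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values for two quantitative variables.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```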
At the end of the week, we had to prepare a clean and functional Excel file for the AI part. This data is processed by insurance companies to set their rates according to parameters linked to the driver’s risk of having an accident.
An Artificial Intelligence prediction
The second step of this project was to use our AI knowledge to produce different Python scripts for our website.
The data cleaned in the Big Data part was used for PCA dimensionality reduction and for clustering from scratch with different distance measures (L1, L2, Haversine). KNN classification was also implemented with and without scikit-learn. ROC curves were plotted to show the accuracy of our models. Then, the algorithms were ranked using metrics such as the Silhouette Coefficient, the Calinski-Harabasz Index, and the Davies-Bouldin Index.
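As an illustration of the three distances and of a KNN classifier written without scikit-learn, here is a minimal Python sketch. It is not the project's code: the function names, the default `k=3`, the Earth radius of 6371 km and the toy points and labels are all assumptions made for the example.

```python
from collections import Counter
from math import asin, cos, radians, sin, sqrt


def l1(a, b):
    """Manhattan (L1) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2(a, b):
    """Euclidean (L2) distance."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def haversine(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * radius_km * asin(sqrt(h))


def knn_predict(train_points, train_labels, query, k=3, distance=l2):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data with hypothetical severity labels.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]
labels = ["minor", "minor", "minor", "severe", "severe", "severe"]
print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5), distance=l1))
```

Passing the distance as a parameter lets the same classifier, or a from-scratch clustering routine, be run with L1, L2 or Haversine without changing the rest of the code. Haversine only makes sense when the points are latitude/longitude pairs.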
The Web Development part
The final stage of the project was to present the work done in the previous two weeks. We had to use JavaScript, PHP and HTML5/CSS for the web development, with PHP handling the backend operations.
A MySQL database was created to record new accidents, to display them on cluster maps, and to show their severity according to their parameters.




