Useful for Data Science/ML beginners
This dataset is intended for beginners in ML and Data Science. It can be used to practice visualizing data and applying supervised learning algorithms.
The dataset contains 401 rows and 4 columns: 'gre', 'sop', 'cgpa', and 'admitted'.
gre - marks scored out of 340
sop - marks scored out of 5
cgpa - marks scored out of 5
admitted - whether or not the person is eligible for admission
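As a first step before visualization or modeling, the column descriptions above can be turned into a simple range check. The sketch below is only illustrative: the sample rows are made up rather than taken from the dataset, the 0/1 encoding of 'admitted' is an assumption, and the helper names (load_rows, out_of_range, admission_rate) are hypothetical.

```python
import csv
import io

# Hypothetical sample rows, made up for illustration; NOT taken from the dataset.
# Assumption: 'admitted' is encoded as 0 (no) or 1 (yes).
SAMPLE_CSV = """gre,sop,cgpa,admitted
320,4.5,4.2,1
300,3.0,3.1,0
335,5.0,4.8,1
290,2.5,2.9,0
"""

# Upper bounds as stated in the column descriptions.
MAX_SCORE = {"gre": 340, "sop": 5, "cgpa": 5}


def load_rows(handle):
    """Parse CSV rows and convert every column to a number."""
    rows = []
    for record in csv.DictReader(handle):
        row = {name: float(record[name]) for name in MAX_SCORE}
        row["admitted"] = int(record["admitted"])
        rows.append(row)
    return rows


def out_of_range(rows):
    """Return (row_index, column) pairs whose value falls outside [0, max]."""
    return [
        (index, name)
        for index, row in enumerate(rows)
        for name, limit in MAX_SCORE.items()
        if not 0 <= row[name] <= limit
    ]


def admission_rate(rows):
    """Fraction of rows with admitted == 1."""
    return sum(row["admitted"] for row in rows) / len(rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
print("out-of-range values:", out_of_range(rows))
print("admission rate:", admission_rate(rows))
```

To run the same checks on the real file, replace the io.StringIO(...) argument with an open file handle; the file name is not given in this description, so it is left out here.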
The counts of arrests are derived from information transmitted from law enforcement agencies to the Division of Criminal Justice Services Computerized Criminal History database for fingerprintable offenses. Arrests shown involve individuals who were 18 or older when the crime was committed. Fingerprintable offenses (defined in Criminal Procedure Law §160.10) include any felony, a misdemeanor defined in the penal law, a misdemeanor defined outside the penal law which would constitute a felony if such a person had a previous judgment of conviction for a crime, or loitering for the purpose of engaging in prostitution as defined in subdivision two of Penal Law §240.37.
The dataset contains three files:
adult_data.csv - train dataset
adult_test.csv - test dataset
adult_descr.csv - file with description of the data
In my kernel I start by looking at what the dataset contains: which features are present and what information can be found and compared. Next I clean the data and prepare it in a form suitable for the models I test. Starting with basic classification models, moving through hyperparameter tuning, and ending with boosting algorithms, I try to find the best model, which is finally evaluated on the test dataset.
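The workflow described above (baseline classifier, hyperparameter tuning, boosting, one final evaluation on held-out data) might look roughly like the following scikit-learn sketch. It is not the kernel's actual code: the data is a synthetic stand-in generated with make_classification rather than adult_data.csv and adult_test.csv, and the choice of models and parameter grids is an assumption made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cleaned data (assumption: NOT the real adult files).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One basic classifier and one boosting model, each with a small tuning grid.
candidates = {
    "logreg": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.1, 1.0, 10.0]},
    ),
    "gboost": (
        GradientBoostingClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [2, 3]},
    ),
}

# Tune each candidate with cross-validation on the training split only.
searches = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)
    searches[name] = search

# Pick the model with the best cross-validated score, then test it once.
best_name = max(searches, key=lambda n: searches[n].best_score_)
best_model = searches[best_name].best_estimator_
test_accuracy = best_model.score(X_test, y_test)
print(best_name, round(test_accuracy, 3))
```

Keeping the test split out of the tuning loop means the final score is measured on data that played no part in model selection, which matches the role the description gives to adult_test.csv.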
"The Africa Power–Mining Database 2014 shows ongoing and forthcoming mining projects in Africa categorized by the type of mineral, ore grade, size of the project. The database draws on basic mining data from Infomine surveys, the United States Geological Survey, annual reports, technical reports, feasibility studies, investor presentations, sustainability reports on property-owner websites or filed in public domains, and mining websites (Mining Weekly, Mining Journal, Mbendi, Mining-technology, and Miningmx). Comprising 455 projects in 28 SSA countries with each project’s ore reserve value assessed at more than $250 million, the database collates publicly available and proprietary information. It also provides a panoramic view of projects operating in 2000–12 and anticipated demand in 2020. The analysis is presented over three timeframes: pre-2000, 2001–12, and 2020 (each containing the projects from the previous period except for those closing during that previous period)."
This dataset contains 45 indicators for 31 African cities, related to urbanization, solid waste management, water resources availability, water supply services, sanitation services, flood hazards and economic and institutional strength.
Tomato dataset, Agricultura Digital (Digital Agriculture)
Database built to illustrate the construction of machine learning models in the book Agricultura Digital.
If you intend to use the database for any purpose other than education, please get in touch by email: firstname.lastname@example.org
Source of part of the data: Dias, F.O. 2020. Modelagem de tendências espaciais na seleção de linhagens de tomateiro resistentes à Phytophthora infestans (Mont.) de Bary. pp. 49. Master's dissertation, Universidade Federal de Viçosa, Departamento de Agronomia. Advisor: Carlos Nick.