CShorten/ML-ArXiv-Papers | ATYUN.COM official site: a full-service platform for AI tutorials and news

Dataset:

CShorten/ML-ArXiv-Papers

License:

afl-3.0


This dataset contains the subset of ArXiv papers carrying the "cs.LG" tag, which indicates that the paper is about Machine Learning.

The core dataset is filtered from the full ArXiv dataset hosted on Kaggle: https://www.kaggle.com/datasets/Cornell-University/arxiv . The original dataset contains roughly 2 million papers. This dataset contains roughly 100,000 papers following the category filtering.

The dataset is maintained with requests to the ArXiv API.
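The card does not say how these requests are issued. As a minimal sketch, assuming the public arXiv API query endpoint and its documented `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters, a request URL for the most recent "cs.LG" submissions could be built like this. The helper name `build_query_url` and its defaults are hypothetical, not part of the dataset's tooling.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint (assumption: the maintainer's actual tooling is not documented here).
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(category="cs.LG", start=0, max_results=100):
    """Build an arXiv API URL listing papers in `category`, newest first."""
    params = {
        "search_query": f"cat:{category}",  # restrict results to one category
        "start": start,                     # offset into the result set, for paging
        "max_results": max_results,         # page size
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    # urlencode percent-encodes the colon in "cat:cs.LG" as %3A.
    return f"{ARXIV_API}?{urlencode(params)}"
```

The API responds with an Atom feed, so each entry's title and summary would still need to be parsed out of the XML before being appended to the dataset.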

The current iteration of the dataset only contains the title and abstract of the paper.
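The category filtering and the title/abstract restriction described above can be sketched as follows. This assumes the Kaggle snapshot's layout of one JSON record per line with a space-separated `categories` string. The function name `filter_ml_papers` is hypothetical, and this is an illustration rather than the maintainer's actual script.

```python
import json


def filter_ml_papers(lines, tag="cs.LG"):
    """Yield title/abstract pairs for records whose categories include `tag`.

    `lines` is any iterable of JSON strings, one record per line
    (assumed layout of the Kaggle metadata snapshot).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # `categories` is a space-separated string such as "cs.LG stat.ML".
        if tag in record.get("categories", "").split():
            yield {"title": record["title"], "abstract": record["abstract"]}
```

Because the function is a generator over lines, it can be fed an open file handle for the snapshot directly and never needs to hold the roughly 2 million source records in memory at once.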

The ArXiv dataset contains additional features that we may look to include in future releases. The top two features on the roadmap for integration are listed first:

authors
update_date
Submitter
Comments
Journal-ref
doi
report-no
categories
license
versions
authors_parsed

Author:

CShorten

Dataset size:

140.15 MB