Dataset:

Bingsu/Human_Action_Recognition

Language:

en

Size:

10K<n<100K

Source Datasets:

original

License:

odbl

Dataset Summary

A dataset from Kaggle. Origin: https://dphi.tech/challenges/data-sprint-76-human-activity-recognition/233/data

Introduction

  • The dataset features 15 different classes of human activities.
  • The dataset contains more than 12,000 labelled images, including the validation images.
  • Each image depicts exactly one human activity category and is saved in the folder of its labelled class.

PROBLEM STATEMENT

  • Human Action Recognition (HAR) aims to understand human behavior and assign a label to each action. It has a wide range of applications, and therefore has been attracting increasing attention in the field of computer vision. Human actions can be represented using various data modalities, such as RGB, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, radar, and WiFi signal, which encode different sources of useful yet distinct information and have various advantages depending on the application scenarios.
  • Consequently, many existing works have investigated different types of approaches to HAR using these various modalities.
  • Your task is to build an image-classification model using a CNN that predicts which class of activity a human is performing.
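The task above can be sketched as a small convolutional network. The architecture below is an illustrative, untuned example in PyTorch (the model name and layer sizes are assumptions, not part of the dataset or the challenge):

```python
import torch
import torch.nn as nn

class SmallHARCNN(nn.Module):
    """Minimal CNN sketch for the 15-class HAR task (illustrative, not tuned)."""
    def __init__(self, num_classes: int = 15):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # pool to 1x1 so any input size works
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)  # (batch, 32)
        return self.classifier(x)        # (batch, num_classes) logits

model = SmallHARCNN()
# A batch of 2 RGB images at the dataset's native 160x240 resolution.
logits = model(torch.randn(2, 3, 160, 240))
print(logits.shape)  # torch.Size([2, 15])
```

Any standard image classifier would do here; this sketch only shows that the output layer must have 15 logits, one per activity class.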

About Files

  • Train - contains all the images to be used for training your model. In this folder you will find 15 folders, namely 'calling', 'clapping', 'cycling', 'dancing', 'drinking', 'eating', 'fighting', 'hugging', 'laughing', 'listeningtomusic', 'running', 'sitting', 'sleeping', 'texting', 'using_laptop', each containing images of the respective human activity.
  • Test - contains 5400 images of human activities. For each of these images you are required to predict one of the same 15 class names.
  • Testing_set.csv - specifies the order in which the predictions for each image are to be submitted on the platform. Make sure the predictions you submit pair each image's filename with its predicted class in the same order as given in this file.
  • sample_submission - a CSV file containing a sample submission for the data sprint.
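Building a submission in the order required by Testing_set.csv can be sketched with the standard-library csv module. The filenames, the "filename"/"label" column names, and the prediction values below are all hypothetical placeholders for illustration:

```python
import csv
import io

# Hypothetical stand-in for Testing_set.csv; the real file lists the test
# image filenames in the required submission order.
testing_set_csv = "filename\nImage_1.jpg\nImage_2.jpg\nImage_3.jpg\n"

# Predictions keyed by filename, e.g. produced by a trained model (made-up values).
predictions = {
    "Image_2.jpg": "running",
    "Image_1.jpg": "sitting",
    "Image_3.jpg": "eating",
}

# Read the required order, then emit predictions in exactly that order.
order = [row["filename"] for row in csv.DictReader(io.StringIO(testing_set_csv))]
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["filename", "label"])
for name in order:
    writer.writerow([name, predictions[name]])

print(out.getvalue())
```

The point is simply that the submission rows must follow Testing_set.csv, not the order in which your model happened to produce predictions.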

Data Fields

The data instances have the following fields:

  • image: A PIL.Image.Image object containing the image. Note that when accessing the image column (dataset[0]["image"]), the image file is decoded automatically. Decoding a large number of image files can take a significant amount of time, so it is important to query the sample index before the "image" column: dataset[0]["image"] should always be preferred over dataset["image"][0].
  • labels: an integer classification label. Note that all test data is labeled 0 (a placeholder value).
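The row-first indexing advice above can be illustrated with a toy mock of lazy decoding. This mock is purely for illustration and is not the `datasets` implementation; the class names and counter are invented:

```python
class LazyImageColumn:
    """Toy stand-in for a lazily decoded image column (hypothetical)."""
    def __init__(self, n):
        self.n = n
        self.decoded = 0  # counts how many images have been decoded

    def decode(self, i):
        self.decoded += 1
        return f"image_{i}"

class LazyDataset:
    """Toy dataset supporting both row and column access (hypothetical)."""
    def __init__(self, n):
        self.col = LazyImageColumn(n)

    def __getitem__(self, key):
        if isinstance(key, int):
            # Row access: decode only the one image in this row.
            return {"image": self.col.decode(key)}
        if key == "image":
            # Column access: decode every image in the dataset.
            return [self.col.decode(i) for i in range(self.col.n)]
        raise KeyError(key)

ds = LazyDataset(12600)
ds[0]["image"]            # row-first: decodes exactly 1 image
print(ds.col.decoded)     # 1
# ds["image"][0] would decode all 12,600 images before returning the first.
```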

Class Label Mappings:

{
    'calling': 0,
    'clapping': 1,
    'cycling': 2,
    'dancing': 3,
    'drinking': 4,
    'eating': 5,
    'fighting': 6,
    'hugging': 7,
    'laughing': 8,
    'listening_to_music': 9,
    'running': 10,
    'sitting': 11,
    'sleeping': 12,
    'texting': 13,
    'using_laptop': 14
}
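The mapping above goes from name to integer id; in practice you often also need the inverse to turn predicted ids back into class names. A minimal sketch using only the mapping given above:

```python
# Name-to-id mapping as listed in the dataset card.
label2id = {
    'calling': 0, 'clapping': 1, 'cycling': 2, 'dancing': 3, 'drinking': 4,
    'eating': 5, 'fighting': 6, 'hugging': 7, 'laughing': 8,
    'listening_to_music': 9, 'running': 10, 'sitting': 11, 'sleeping': 12,
    'texting': 13, 'using_laptop': 14,
}

# Invert it to decode model predictions back into class names.
id2label = {v: k for k, v in label2id.items()}
print(id2label[11])  # sitting
```

With the `datasets` library, the same conversion is available via `ds["train"].features["labels"].int2str(11)`.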

Data Splits

              train  test
# of examples 12600  5400

Data Size

  • download: 311.96 MiB
  • generated: 312.59 MiB
  • total: 624.55 MiB

Usage

>>> from datasets import load_dataset

>>> ds = load_dataset("Bingsu/Human_Action_Recognition")
>>> ds
DatasetDict({
    test: Dataset({
        features: ['image', 'labels'],
        num_rows: 5400
    })
    train: Dataset({
        features: ['image', 'labels'],
        num_rows: 12600
    })
})

>>> ds["train"].features
{'image': Image(decode=True, id=None),
 'labels': ClassLabel(num_classes=15, names=['calling', 'clapping', 'cycling', 'dancing', 'drinking', 'eating', 'fighting', 'hugging', 'laughing', 'listening_to_music', 'running', 'sitting', 'sleeping', 'texting', 'using_laptop'], id=None)}
 
>>> ds["train"][0]
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=240x160>,
 'labels': 11}