Model:
facebook/convnextv2-atto-1k-224
ConvNeXt V2 model pretrained using the FCMAE framework and fine-tuned on the ImageNet-1K dataset at resolution 224x224. It was introduced in the paper ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders by Woo et al. and first released in this repository.
Disclaimer: The team releasing ConvNeXt V2 did not write a model card for this model, so this model card has been written by the Hugging Face team.
ConvNeXt V2 is a pure convolutional model (ConvNet) that introduces a fully convolutional masked autoencoder framework (FCMAE) and a new Global Response Normalization (GRN) layer to ConvNeXt. ConvNeXt V2 significantly improves the performance of pure ConvNets on various recognition benchmarks.
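To make the GRN layer more concrete, here is a minimal NumPy sketch of the computation as described in the ConvNeXt V2 paper: aggregate each channel globally with an L2 norm, normalize that value against the mean across channels, then rescale the input and add a residual. The function name `global_response_norm`, the channels-last layout `(N, H, W, C)`, and the `eps` value are assumptions made for illustration; this is not the code used inside `transformers`.

```python
import numpy as np


def global_response_norm(x, gamma, beta, eps=1e-6):
    """Illustrative GRN on a channels-last array.

    x:     array of shape (N, H, W, C)
    gamma: learnable scale of shape (C,)
    beta:  learnable bias of shape (C,)
    """
    # 1) Global aggregation: L2 norm of each channel over the spatial dims.
    gx = np.sqrt((x ** 2).sum(axis=(1, 2), keepdims=True))  # (N, 1, 1, C)
    # 2) Divisive normalization: each channel's norm relative to the channel mean.
    nx = gx / (gx.mean(axis=-1, keepdims=True) + eps)  # (N, 1, 1, C)
    # 3) Calibration: rescale the input, then add bias and a residual connection.
    return gamma * (x * nx) + beta + x


# Hypothetical usage on random features with 8 channels.
features = np.random.default_rng(0).normal(size=(2, 4, 4, 8))
out = global_response_norm(features, gamma=np.zeros(8), beta=np.zeros(8))
```

With `gamma` and `beta` at zero, the layer reduces to the identity because of the residual term, so it only changes the features once those parameters move away from zero.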
You can use the raw model for image classification. See the model hub to look for fine-tuned versions on a task that interests you.
Here is how to use this model to classify an image from the COCO 2017 dataset into one of the 1,000 ImageNet classes:
from transformers import AutoImageProcessor, ConvNextV2ForImageClassification
import torch
from datasets import load_dataset

dataset = load_dataset("huggingface/cats-image")
image = dataset["test"]["image"][0]

preprocessor = AutoImageProcessor.from_pretrained("facebook/convnextv2-atto-1k-224")
model = ConvNextV2ForImageClassification.from_pretrained("facebook/convnextv2-atto-1k-224")

inputs = preprocessor(image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# model predicts one of the 1000 ImageNet classes
predicted_label = logits.argmax(-1).item()
print(model.config.id2label[predicted_label])
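The example above reports only the single most likely class. When the runner-up classes are also of interest, the logits can be turned into probabilities with a softmax and then ranked. The helper below is a small pure-Python sketch of that step; `top_k_labels` is a hypothetical name, not part of `transformers`, and it expects a plain list of floats plus an index-to-label mapping.

```python
import math


def top_k_labels(logits, id2label, k=5):
    """Return the k most probable (label, probability) pairs.

    logits:   list of floats, one score per class
    id2label: mapping from class index to class name
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    return [(id2label[i], probs[i]) for i in ranked]


# Toy example with three made-up classes.
toy = top_k_labels([0.0, 2.0, 1.0], {0: "a", 1: "b", 2: "c"}, k=2)
```

With the variables from the classification example, a call would look like `top_k_labels(logits[0].tolist(), model.config.id2label)`.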
For more code examples, refer to the documentation.
@article{DBLP:journals/corr/abs-2301-00808,
  author     = {Sanghyun Woo and Shoubhik Debnath and Ronghang Hu and Xinlei Chen and Zhuang Liu and In So Kweon and Saining Xie},
  title      = {ConvNeXt {V2:} Co-designing and Scaling ConvNets with Masked Autoencoders},
  journal    = {CoRR},
  volume     = {abs/2301.00808},
  year       = {2023},
  url        = {https://doi.org/10.48550/arXiv.2301.00808},
  doi        = {10.48550/arXiv.2301.00808},
  eprinttype = {arXiv},
  eprint     = {2301.00808},
  timestamp  = {Tue, 10 Jan 2023 15:10:12 +0100},
  biburl     = {https://dblp.org/rec/journals/corr/abs-2301-00808.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}