Model:

xlm-clm-enfr-1024

Task:

Fill-Mask

Libraries:

PyTorch, TensorFlow, Transformers

Language:

multilingual

Other:

xlm, AutoTrain Compatible

Preprints:

arxiv:1901.07291, arxiv:1910.09700

xlm-clm-enfr-1024

Model Details

Uses

Bias, Risks, and Limitations

Training

Evaluation

Environmental Impact

Technical Specifications

Citation

Model Card Authors

How to Get Started with the Model

Model Details

The XLM model was proposed in "Cross-lingual Language Model Pretraining" by Guillaume Lample and Alexis Conneau. xlm-clm-enfr-1024 is a Transformer pretrained with a causal language modeling (CLM) objective (next-token prediction) for English-French.
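
As a reading aid, the CLM objective mentioned above is usually written as the negative log-likelihood of each token given the tokens before it. The notation here ($w_t$ for the token at position $t$, $T$ for the sequence length, $\theta$ for the model parameters) is chosen for this sketch; see the associated paper for the authors' exact formulation.

```latex
\mathcal{L}_{\mathrm{CLM}}(\theta) = -\sum_{t=1}^{T} \log P\left(w_t \mid w_1, \ldots, w_{t-1}; \theta\right)
```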

Model Description

Developed by: Guillaume Lample and Alexis Conneau (see the associated paper)
Model type: Language model
Language(s) (NLP): English-French
License: Unknown
Related Models: xlm-clm-ende-1024, xlm-mlm-ende-1024, xlm-mlm-enfr-1024, xlm-mlm-enro-1024
Resources for more information:

Uses

Direct Use

The model is a language model and can be used for causal language modeling (next-token prediction).
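
To make "next-token prediction" concrete, the sketch below shows the final step in isolation: turning one position's raw scores (logits) into probabilities and picking the most likely token. It is plain Python and does not load the model. The four-entry vocabulary, the logit values, and the helper names `softmax` and `greedy_next_token` are hypothetical and exist only for illustration; with the real model, the logits would come from its output at the last input position.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_token(logits, vocab):
    """Return the most probable next token and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best], probs[best]


# Hypothetical toy vocabulary and scores, NOT taken from the real model.
vocab = ["train", "the", "chat", "est"]
logits = [2.0, 0.5, -1.0, -0.5]

token, prob = greedy_next_token(logits, vocab)
print(token, round(prob, 3))
```

Greedy selection is only one decoding choice; sampling from the probabilities is an equally valid way to use a causal language model.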

Downstream Use

To learn more about this task and potential downstream uses, see the Hugging Face Multilingual Models for Inference docs.

Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)).

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Training

See the associated paper for details on the training data and training procedure.

Evaluation

Testing Data, Factors & Metrics

See the associated paper for details on the testing data, factors and metrics.

Results

For xlm-clm-enfr-1024 results, see Table 2 of the associated paper.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: More information needed
Hours used: More information needed
Cloud Provider: More information needed
Compute Region: More information needed
Carbon Emitted: More information needed
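
Because the fields above are not filled in, no real estimate can be given here. The sketch below only illustrates the kind of arithmetic such an estimate typically involves: energy used (power times time) multiplied by the carbon intensity of the electricity grid. This is an assumed simplification, not the calculator's exact method, and the function name `estimate_co2_kg` is hypothetical. Apart from the count of 64 GPUs quoted under Technical Specifications, every input value is a placeholder, not a measurement.

```python
def estimate_co2_kg(gpu_count, gpu_power_watts, hours, grid_kg_co2_per_kwh):
    """Rough CO2-equivalent estimate: energy in kWh times grid carbon intensity."""
    energy_kwh = gpu_count * gpu_power_watts / 1000.0 * hours
    return energy_kwh * grid_kg_co2_per_kwh


# Placeholder inputs for illustration only; the real hardware power draw,
# training time, and compute region are listed as "More information needed".
example_kg = estimate_co2_kg(
    gpu_count=64,             # from the Technical Specifications quote
    gpu_power_watts=300,      # hypothetical per-GPU power draw
    hours=10,                 # hypothetical training duration
    grid_kg_co2_per_kwh=0.4,  # hypothetical grid carbon intensity
)
print(f"{example_kg:.1f} kg CO2eq (illustrative only)")
```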

Technical Specifications

The model developers write:

We implement all our models in PyTorch (Paszke et al., 2017), and train them on 64 Volta GPUs for the language modeling tasks, and 8 GPUs for the MT tasks. We use float16 operations to speed up training and to reduce the memory usage of our models.

See the associated paper for further details.

Citation

BibTeX:

@article{lample2019cross,
  title={Cross-lingual language model pretraining},
  author={Lample, Guillaume and Conneau, Alexis},
  journal={arXiv preprint arXiv:1901.07291},
  year={2019}
}

APA:

Lample, G., & Conneau, A. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.

Model Card Authors

This model card was written by the team at Hugging Face.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import XLMTokenizer, XLMWithLMHeadModel

tokenizer = XLMTokenizer.from_pretrained("xlm-clm-enfr-1024")
model = XLMWithLMHeadModel.from_pretrained("xlm-clm-enfr-1024")

input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")])  # batch size of 1

language_id = tokenizer.lang2id["en"]  # 0
langs = torch.tensor([language_id] * input_ids.shape[1])  # torch.tensor([0, 0, 0, ..., 0])

# We reshape it to be of size (batch_size, sequence_length)
langs = langs.view(1, -1)  # is now of shape [1, sequence_length] (we have a batch size of 1)

outputs = model(input_ids, langs=langs)
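
The snippet above builds the language ids for a single sequence. The same idea extends to a batch: every position in every sequence gets the id of the chosen language. Below is a minimal sketch in plain Python lists, assuming all sequences share one language and one length. The helper name `build_langs` is hypothetical, and the `"fr": 1` entry in the mapping is an assumption; in practice, read the mapping from `tokenizer.lang2id`.

```python
def build_langs(lang2id, lang, batch_size, seq_len):
    """Return a batch_size x seq_len nested list filled with the id of `lang`."""
    if lang not in lang2id:
        raise KeyError(f"unknown language {lang!r}; known: {sorted(lang2id)}")
    lang_id = lang2id[lang]
    return [[lang_id] * seq_len for _ in range(batch_size)]


# Assumed mapping for illustration; use tokenizer.lang2id with the real model.
lang2id = {"en": 0, "fr": 1}

langs_batch = build_langs(lang2id, "fr", batch_size=2, seq_len=5)
print(langs_batch)
```

The nested list can then be wrapped with `torch.tensor(...)` and passed as the `langs` argument, provided it has the same shape as `input_ids`.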

Author:

None

Dataset size:

1.55 GB
