Model:
KoboldAI/fairseq-dense-13B
This is a Hugging Face transformers-compatible conversion of the original dense 13B-parameter model from the paper "Efficient Large Scale Language Modeling with Mixtures of Experts" by Artetxe et al. Please refer to the original model card at https://github.com/facebookresearch/fairseq/blob/main/examples/moe_lm/model_card.md.
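Since the checkpoint is transformers-compatible, it can be loaded through the standard Auto classes. The following is a minimal sketch; the prompt and generation settings are illustrative, and running a 13B-parameter model requires substantial memory.

```python
# Minimal sketch: load the converted checkpoint with Hugging Face transformers.
# The model name comes from this card; generation parameters are only examples.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "KoboldAI/fairseq-dense-13B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a short prompt and generate a continuation.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```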