
pszemraj/pegasus-x-large-book-summary

Get SparkNotes-esque summaries of arbitrary text! Due to the model size, it's recommended to try it out in Colab (linked above) as the API textbox may time out.

This model is a fine-tuned version of google/pegasus-x-large on the kmfoda/booksum dataset for approximately eight epochs.
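
A minimal usage sketch (illustrative only; the generation settings shown are examples, not the card's official recommendation):

```python
import torch
from transformers import pipeline

# load the checkpoint as a standard seq2seq summarization pipeline
summarizer = pipeline(
    "summarization",
    model="pszemraj/pegasus-x-large-book-summary",
    device=0 if torch.cuda.is_available() else -1,
)

long_text = "..."  # any chapter-length document

result = summarizer(
    long_text,
    max_length=256,          # illustrative generation settings; tune for your use case
    no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```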

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

Epochs 1-4

TODO

Epochs 5 & 6

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: ADAN using lucidrains' adan-pytorch with default betas
  • lr_scheduler_type: constant_with_warmup
  • data type: TF32
  • num_epochs: 2

Epochs 7 & 8

  • epochs 5 & 6 were trained with a 12288-token input length
  • epochs 7 & 8 address this by training for two further epochs at the full 16384-token input length (see the sketch below)
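
A minimal sketch of what that change amounts to at preprocessing time; the booksum column names ("chapter", "summary_text") and the target length are assumptions, not details confirmed by this card:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/pegasus-x-large")

def preprocess(batch):
    # epochs 7 & 8: source documents truncated at 16384 tokens (was 12288 in epochs 5 & 6)
    model_inputs = tokenizer(batch["chapter"], max_length=16384, truncation=True)
    # target column name and max_length=1024 are assumed for illustration
    labels = tokenizer(text_target=batch["summary_text"], max_length=1024, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```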

The following hyperparameters were used during training:

  • learning_rate: 0.0004
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: ADAN using lucidrains' adan-pytorch with default betas (see the sketch after this list)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 2
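
For reference, a minimal sketch of constructing the ADAN optimizer with lucidrains' adan-pytorch as described in the bullets above ("default betas" means the package defaults are left untouched; everything else is illustrative):

```python
from adan_pytorch import Adan
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("google/pegasus-x-large")

# betas are left at the adan-pytorch defaults; lr taken from the epochs 7 & 8 run above
optimizer = Adan(model.parameters(), lr=0.0004)
```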

Framework versions

  • Transformers 4.22.0
  • Pytorch 1.11.0a0+17540c5
  • Datasets 2.4.0
  • Tokenizers 0.12.1