pszemraj/pegasus-x-large-book-summary
Get SparkNotes-esque summaries of arbitrary text! Due to the model size, it's recommended to try it out in Colab (linked above), as the API textbox may time out.
This model is a fine-tuned version of google/pegasus-x-large on the kmfoda/booksum dataset for approximately eight epochs.
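For local use, a minimal sketch with the transformers summarization pipeline looks roughly like the following; the generation settings here are illustrative assumptions rather than the exact parameters used in the demo:

```python
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/pegasus-x-large-book-summary",
    device=0,  # set to -1 to run on CPU
)

long_text = "..."  # any long document, e.g. a book chapter

result = summarizer(
    long_text,
    max_length=256,          # illustrative generation settings
    min_length=8,
    no_repeat_ngram_size=3,
    num_beams=4,
    truncation=True,
)
print(result[0]["summary_text"])
```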
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
Epochs 1-4
TODO
Epochs 5 & 6
The following hyperparameters were used during training:
- learning_rate: 6e-05
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: ADAN, using lucidrains' adan-pytorch with default betas (a setup sketch follows this list)
- lr_scheduler_type: constant_with_warmup
- data type: TF32
- num_epochs: 2
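As a rough illustration of the optimizer setup, a sketch using lucidrains' adan-pytorch might look like this; `model` here is simply the base checkpoint loaded for fine-tuning, and the betas are left at the package defaults as noted above:

```python
from adan_pytorch import Adan
from transformers import AutoModelForSeq2SeqLM

# load the base model being fine-tuned
model = AutoModelForSeq2SeqLM.from_pretrained("google/pegasus-x-large")

optimizer = Adan(
    model.parameters(),
    lr=6e-5,  # learning_rate from the list above; betas left at the adan-pytorch defaults
)
```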
Epochs 7 & 8
- epochs 5 & 6 were trained with a 12288-token input length
- this run fixes that with 2 epochs at a 16384-token input length
The following hyperparameters were used during training:
- learning_rate: 0.0004
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: ADAN, using lucidrains' adan-pytorch with default betas
- lr_scheduler_type: cosine (a scheduler sketch follows this list)
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
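To show how the cosine schedule with a 0.03 warmup ratio fits together, here is a small sketch; the total step count is hypothetical, and a stand-in AdamW optimizer is used only so the snippet runs on its own (the actual run used ADAN as listed above):

```python
import torch
from transformers import get_cosine_schedule_with_warmup

# stand-in parameters/optimizer purely for illustration
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=4e-4)  # lr matches the list above

num_training_steps = 1_000                         # hypothetical total step count
num_warmup_steps = int(0.03 * num_training_steps)  # lr_scheduler_warmup_ratio: 0.03

scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=num_warmup_steps,
    num_training_steps=num_training_steps,
)
```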
Framework versions
- Transformers 4.22.0
- Pytorch 1.11.0a0+17540c5
- Datasets 2.4.0
- Tokenizers 0.12.1