pszemraj/pegasus-x-large-book-summary
Get SparkNotes-esque summaries of arbitrary text! Due to the model size, it's recommended to try it out in Colab (linked above), as the API textbox may time out.
This model is a fine-tuned version of google/pegasus-x-large on the kmfoda/booksum dataset for approximately eight epochs.
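For local use, a minimal sketch with the transformers summarization pipeline looks roughly like the following; the generation settings here are illustrative assumptions rather than the exact parameters used in the demo:

```python
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/pegasus-x-large-book-summary",
    device=0,  # set to -1 to run on CPU
)

long_text = "..."  # any long document, e.g. a book chapter

result = summarizer(
    long_text,
    max_length=256,          # illustrative generation settings
    min_length=8,
    no_repeat_ngram_size=3,
    num_beams=4,
    truncation=True,
)
print(result[0]["summary_text"])
```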
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
Epochs 1-4
TODO
Epochs 5 & 6
The following hyperparameters were used during training:
- learning_rate: 6e-05
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: ADAN, using lucidrains' adan-pytorch with default betas (a setup sketch follows this list)
- lr_scheduler_type: constant_with_warmup
- data type: TF32
- num_epochs: 2
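As a rough illustration of the optimizer setup, a sketch using lucidrains' adan-pytorch might look like this; `model` here is simply the base checkpoint loaded for fine-tuning, and the betas are left at the package defaults as noted above:

```python
from adan_pytorch import Adan
from transformers import AutoModelForSeq2SeqLM

# load the base model being fine-tuned
model = AutoModelForSeq2SeqLM.from_pretrained("google/pegasus-x-large")

optimizer = Adan(
    model.parameters(),
    lr=6e-5,  # learning_rate from the list above; betas left at the adan-pytorch defaults
)
```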
Epochs 7 & 8
- epochs 5 & 6 were trained with a 12288-token input length
- this run fixes that with 2 epochs at a 16384-token input length
The following hyperparameters were used during training:
- learning_rate: 0.0004
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: ADAN, using lucidrains' adan-pytorch with default betas
- lr_scheduler_type: cosine (a scheduler sketch follows this list)
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
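To show how the cosine schedule with a 0.03 warmup ratio fits together, here is a small sketch; the total step count is hypothetical, and a stand-in AdamW optimizer is used only so the snippet runs on its own (the actual run used ADAN as listed above):

```python
import torch
from transformers import get_cosine_schedule_with_warmup

# stand-in parameters/optimizer purely for illustration
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=4e-4)  # lr matches the list above

num_training_steps = 1_000                         # hypothetical total step count
num_warmup_steps = int(0.03 * num_training_steps)  # lr_scheduler_warmup_ratio: 0.03

scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=num_warmup_steps,
    num_training_steps=num_training_steps,
)
```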
Framework versions
- Transformers 4.22.0
- Pytorch 1.11.0a0+17540c5
- Datasets 2.4.0
- Tokenizers 0.12.1