Model:
pszemraj/long-t5-tglobal-base-sci-simplify
Exploring how well long-document models trained on "lay summaries" of scientific papers generalize.
A lay summary restates a research paper or scientific study in plain language, without technical jargon, so that it can be easily understood by non-experts.
This model is a fine-tuned version of google/long-t5-tglobal-base on the pszemraj/scientific_lay_summarisation-plos-norm dataset for two epochs.
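If you want to inspect the training data yourself, the dataset can be pulled from the Hub with the `datasets` library. A minimal sketch; it only loads and prints the dataset, without assuming anything about its schema:

```python
from datasets import load_dataset

# load the lay-summarisation dataset this model was fine-tuned on
ds = load_dataset("pszemraj/scientific_lay_summarisation-plos-norm")
print(ds)                     # splits and row counts
print(ds["train"][0].keys())  # column names of one example
```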
It's recommended to use this model with beam search decoding (a direct transformers example appears after the textsum snippet below). If you are interested, you can also use the textsum util repo, which abstracts most of this for you:
Install with pip:

```bash
pip install -U textsum
```
Use in Python:

```python
from textsum.summarize import Summarizer

summarizer = Summarizer('pszemraj/long-t5-tglobal-base-sci-simplify')

text = "put the text you don't want to read here"
summary = summarizer.summarize_string(text)
print(summary)
```
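Alternatively, the model can be called through the transformers summarization pipeline with beam search enabled directly. A minimal sketch; `num_beams=4` and the length settings here are illustrative assumptions, not the parameters used to produce the reported scores:

```python
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-base-sci-simplify",
)

long_text = "put the text you don't want to read here"

result = summarizer(
    long_text,
    num_beams=4,             # beam search, as recommended above
    max_length=512,          # illustrative cap on summary length
    no_repeat_ngram_size=3,  # illustrative repetition penalty
)
print(result[0]["summary_text"])
```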
It achieves the following results on the evaluation set (the final checkpoint in the table below, step 600):

- Loss: 1.6778
- Rouge1: 49.1475
- Rouge2: 18.9281
- Rougel: 26.9893
- Rougelsum: 45.0973
- Gen Len: 399.4125

Training results:
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.966 | 0.52 | 200 | 1.7171 | 48.6521 | 18.427 | 26.7726 | 44.3947 | 376.335 |
| 1.877 | 1.03 | 400 | 1.6909 | 49.3263 | 18.7945 | 27.0741 | 45.1737 | 382.205 |
| 1.9007 | 1.55 | 600 | 1.6778 | 49.1475 | 18.9281 | 26.9893 | 45.0973 | 399.4125 |
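For reference, the ROUGE columns above can be recomputed with the `evaluate` library. A sketch only; note that `evaluate` returns scores in [0, 1], so the table values appear to be scaled by 100:

```python
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["a model-generated lay summary"],
    references=["the reference lay summary"],
)
# keys: rouge1, rouge2, rougeL, rougeLsum; multiply by 100
# to compare against the table above
print({k: round(v * 100, 4) for k, v in scores.items()})
```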