Model:
pszemraj/long-t5-tglobal-base-sci-simplify-elife
Exploring how well long-document models trained on "lay summaries" of scientific papers generalize.
A lay summary is a summary of a research paper or scientific study that is written in plain language, without the use of technical jargon, and is designed to be easily understood by non-experts.
This model is a fine-tuned version of google/long-t5-tglobal-base on the pszemraj/scientific_lay_summarisation-elife-norm dataset.
It's recommended to use this model with beam search decoding. If interested, you can also use the textsum util repo to have most of this abstracted out for you:
```bash
pip install -U textsum
```
```python
from textsum.summarize import Summarizer

model_name = "pszemraj/long-t5-tglobal-base-sci-simplify-elife"
summarizer = Summarizer(model_name)  # GPU auto-detected

text = "put the text you don't want to read here"
summary = summarizer.summarize_string(text)
print(summary)
```
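If you would rather call transformers directly, below is a minimal sketch of beam search decoding; the generation settings (num_beams=4, max_length=512) and the input truncation length are illustrative assumptions, not tuned values from this card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "pszemraj/long-t5-tglobal-base-sci-simplify-elife"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "put the text you don't want to read here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)

# beam search decoding, as recommended above (parameter values are assumptions)
with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        num_beams=4,
        max_length=512,
        early_stopping=True,
        no_repeat_ngram_size=3,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```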
The training data is the elife subset of the lay summaries dataset; refer to pszemraj/scientific_lay_summarisation-elife-norm for details.
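To inspect this data directly, it can be loaded from the Hugging Face Hub with the datasets library; a minimal sketch (assumes a standard train split):

```python
from datasets import load_dataset

# load the normalized eLife lay-summarisation dataset used for fine-tuning
ds = load_dataset("pszemraj/scientific_lay_summarisation-elife-norm")

print(ds)                     # available splits and sizes
print(ds["train"][0].keys())  # column names of a single example
```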
It achieves the following results on the evaluation set (the final checkpoint in the table below): Loss: 1.9990, Rouge1: 38.5587, Rouge2: 9.7336, RougeL: 21.1974, RougeLsum: 35.9333, Gen Len: 392.7095
Training results:
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.2995 | 1.47 | 100 | 2.0175 | 35.2501 | 8.2121 | 20.4587 | 32.4494 | 439.7552 |
| 2.2171 | 2.94 | 200 | 1.9990 | 38.5587 | 9.7336 | 21.1974 | 35.9333 | 392.7095 |
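For context, ROUGE scores like those reported above can be computed with the Hugging Face evaluate library. This is a minimal sketch, not the exact evaluation script behind this card; the placeholder strings are stand-ins for real model outputs and references:

```python
import evaluate

# ROUGE metric from the Hugging Face evaluate library
rouge = evaluate.load("rouge")

predictions = ["model-generated lay summary goes here"]  # placeholder
references = ["reference lay summary goes here"]         # placeholder

# returns rouge1 / rouge2 / rougeL / rougeLsum F-measures as floats
scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```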