pszemraj/pegasus-x-large-book-summary
 
 
  
 
 
Get SparkNotes-esque summaries of arbitrary text! Due to the model's size, it is recommended to try it out in Colab (linked above), as the hosted inference API text box may time out.
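As a minimal usage sketch with the transformers summarization pipeline (the example text and generation parameters below are illustrative assumptions, not settings from this card):

```python
# Minimal usage sketch via the transformers summarization pipeline.
# Generation parameters below are illustrative assumptions, not tuned values.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/pegasus-x-large-book-summary",
    device=0,  # set to -1 (or omit) to run on CPU
)

long_text = "..."  # replace with the text you want summarized

result = summarizer(
    long_text,
    max_length=256,          # assumed cap on summary length
    no_repeat_ngram_size=3,  # assumed; reduces repetition
    truncation=True,
)
print(result[0]["summary_text"])
```

Running this on CPU will be slow; a GPU runtime such as Colab is recommended, as noted above.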
 
 
This model is a fine-tuned version of google/pegasus-x-large on the kmfoda/booksum dataset for approximately eight epochs.
 
 
  Training and evaluation data
 
 
  More information needed
 
 
  Training procedure
 
 
  Training hyperparameters
 
 Epochs 1-4
 
  TODO
 
 Epochs 5 & 6
 
The following hyperparameters were used during training:

- learning_rate: 6e-05
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: ADAN, using lucidrains' adan-pytorch with default betas (see the sketch after this list)
- lr_scheduler_type: constant_with_warmup
- data type: TF32
- num_epochs: 2
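As a rough illustration of the optimizer setup, here is a minimal sketch of constructing ADAN via lucidrains' adan-pytorch package; the betas shown are that package's defaults, and the model/learning-rate wiring is an assumption rather than the actual training script.

```python
# Minimal sketch (not the actual training script): building the ADAN optimizer
# with lucidrains' adan-pytorch package, using its default betas as noted in the card.
from adan_pytorch import Adan
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("google/pegasus-x-large")

optimizer = Adan(
    model.parameters(),
    lr=6e-5,                   # learning rate from the epochs 5 & 6 run
    betas=(0.02, 0.08, 0.01),  # adan-pytorch package defaults ("default betas")
)
```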
  
 
 
 Epochs 7 & 8
 
- epochs 5 & 6 were trained with a 12288-token input length
- epochs 7 & 8 fix that, with 2 epochs at a 16384-token input length (see the sketch below)
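A hedged sketch of running inference at the full 16384-token input length (the generation settings below are illustrative assumptions, not values from this card):

```python
# Sketch: summarizing a long document at the 16384-token input length used for
# epochs 7 & 8. Generation settings are illustrative assumptions, not tuned values.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "pszemraj/pegasus-x-large-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

long_text = "..."  # the chapter or document to summarize

inputs = tokenizer(
    long_text,
    max_length=16384,  # matches the input length of the final training epochs
    truncation=True,
    return_tensors="pt",
)

with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        max_new_tokens=256,      # assumed summary-length cap
        no_repeat_ngram_size=3,  # assumed; helps avoid repetition
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```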
  
 
 
 
The following hyperparameters were used during training:

- learning_rate: 0.0004
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: ADAN, using lucidrains' adan-pytorch with default betas
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03 (see the scheduler sketch after this list)
- num_epochs: 2
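As a rough sketch of how a warmup ratio of 0.03 maps onto a cosine schedule via transformers' get_cosine_schedule_with_warmup; the step count and the stand-in optimizer below are placeholders, not values from the actual run.

```python
# Sketch: translating lr_scheduler_warmup_ratio 0.03 into warmup steps for a
# cosine schedule. The parameter, optimizer, and step count are placeholders.
import torch
from transformers import get_cosine_schedule_with_warmup

param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=4e-4)  # stand-in; the run used ADAN (see above)

num_training_steps = 1000  # placeholder: total optimizer updates over the 2 epochs
num_warmup_steps = int(0.03 * num_training_steps)  # warmup ratio from the card

scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=num_warmup_steps,
    num_training_steps=num_training_steps,
)
```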
  
 
 
 
  Framework versions
 
 
- Transformers 4.22.0
- Pytorch 1.11.0a0+17540c5
- Datasets 2.4.0
- Tokenizers 0.12.1