# bigbird pegasus on the booksum dataset

This is the "latest" version of the model, i.e., the checkpoint that has been trained the longest, currently at 70k steps.
 
- GOAL: a summarization model that 1) summarizes the source content accurately and 2), more importantly IMO, produces summaries that are easy to read and understand (*cough* unlike arXiv *cough*).
  - This model attempts to help with that by fine-tuning on the booksum dataset to provide explanatory summarization.
  - Explanatory summary: a summary that both consolidates information and explains why said consolidated information is important.
   
- This model was trained for seven epochs total (approx. 70,000 steps) and is closer to being finished.
  - It will continue to improve (slowly, now that it has already been trained for a long time) based on any findings/feedback.
   
- The starting checkpoint was `google/bigbird-pegasus-large-bigpatent`.
 
 
 
## example usage

An extended example, including a demo of batch summarization, is here (a minimal batch sketch also appears below).

- Create the summarizer object:
 
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# load the fine-tuned model; low_cpu_mem_usage reduces peak RAM while loading
model = AutoModelForSeq2SeqLM.from_pretrained(
    "pszemraj/bigbird-pegasus-large-K-booksum",
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "pszemraj/bigbird-pegasus-large-K-booksum",
)

# wrap model + tokenizer in a summarization pipeline
summarizer = pipeline(
    "summarization",
    model=model,
    tokenizer=tokenizer,
)
```
 
- Define the text to be summarized and pass it through the pipeline. Boom, done.
 
```python
wall_of_text = "your text to be summarized goes here."

result = summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,  # block repeated 3-grams in the output
    clean_up_tokenization_spaces=True,
)
print(result[0]["summary_text"])
```
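
Since the extended example linked above covers batch summarization, here is a minimal sketch of the same idea. It assumes nothing model-specific: transformers pipelines accept a list of texts, and `batch_size` and `truncation` are standard pipeline call arguments.

```python
# minimal batch-summarization sketch: pass a list of documents to the pipeline
docs = [
    "first document to be summarized goes here.",
    "second document to be summarized goes here.",
]
results = summarizer(
    docs,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    truncation=True,  # truncate inputs past the model's max input length
    batch_size=2,     # forward both documents through the model together
)
for res in results:
    # depending on the transformers version, each item is a dict or a
    # single-element list of dicts
    item = res[0] if isinstance(res, list) else res
    print(item["summary_text"])
```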
 
## Alternate Checkpoint

- If you are experiencing runtime/memory issues, try this earlier checkpoint at 40,000 steps, which is almost as good at the explanatory summarization task but runs faster (see the swap sketch after this list).
 
- See similar summarization models fine-tuned on booksum but using different architectures: long-t5 base and LED-Large.
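
Swapping in the earlier checkpoint only means changing the repo id passed to `from_pretrained`; everything else stays the same. A minimal sketch, with a placeholder where the 40k-step repo id (linked above) goes:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# placeholder: substitute the repo id of the 40,000-step checkpoint linked above
earlier_checkpoint = "<repo-id-of-the-40k-step-checkpoint>"

model = AutoModelForSeq2SeqLM.from_pretrained(earlier_checkpoint, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(earlier_checkpoint)
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)
```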