Dataset Card for "scan"
 
 
  Dataset Summary
 
 
  SCAN tasks with various splits.
 
 
  SCAN is a set of simple language-driven navigation tasks for studying
compositional learning and zero-shot generalization.
 
 
  See https://github.com/brendenlake/SCAN for a description of the splits.
 
 
  Example usage:
import datasets
data = datasets.load_dataset('scan', 'length')
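
  The call above can be wrapped in a small helper. This is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the dataset is published on the Hub under the name `scan` with the configuration passed as the second argument. The helper name `load_scan` and its `loader` parameter are hypothetical and exist only so the call can be exercised without network access.

```python
def load_scan(config="length", loader=None):
    """Load one SCAN configuration and return whatever the loader returns.

    `loader` is a hypothetical hook for offline testing; by default the
    function falls back to `datasets.load_dataset`.
    """
    if loader is None:
        # Imported lazily so this module can be imported without the library.
        import datasets

        loader = datasets.load_dataset
    # The configuration name is passed as the second positional argument.
    return loader("scan", config)
```

  Calling `load_scan("addprim_jump")` would download that configuration on first use. Depending on the installed `datasets` version, extra arguments may be needed for datasets that ship a loading script.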
 
 
  Supported Tasks and Leaderboards
 
 
  
   More Information Needed
  
 
 
  Languages
 
 
  
   More Information Needed
  
 
 
  Dataset Structure
 
 
  Data Instances
 
 addprim_jump
 
  - Size of downloaded dataset files: 18.69 MB
  - Size of the generated dataset: 4.05 MB
  - Total amount of disk used: 22.73 MB
  
 
 
 
  An example of 'train' looks as follows.
 
 addprim_turn_left
 
  - Size of downloaded dataset files: 18.69 MB
  - Size of the generated dataset: 4.09 MB
  - Total amount of disk used: 22.76 MB
  
 
 
 
  An example of 'train' looks as follows.
 
 filler_num0
 
  - Size of downloaded dataset files: 18.69 MB
  - Size of the generated dataset: 2.85 MB
  - Total amount of disk used: 21.53 MB
  
 
 
 
  An example of 'train' looks as follows.
 
 filler_num1
 
  - Size of downloaded dataset files: 18.69 MB
  - Size of the generated dataset: 3.14 MB
  - Total amount of disk used: 21.82 MB
  
 
 
 
  An example of 'train' looks as follows.
 
 filler_num2
 
  - Size of downloaded dataset files: 18.69 MB
  - Size of the generated dataset: 3.44 MB
  - Total amount of disk used: 22.12 MB
  
 
 
 
  An example of 'train' looks as follows.
 
 
  Data Fields
 
 
  The data fields are the same across all configurations and splits.
 
 addprim_jump
 
  - commands: a string feature.
  - actions: a string feature.
  
 
 
 addprim_turn_left
 
  - commands: a string feature.
  - actions: a string feature.
  
 
 
 filler_num0
 
  - commands: a string feature.
  - actions: a string feature.
  
 
 
 filler_num1
 
  - commands: a string feature.
  - actions: a string feature.
  
 
 
 filler_num2
 
  - commands: a string feature.
  - actions: a string feature.
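
  Since both fields are plain strings, a record can be turned into token sequences by splitting on whitespace. The sketch below uses a made-up record written in SCAN's command/action style; only the field names `commands` and `actions` come from this card, and the values and the helper name `tokenize` are illustrative assumptions.

```python
# Illustrative record only: the field names come from this card, the values
# are a hand-written pair in SCAN's command/action style.
example = {
    "commands": "jump twice",
    "actions": "I_JUMP I_JUMP",
}


def tokenize(record):
    """Split both string fields into lists of whitespace-separated tokens."""
    return record["commands"].split(), record["actions"].split()


command_tokens, action_tokens = tokenize(example)
print(command_tokens)
print(action_tokens)
```

  Whitespace splitting is the simplest choice here; a model-specific vocabulary or tokenizer could be substituted without changing the record layout.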
  
 
 
 
  Data Splits
 
 
  
   
    | name              | train | test |
    |-------------------|-------|------|
    | addprim_jump      | 14670 | 7706 |
    | addprim_turn_left | 21890 | 1208 |
    | filler_num0       | 15225 | 1173 |
    | filler_num1       | 16290 | 1173 |
    | filler_num2       | 17391 | 1173 |
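
  The relative size of the held-out test set differs considerably between configurations. As a rough illustration, the following sketch computes the test share from the counts in the table above; the names `SPLIT_SIZES` and `held_out_fraction` are hypothetical helpers, not part of any library.

```python
# Train/test example counts copied from the Data Splits table in this card.
SPLIT_SIZES = {
    "addprim_jump": (14670, 7706),
    "addprim_turn_left": (21890, 1208),
    "filler_num0": (15225, 1173),
    "filler_num1": (16290, 1173),
    "filler_num2": (17391, 1173),
}


def held_out_fraction(config):
    """Return the share of a configuration's examples that are in 'test'."""
    train, test = SPLIT_SIZES[config]
    return test / (train + test)


for name in SPLIT_SIZES:
    print(f"{name}: {held_out_fraction(name):.1%} of examples are in the test split")
```

  With a loaded dataset, the same counts should be available from the split objects themselves (for example via `len(data["train"])`), which makes this a quick sanity check after download.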
   
  
 
 
  Dataset Creation
 
 
  Curation Rationale
 
 
  
   More Information Needed
  
 
 
  Source Data
 
 Initial Data Collection and Normalization
 
  
   More Information Needed
  
 
 Who are the source language producers?
 
  
   More Information Needed
  
 
 
  Annotations
 
 Annotation process
 
  
   More Information Needed
  
 
 Who are the annotators?
 
  
   More Information Needed
  
 
 
  Personal and Sensitive Information
 
 
  
   More Information Needed
  
 
 
  Considerations for Using the Data
 
 
  Social Impact of Dataset
 
 
  
   More Information Needed
  
 
 
  Discussion of Biases
 
 
  
   More Information Needed
  
 
 
  Other Known Limitations
 
 
  
   More Information Needed
  
 
 
  Additional Information
 
 
  Dataset Curators
 
 
  
   More Information Needed
  
 
 
  Licensing Information
 
 
  
   More Information Needed
  
 
 
  Citation Information
 
 @inproceedings{Lake2018GeneralizationWS,
  title={Generalization without Systematicity: On the Compositional Skills of
         Sequence-to-Sequence Recurrent Networks},
  author={Brenden M. Lake and Marco Baroni},
  booktitle={ICML},
  year={2018},
  url={https://arxiv.org/pdf/1711.00350.pdf},
}
 
  Contributions
 
 
  Thanks to @lewtun, @patrickvonplaten, @mariamabarham, @thomwolf for adding this dataset.