Leveraging pre-training on large-scale data to reduce the computational cost of developing downstream task models.