Using the bq load Command to Load logicalType-Partitioned Data into a BigQuery Table

Following is the bq load command that you need to issue if you want to load data from an Avro file into a BigQuery table that is partitioned on an Avro field defined with a logicalType.

Given the following Avro schema:

{
  "type" : "record",
  "name" : "logicalType",
  "namespace" : "com.ryanchapin.tests",
  "fields" : [ {
    "name" : "id",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "value",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "day",
    "type" : {
      "type" : "int",
      "logicalType" : "date"
    }
  } ]
}

And the following BigQuery schema:

[
  {
    "name": "id",
    "mode": "NULLABLE",
    "type": "STRING"
  },
  {
    "name": "value",
    "mode": "NULLABLE",
    "type": "INT64"
  },
  {
    "name": "day",
    "mode": "REQUIRED",
    "type": "DATE"
  }
]
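
If the destination table does not exist yet, one way to create it ahead of time is with bq mk, assuming you have saved the BigQuery schema above to a local file (schema.json here is a placeholder name, as are the project, dataset, and table names):

bq --project_id my_project mk \
  --table \
  --schema schema.json \
  --time_partitioning_type=DAY \
  --time_partitioning_field=day \
  my_dataset.my_table

This step is optional, as bq load will also create the destination table for you if it does not already exist.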

Assuming that you have a valid Avro data file whose records contain values in the day field representing the number of days since the epoch (producing such a file is left as an exercise for the reader, though a sketch follows the command below), you can run the following bq load command to load that data into your table:

bq --project_id my_project load \
  --source_format=AVRO \
  --time_partitioning_type=DAY \
  --time_partitioning_field=day \
  --use_avro_logical_types \
  my_dataset.my_table \
  gs://my_bucket/*.avro
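
If you need a sample data file to test with, the following is a minimal sketch that writes a couple of conforming records using the fastavro Python library (fastavro is just one choice of Avro writer, and the ids, values, and dates below are made up):

# Writes a small Avro file matching the schema above. fastavro
# serializes datetime.date values in a "date" logicalType field as
# the underlying int of days since the epoch.
from datetime import date

from fastavro import parse_schema, writer

schema = parse_schema({
    "type": "record",
    "name": "logicalType",
    "namespace": "com.ryanchapin.tests",
    "fields": [
        {"name": "id", "type": ["null", "string"], "default": None},
        {"name": "value", "type": ["null", "long"], "default": None},
        {"name": "day", "type": {"type": "int", "logicalType": "date"}},
    ],
})

records = [
    {"id": "a", "value": 1, "day": date(2020, 1, 1)},
    {"id": "b", "value": 2, "day": date(2020, 1, 2)},
]

with open("test_data.avro", "wb") as fo:
    writer(fo, schema, records)

Copy the resulting file up to your bucket (for example, gsutil cp test_data.avro gs://my_bucket/) before running the load.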
