Easier Shopify Analytics with pyShopify

Shopify has a robust analytics section built into the administrative console but I needed to integrate my Shopify data in a dashboard with other business and marketing data, in addition to Excel Power Query and Google Sheets. Shopify offers two APIs for pulling data – REST and GraphQL. However, Shopify removed the ability to pull analytics reports directly from the API, instead only offering raw order level data in nested JSON. The data for all orders must be pulled in it’s entirety and stored in an intermediate step. This does not work well for my many dashboards, reports and models.

This is where pyShopify comes in, using Microsoft SQL Server in a Docker container with a built in python application to pull in Shopify data and insert/update into relational tables. Python is used as the scripting language to pull data from the API and populate the database, but all necessary functionality is available via command line.  The full code and repository can be found here.

If you aren’t interested in the theory or any of the tech specs, continue to Making It Work for the few simple steps to get it running. 

Converting Semi-structured API Data into Relational Tables

The Shopify API returns semi-structured data in a single JSON object at the order level. So to perform an analysis on the past year of performance, the entire years worth of orders must be pulled. The size of the dataset alone makes working with it difficult, in addition to the semi-structured format. The goal is to convert the semi-structured JSON response from Shopify into relational tables in SQL Server.  Below shows an abbreviated sample of the structure of the Shopify API response and the relational tables created from that data. 

Source: Shopify Semi Structured API Data

{
  "app_id": 1966818,
  "billing_address": {
    "address1": "2259 Park Ct",
    "address2": "Apartment 5",
    "city": "Drayton Valley",
    "company": null,
    "country": "Canada",
    "first_name": "Christopher",
    "last_name": "Gorski",
    "phone": "(555)555-5555",
    "province": "Alberta",
    "zip": "T0E 0M0",
    "name": "Christopher Gorski",
    "province_code": "AB",
    "country_code": "CA",
    "latitude": "45.41634",
    "longitude": "-75.6868"
  },
  "browser_ip": "216.191.105.146",
  "buyer_accepts_marketing": false,
  "cancel_reason": "customer",
  "cancelled_at": null,
  "cart_token": "68778783ad298f1c80c3bafcddeea",
  "checkout_token": "bd5a8aa1ecd019dd3520ff791ee3a24c",
  "client_details": {
    "accept_language": "en-US,en;q=0.9",
    "browser_height": 1320,
    "browser_ip": "216.191.105.146",
    "browser_width": 1280,
    "session_hash": "9ad4d1f4e6a8977b9dd98eed1e477643",
    "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"
  },
  "closed_at": "2008-01-10T11:00:00-05:00",
  "created_at": "2008-01-10T11:00:00-05:00",
  "currency": "USD",
  "current_total_discounts": "10.00",
  "current_total_discounts_set": {
    "current_total_discounts_set": {
      "shop_money": {
        "amount": "10.00",
        "currency_code": "CAD"
      },
      "presentment_money": {
        "amount": "5.00",
        "currency_code": "EUR"
      }
    }
  },
  "current_total_duties_set": {
    "current_total_duties_set": {
      "shop_money": {
        "amount": "164.86",
        "currency_code": "CAD"
      },
      "presentment_money": {
        "amount": "105.31",
        "currency_code": "EUR"
      }
    }
  },
  "current_total_price": "10.00",
  "current_total_price_set": {
    "current_total_price_set": {
      "shop_money": {
        "amount": "30.00",
        "currency_code": "CAD"
      },
      "presentment_money": {
        "amount": "20.00",
        "currency_code": "EUR"
      }
    }
  },
  "current_subtotal_price": "10.00",
  "current_subtotal_price_set": {
    "current_subtotal_price_set": {
      "shop_money": {
        "amount": "30.00",
        "currency_code": "CAD"
      },
      "presentment_money": {
        "amount": "20.00",
        "currency_code": "EUR"
      }
    }
  },
  "current_total_tax": "10.00",
  "current_total_tax_set": {
    "current_total_tax_set": {
      "shop_money": {
        "amount": "30.00",
        "currency_code": "CAD"
      },
      "presentment_money": {
        "amount": "20.00",
        "currency_code": "EUR"
      }
    }
  },
  "customer": {
    "id": 207119551,
    "email": "bob.norman@mail.example.com",
    "accepts_marketing": false,
    "created_at": "2012-03-13T16:09:55-04:00",
    "updated_at": "2012-03-13T16:09:55-04:00",
    "first_name": "Bob",
    "last_name": "Norman",
    "state": "disabled",
    "note": null,
    "verified_email": true,
    "multipass_identifier": null,
    "tax_exempt": false,
    "tax_exemptions": {},
    "phone": "+13125551212",
    "tags": "loyal",
    "currency": "USD",
    "addresses": {},
    "admin_graphql_api_id": "gid://shopify/Customer/207119551",
    "default_address": {}
  },
  "customer_locale": "en-CA",
  "discount_applications": {
    "discount_applications": [
      {
        "type": "manual",
        "title": "custom discount",
        "description": "customer deserved it",
        "value": "2.0",
        "value_type": "fixed_amount",
        "allocation_method": "across",
        "target_selection": "explicit",
        "target_type": "line_item"
      },
      {
        "type": "script",
        "description": "my scripted discount",
        "value": "5.0",
        "value_type": "fixed_amount",
        "allocation_method": "across",
        "target_selection": "explicit",
        "target_type": "shipping_line"
      },
      {
        "type": "discount_code",
        "code": "SUMMERSALE",
        "value": "10.0",
        "value_type": "fixed_amount",
        "allocation_method": "across",
        "target_selection": "all",
        "target_type": "line_item"
      }
    ]
  },
  "discount_codes": [
    {
      "code": "SPRING30",
      "amount": "30.00",
      "type": "fixed_amount"
    }
  ],
  "email": "bob.norman@mail.example.com",
  "estimated_taxes": false,
  "financial_status": "authorized",
  "fulfillments": [
    {
      "created_at": "2012-03-13T16:09:54-04:00",
      "id": 255858046,
      "order_id": 450789469,
      "status": "failure",
      "tracking_company": "USPS",
      "tracking_number": "1Z2345",
      "updated_at": "2012-05-01T14:22:25-04:00"
    }
  ],
  "fulfillment_status": "partial",
  "gateway": "shopify_payments",
  "id": 450789469,
  "landing_site": "http://www.example.com?source=abc",
  "line_items": [
    {
      "fulfillable_quantity": 1,
      "fulfillment_service": "amazon",
      "fulfillment_status": "fulfilled",
      "grams": 500,
      "id": 669751112,
      "price": "199.99",
      "product_id": 7513594,
      "quantity": 1,
      "requires_shipping": true,
      "sku": "IPOD-342-N",
      "title": "IPod Nano",
      "variant_id": 4264112,
      "variant_title": "Pink",
      "vendor": "Apple",
      "name": "IPod Nano - Pink",
      "gift_card": false,
      "price_set": {
        "shop_money": {
          "amount": "199.99",
          "currency_code": "USD"
        },
        "presentment_money": {
          "amount": "173.30",
          "currency_code": "EUR"
        }
      },
      "properties": [
        {
          "name": "custom engraving",
          "value": "Happy Birthday Mom!"
        }
      ],
      "taxable": true,
      "tax_lines": [
        {
          "title": "HST",
          "price": "25.81",
          "price_set": {
            "shop_money": {
              "amount": "25.81",
              "currency_code": "USD"
            },
            "presentment_money": {
              "amount": "20.15",
              "currency_code": "EUR"
            }
          },
          "channel_liable": true,
          "rate": 0.13
        }
      ],
      "total_discount": "5.00",
      "total_discount_set": {
        "shop_money": {
          "amount": "5.00",
          "currency_code": "USD"
        },
        "presentment_money": {
          "amount": "4.30",
          "currency_code": "EUR"
        }
      },
      "discount_allocations": [
        {
          "amount": "5.00",
          "discount_application_index": 2,
          "amount_set": {
            "shop_money": {
              "amount": "5.00",
              "currency_code": "USD"
            },
            "presentment_money": {
              "amount": "3.96",
              "currency_code": "EUR"
            }
          }
        }
      ],
      "origin_location": {
        "id": 1390592786454,
        "country_code": "CA",
        "province_code": "ON",
        "name": "Apple",
        "address1": "700 West Georgia Street",
        "address2": "1500",
        "city": "Toronto",
        "zip": "V7Y 1G5"
      },
      "duties": [
        {
          "id": "2",
          "harmonized_system_code": "520300",
          "country_code_of_origin": "CA",
          "shop_money": {
            "amount": "164.86",
            "currency_code": "CAD"
          },
          "presentment_money": {
            "amount": "105.31",
            "currency_code": "EUR"
          },
          "tax_lines": [
            {
              "title": "VAT",
              "price": "16.486",
              "rate": 0.1,
              "price_set": {
                "shop_money": {
                  "amount": "16.486",
                  "currency_code": "CAD"
                },
                "presentment_money": {
                  "amount": "10.531",
                  "currency_code": "EUR"
                }
              },
              "channel_liable": true
            }
          ],
          "admin_graphql_api_id": "gid://shopify/Duty/2"
        }
      ]
    }
  ],
  "location_id": 49202758,
  "merchant_of_record_app_id": 1966818,
  "name": "#1001",
  "note": "Customer changed their mind.",
  "note_attributes": [
    {
      "name": "custom name",
      "value": "custom value"
    }
  ],
  "number": 1,
  "order_number": 1001,
  "original_total_duties_set": {
    "original_total_duties_set": {
      "shop_money": {
        "amount": "164.86",
        "currency_code": "CAD"
      },
      "presentment_money": {
        "amount": "105.31",
        "currency_code": "EUR"
      }
    }
  },
  "payment_details": {
    "avs_result_code": "Y",
    "credit_card_bin": "453600",
    "cvv_result_code": "M",
    "credit_card_number": "•••• •••• •••• 4242",
    "credit_card_company": "Visa"
  },
  "payment_terms": {
    "amount": 70,
    "currency": "CAD",
    "payment_terms_name": "NET_30",
    "payment_terms_type": "NET",
    "due_in_days": 30,
    "payment_schedules": [
      {
        "amount": 70,
        "currency": "CAD",
        "issued_at": "2020-07-29T13:02:43-04:00",
        "due_at": "2020-08-29T13:02:43-04:00",
        "completed_at": "null",
        "expected_payment_method": "shopify_payments"
      }
    ]
  },
  "payment_gateway_names": [
    "authorize_net",
    "Cash on Delivery (COD)"
  ],
  "phone": "+557734881234",
  "presentment_currency": "CAD",
  "processed_at": "2008-01-10T11:00:00-05:00",
  "processing_method": "direct",
  "referring_site": "http://www.anexample.com",
  "refunds": [
    {
      "id": 18423447608,
      "order_id": 394481795128,
      "created_at": "2018-03-06T09:35:37-05:00",
      "note": null,
      "user_id": null,
      "processed_at": "2018-03-06T09:35:37-05:00",
      "refund_line_items": [],
      "transactions": [],
      "order_adjustments": []
    }
  ],
  "shipping_address": {
    "address1": "123 Amoebobacterieae St",
    "address2": "",
    "city": "Ottawa",
    "company": null,
    "country": "Canada",
    "first_name": "Bob",
    "last_name": "Bobsen",
    "latitude": "45.41634",
    "longitude": "-75.6868",
    "phone": "555-625-1199",
    "province": "Ontario",
    "zip": "K2P0V6",
    "name": "Bob Bobsen",
    "country_code": "CA",
    "province_code": "ON"
  },
  "shipping_lines": [
    {
      "code": "INT.TP",
      "price": "4.00",
      "price_set": {
        "shop_money": {
          "amount": "4.00",
          "currency_code": "USD"
        },
        "presentment_money": {
          "amount": "3.17",
          "currency_code": "EUR"
        }
      },
      "discounted_price": "4.00",
      "discounted_price_set": {
        "shop_money": {
          "amount": "4.00",
          "currency_code": "USD"
        },
        "presentment_money": {
          "amount": "3.17",
          "currency_code": "EUR"
        }
      },
      "source": "canada_post",
      "title": "Small Packet International Air",
      "tax_lines": [],
      "carrier_identifier": "third_party_carrier_identifier",
      "requested_fulfillment_service_id": "third_party_fulfillment_service_id"
    }
  ],
  "source_name": "instagram",
  "source_identifier": "ORDERID-123",
  "source_url": "{URL_to_order}",
  "subtotal_price": 398,
  "subtotal_price_set": {
    "shop_money": {
      "amount": "141.99",
      "currency_code": "CAD"
    },
    "presentment_money": {
      "amount": "90.95",
      "currency_code": "EUR"
    }
  },
  "tags": "imported, vip",
  "tax_lines": [
    {
      "price": 11.94,
      "rate": 0.06,
      "title": "State Tax",
      "channel_liable": true
    }
  ],
  "taxes_included": false,
  "test": true,
  "token": "b1946ac92492d2347c6235b4d2611184",
  "total_discounts": "0.00",
  "total_discounts_set": {
    "shop_money": {
      "amount": "0.00",
      "currency_code": "CAD"
    },
    "presentment_money": {
      "amount": "0.00",
      "currency_code": "EUR"
    }
  },
  "total_line_items_price": "398.00",
  "total_line_items_price_set": {
    "shop_money": {
      "amount": "141.99",
      "currency_code": "CAD"
    },
    "presentment_money": {
      "amount": "90.95",
      "currency_code": "EUR"
    }
  },
  "total_outstanding": "5.00",
  "total_price": "409.94",
  "total_price_set": {
    "shop_money": {
      "amount": "164.86",
      "currency_code": "CAD"
    },
    "presentment_money": {
      "amount": "105.31",
      "currency_code": "EUR"
    }
  },
  "total_shipping_price_set": {
    "shop_money": {
      "amount": "30.00",
      "currency_code": "USD"
    },
    "presentment_money": {
      "amount": "0.00",
      "currency_code": "USD"
    }
  },
  "total_tax": "11.94",
  "total_tax_set": {
    "shop_money": {
      "amount": "18.87",
      "currency_code": "CAD"
    },
    "presentment_money": {
      "amount": "11.82",
      "currency_code": "EUR"
    }
  },
  "total_tip_received": "4.87",
  "total_weight": 300,
  "updated_at": "2012-08-24T14:02:15-04:00",
  "user_id": 31522279,
  "order_status_url": {
    "order_status_url": "https://checkout.shopify.com/112233/checkouts/4207896aad57dfb159/thank_you_token?key=753621327b9e8a64789651bf221dfe35"
  }
}

Destination: MS SQL Relational DB

Containerized Database with Docker

In order to make a fully automated system, I needed to be able to transfer it between my systems without going through the installation/configuration process each time. For this, Docker makes it easy. I created a Dockerfile based on Microsoft’s SQL Server Docker image and layered on the python library and requirements. The SQL script to create the database and tables automatically runs when the container first starts. The user/password and other configuration options are set through environment variables in docker-compose.yml.

Dockerfile

FROM ubuntu:20.04 AS builder

ARG DEBIAN_FRONTEND=noninteractive
USER root

RUN apt-get update && \
    apt-get install --no-install-recommends -y \ 
    python3.9 python3.9-dev python3.9-venv \
    python3-pip python3-wheel build-essential unixodbc-dev \
    && apt-get clean

RUN python3.9 -m venv /root/venv
ENV PATH="/root/venv/bin:$PATH"

COPY requirements.txt .
RUN pip3 install --no-cache-dir wheel
RUN pip3 install --no-cache-dir -r requirements.txt

FROM mcr.microsoft.com/mssql/server:2019-latest as mssql

ENV DB_HOST=localhost
ENV ACCEPT_EULA=Y

EXPOSE 1433
USER root

RUN apt update && apt upgrade -y && \
    ACCEPT_EULA=Y apt install -y \
    curl cron unixodbc-dev msodbcsql18 \
    python3.9 python3.9-dev python3-pip python3.9-venv \
    && curl https://bootstrap.pypa.io/get-pip.py | python3 \
    && echo PATH="$PATH:/opt/mssql-tools/bin" >> ~/.bash_profile \
    && echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc \
    && apt-get clean

COPY --from=builder /root/venv /root/venv
ENV PYTHONBUFFERED=1
ENV VIRTUAL_ENV=/root/venv
ENV PATH="/root/venv/bin:$PATH"

COPY scripts /usr/scripts

RUN chmod +x /usr/scripts/entrypoint.sh && chmod +x /usr/scripts/configure-db.sh

WORKDIR /root/app

COPY setup.py .
COPY README.md .
COPY requirements.txt .
COPY src ./src

RUN pip3 install --no-cache-dir -r requirements.txt && python3 setup.py install

WORKDIR /root

ENTRYPOINT ["/usr/scripts/entrypoint.sh"]

The Dockerfile builds the Microsoft SQL image and installs the drivers, python 3.9 and the applications needed to run the python script and update the tables.  The set up and python scripts are copied from a local directory into the container and made executable. The virtual environment is transferred from a previous build stage and the remaining requirements are installed.

 The entrypoint.sh script is where the magic happens. The script runs an SQL script file that builds the database structure and tables, in addition to setting environment variables that can be used for configuring the script.

docker-compose.yml

services:
    sqlserver:
        build: .
        container_name: shop_sql
        hostname: shop_sql
        ports:
            - "1433:1433"
        environment:
            ACCEPT_EULA: "Y"
            SA_PASSWORD: "TheStrongestpassword1!"
            MSSQL_SA_PASSWORD: "TheStrongestpassword1!"
            SHOPIFY_DB_USER: "shop_user"
            SHOPIFY_DB_PASSWORD: "Strongerpassword1!"
            SHOPIFY_DB_NAME: "shop_rest"
            SHOPIFY_DB_SCHEMA: "dbo"
            STARTING_DATE: "2019-01-01"
            CONFIG_FILE: "/root/config.ini"
        volumes:
            - "./csv_export:/root/csv_export"
            - "./data:/var/opt/mssql"
            - "./config.ini:/root/config.ini"

In order to completely automate deployment, docker compose is used to set environment variables and configure the database logins. 

The MSSQL_SA_PASSWORD is used to set the administrative password to the SQL Server instance. The environment variables, SHOPIFY_DB_USER, PASSWORD, NAME & SCHEMA configure the name and login to the Shopify database. The Shopify user will only have access to the Shopify database. The STARTING_DATE environment variable is used to configure the starting date of the date dimension table that’s automatically built with the database. Note that the passwords set must meet the minimum SQL Server requirements with a lowercase letter, uppercase letter, number and symbol.

The script configuration is done through a config.ini file that can be mounted from the working directory or where defined in docker-compose.yml.  The script also pulls in any environment variables from the container set by docker-compose.yml

The data volume can be mounted locally for portability through the “./data:/var/opt/mssql” volume. Exporting to a CSV is also possible by mounting the csv export directory inside of the container to a local directory. In order to run this in Windows, the mssql data folder has to be broken out into “/var/opt/mssql/data”, “/var/opt/mssql/secrets”, & “/var/opt/mssql/log”. The correct volumes section of docker-compose is shown below.


        volumes:
            - "./mssql/data:/var/opt/mssql/data"
            - "./mssql/log:/var/opt/mssql/log"
            - "./mssql/secrets:/var/opt/mssql/secrets"
            - "./csv_export:/root/csv_export"
            - "./config.ini:/root/config.ini"

Configuring the Python Application

The python application can be completely configured through an ini-based file. An example of the configuration file is below. Any environment variables defined in docker-compose.yml are automatically pulled if not configured in this file.

The store_name option is the first part of your Shopify admin URL. For example MY-STORE.myshopify.com, even if your custom URL is different. The access_token is obtained by creating a private app in the settings of the Shopify Admin portal with the appropriate permissions.

config.ini

[shopify]
# Orders API Endpoint
store_name = STORE_NAME_IN_ADMIN_URL
api_version = 2022-07
orders_ep = orders.json
api_path = /admin/api/
customers_ep = customers.json
time_zone = America/New_York
# Shopify Access Token
access_token = ***ACCESS_TOKEN***

# Items Returned per API Call - 250 Max
items_per_page = 250

# Uncomment to get data between two dates
# Overrides past days of history
# start = 20210610
# end = 20210617

# Get past days of history
days = 30

[sql]
# Driver for pyodbc to use
driver = ODBC Driver 17 for SQL Server
# Database server & port - localhost for docker container
server = ***SERVER_ADDRESS***
port = 1433
database = shop_rest
schema = dbo
# Database user & password
db_user = sa
db_pass = ***DATABASE_PW***

[csv]
# Relative filepath of csv folder output
filepath = csv_export

Making It All Work

Now that the technical details are out of the way, let’s see if I succeeded in making a completely automated Shopify database that requires little to no user intervention. Ensure that you have docker and git installed before running this. You can find the Docker installation instructions here: https://docs.docker.com/get-docker/If you don’t have git and don’t want to install, you can download and extract a zip of the github repo here.

Edit docker-compose.yml located in the root directory of the repo to set environment variables.  Any text editor besides vim can be used.

$ cd ~ && git clone https://github.com/webdjoe/pyshopify.git
$ cd pyshopify
$ vim docker-compose.yml

Edit config.ini file to configure the script. The template is located in the root of the repo directory.

$ vim config.ini

Build the container with docker compose

$ sudo docker compose build

Run the container detached

$ sudo docker up -d

Give it sufficient time to start up, especially on the first run when it is building the database and tables. Test to see if SQL server is running and ready to accept connections.

$ sudo docker exec -it shop_sql bash \
    -c '/opt/mssql-tools/bin/sqlcmd -h -1 -t 1 -U sa \
    -P "$MSSQL_SA_PASSWORD" \
    -Q "SET NOCOUNT ON; Select SUM(state) from sys.databases"'

The output of the previous command should be just a ‘0’, meaning that SQL Server has started successfully and ready to accept connections. If anything else is returned, check the docker logs to see what might have gone wrong.

Ready to Start Pulling Data

If the database has started successfully, you are now ready to populate the database. Since Shopify uses a paginated API, it will take quite some time to pull in historical data if you have a high order volume. Pulling data is as simple as running “shopify_cli” from inside the container. 

$ sudo docker exec -it shop_sql bash shopify_cli --help
Usage: shopify_cli [OPTIONS]

  Run Shopify App CLI.

Options:
  -d, --days INTEGER     get days of history
  -b, --between TEXT...  get between 2 dates - yyyy-MM-dd, ex -b 2020-01-01
                         2020-01-02
  --sql-out / --no-sql   write to database
  --csv-out / --no-csv   Write results to csv files
  --csv-location TEXT    Relative location of csv export folder defaults to
                         csv_export/
  --config TEXT          Relative location of config.ini - defaults to
                         config.ini in currect directory
  --help                 Show this message and exit.

Run in container to pull last 30 days of data
$ sudo docker exec -it shop_sql bash shop_cli -d 30 --sql-out

Run in container to pull dates between Jan 1, 2022 and Jan 31, 2022
$ sudo docker exec -it shop_sql bash shop_cli -b 2022-01-01 2022-01-31

Once you see the help contents, run a relatively small timeframe to see it proceed to completion. The timeframe can be set with the -b/–between argument to get the data between two dates or -d/–days to get the last days of data. This will pull the last 5 days of data into the database. You should see an output line each time data is written to a table.

That’s It!

Bonus: Script to Add Cron Job

It wouldn’t be completely automated if I had to run a command every day. So I included a script that sets up a cron job to run every morning at 4AM local time. Install the cron job by running a bash script that is included inside the container. The first output will probably be ‘no crontab for root’, this is just testing to see if a cron job has already been set up for shopify_cli. You should see ‘Adding cron job’ next, meaning it’s been successfully added. Verify by running ‘crontab -l’ inside of the container

$ sudo docker exec -it shop_sql bash /usr/scripts/add-cron.sh

$ sudo docker exec -it shop_sql crontab -l

Conclusion

In a few simple steps, pyshopify creates a fully self contained relational database of Shopify data that can be updated regularly with minimal intervention and redeployed with little-to-no configuration. The real magic happens with the data inside, the next article will be on some useful queries and ways to integrate this Shopify data in Google Sheets, Excel, Power Query or any application that can connect to SQL Server.

Useful Queries

Once the data is pulled into the database, the fun begins. The following are examples of some common queries that can be used to gain insights from Shopify data. They are not fully optimized but they demonstrate the power of the database.