There comes a point in testing data products when you become sick of using synthetic data and whatever you can find lying around on Kaggle. If you hit that point and start looking around your apartment for something REAL, you may just spot a Nest thermostat. This is capable of generating real data that you can stream into FeatureBase. This repo will walk you through this process, but there are some prerequisites.
1. Purchase and set up a Nest Thermostat
2. Follow these instructions to set up access to your Nest data, which includes purchasing Google developer access ($5)
3. Have a FeatureBase Cloud account
4. Have ready access to the following:
All credit to the above blog for helping me get set up to access my Nest's data!
You need a Python environment with the packages listed in requirements.txt installed.
If you’re reading this section, you’ve completed the steps above and should feel accomplished already! You’ve bitten the bullet and paid Google for your own data, but you can now do whatever you want with it! First off, you need to get the device you want to pull data from. You can make a call to the smart device endpoint with your project to see all available devices:
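As a rough sketch of that device-listing call, the snippet below builds a GET request against the Smart Device Management API using only the standard library. The function names and the placeholder project ID and token are assumptions for illustration; obtaining and refreshing the OAuth access token is not shown.

```python
import json
import urllib.request

# Base URL of Google's Smart Device Management (SDM) API.
SDM_BASE = "https://smartdevicemanagement.googleapis.com/v1"


def build_list_devices_request(project_id: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) the request that lists devices for a project."""
    url = f"{SDM_BASE}/enterprises/{project_id}/devices"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="GET",
    )


def list_devices(project_id: str, access_token: str) -> list:
    """Send the request and return the list of devices (empty if none)."""
    req = build_list_devices_request(project_id, access_token)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp).get("devices", [])


if __name__ == "__main__":
    # Hypothetical placeholders: substitute your own project ID and token.
    for device in list_devices("YOUR_PROJECT_ID", "YOUR_ACCESS_TOKEN"):
        print(device.get("name"), device.get("type"))
```

Each device's `name` field is the full resource path you would use when querying that one device.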
This walkthrough and code assume you have only one thermostat device, but you can easily tweak the code to handle multiple devices if desired.
Once you have the correct device name, you can query it to get your device’s stats. This will return data that looks similar to the below payload:
Detailed information about each trait can be found on Google's docs.
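To make the trait structure more concrete, here is a minimal sketch that flattens a device response into a single flat record. The trait and field names follow Google's thermostat trait documentation, but the output column names and the sample values are made up for illustration and are not real readings.

```python
def flatten_traits(device: dict) -> dict:
    """Pick a few thermostat traits out of a device response.

    Missing traits come back as None rather than raising.
    """
    traits = device.get("traits", {})

    def trait(name: str, field: str):
        return traits.get(f"sdm.devices.traits.{name}", {}).get(field)

    # Output keys are hypothetical column names, not ones from the repo.
    return {
        "ambient_temp_celsius": trait("Temperature", "ambientTemperatureCelsius"),
        "humidity_pct": trait("Humidity", "ambientHumidityPercent"),
        "mode": trait("ThermostatMode", "mode"),
        "hvac_status": trait("ThermostatHvac", "status"),
        "heat_setpoint_celsius": trait("ThermostatTemperatureSetpoint", "heatCelsius"),
        "cool_setpoint_celsius": trait("ThermostatTemperatureSetpoint", "coolCelsius"),
    }


# Illustrative payload only: values are invented for the example.
sample_device = {
    "name": "enterprises/PROJECT_ID/devices/DEVICE_ID",
    "traits": {
        "sdm.devices.traits.Temperature": {"ambientTemperatureCelsius": 21.5},
        "sdm.devices.traits.Humidity": {"ambientHumidityPercent": 40},
        "sdm.devices.traits.ThermostatMode": {"mode": "HEAT"},
        "sdm.devices.traits.ThermostatHvac": {"status": "OFF"},
        "sdm.devices.traits.ThermostatTemperatureSetpoint": {"heatCelsius": 20.0},
    },
}

record = flatten_traits(sample_device)
```

Returning `None` for absent fields keeps the record shape stable; for example, the cool setpoint is absent while the thermostat is in heat mode.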
With a preview of the data, it’s time to model the data in FeatureBase. You can create a DDL statement with the traits of interest. Below is an example statement with familiar data types that elects the time of the thermostat reading as the table’s primary key (_id):
After inspecting a record, you realize how dependent your brain is on seeing Fahrenheit over Celsius. After going down an internal rabbit hole on why there is a metric system and an imperial system and questioning why the world can’t just get along, you decide that you want the ambient temperature to also show in Fahrenheit, so you add that column to the table:
You are now ready to load data. For FeatureBase you can use BULK INSERT statements, which allow you to stream JSON data into your table. BULK INSERT gives you the flexibility to send 1 to n records, but for the examples that follow, each record will be sent individually. BULK INSERT allows for light data manipulation in the TRANSFORM clause, so you implement the temperature conversion there. An example of sending one record can be seen below:
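For sanity-checking what the TRANSFORM clause produces, the same arithmetic can be sketched in Python alongside a helper that serializes one reading as a single JSON line. This is not the FeatureBase statement itself: the helper names, the `_id` timestamp format, and the one-object-per-line layout are assumptions for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def celsius_to_fahrenheit(celsius: float) -> float:
    """Standard conversion: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


def to_json_line(record: dict, read_at: Optional[datetime] = None) -> str:
    """Serialize one reading as a compact JSON line keyed by read time.

    The timestamp format used for `_id` is an assumption; match it to
    whatever your table definition expects.
    """
    read_at = read_at or datetime.now(timezone.utc)
    row = {"_id": read_at.strftime("%Y-%m-%dT%H:%M:%SZ"), **record}
    return json.dumps(row, separators=(",", ":"))


# Example with an invented reading of 20 degrees Celsius.
line = to_json_line(
    {"ambient_temp_celsius": 20.0},
    read_at=datetime(2023, 1, 1, tzinfo=timezone.utc),
)
print(line)
print(celsius_to_fahrenheit(20.0))  # 68.0
```

Doing the conversion in the database keeps the raw Celsius value as the source of truth; the Python version is only useful for spot-checking a loaded row.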
You are happy with the data model and are officially ready to start streaming data. You want to set and forget this and have data keep loading, so you need a couple of things: a connection to your Nest, a connection to FeatureBase, a way to refresh your connections, and a method to constantly pull and push data. Luckily the nestbase.py script does that all for you.
You are pretty smart, so you check the script out before running it and find that it runs continuously but polls Nest only once every 6 seconds, because Google limits device info requests to 10 queries per minute (QPM): https://developers.google.com/nest/device-access/project/limits . Lastly, you correctly call into question how the script handles the secrets for Nest and FeatureBase, and you modify it in accordance with your security needs. You run the script (using something like caffeinate on a Mac to keep the machine awake) and leave it be.
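The 6-second figure falls out of the rate limit: 60 seconds divided by 10 queries per minute. A minimal sketch of such a rate-limited loop is below; it is not the code in nestbase.py, and `fetch`, `push`, and the other parameter names are hypothetical stand-ins for the Nest read and the FeatureBase write.

```python
import time
from typing import Callable, Optional

QUERIES_PER_MINUTE = 10                      # Google's device info limit
POLL_INTERVAL_S = 60 / QUERIES_PER_MINUTE    # one poll every 6 seconds


def poll_forever(
    fetch: Callable[[], dict],
    push: Callable[[dict], None],
    interval_s: float = POLL_INTERVAL_S,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Fetch a reading and push it, at most once per interval.

    Runs until max_polls is reached (forever when max_polls is None).
    A failed cycle is logged and skipped so one bad request does not
    stop the stream.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        started = clock()
        try:
            push(fetch())
        except Exception as exc:  # keep streaming on transient errors
            print(f"poll failed, will retry next cycle: {exc}")
        polls += 1
        # Sleep only for what is left of the interval after the work.
        sleep(max(0.0, interval_s - (clock() - started)))
    return polls
```

Subtracting the time spent on the request from the sleep keeps the cadence close to one poll per interval without drifting slower over days of running.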
A couple days later you can return to play around with your recently loaded data. You have a couple of questions you want to answer:
docker run -p 10101:10101 featurebasedb/featurebase
git clone https://github.com/FeatureBaseDB/featurebase-examples.git
docker network create fbnet
docker compose up