Pysql
I was exploring data from a CSV, I used Jupyter notebook as I’m a bit familiar with it and it’s a great fit for this kind of use case.
It’s very easy to load data in a dataframe, however I don’t use pandas often enough and cannot remember all the functions.
I just wanted to write SQL queries from the dataframe itself.
import matplotlib.pyplot
import pandas as pd
from pandasql import sqldf
df = pd.read_csv('data.csv')
pysqldf = lambda q: sqldf(q, globals())
q = """SELECT
SUBSTR(deployment_date, 1, 9) as date, SUBSTR(deployment_date, 1, 9) - SUBSTR(created_date, 1, 9) as ct
FROM df
where deploy_type='k8s' and SUBSTR(created_date, 1, 9) > '12/27/2021'
"""
df = pysqldf(q)
It’s a cool project, however I would not using it much for 2 reasons:
- Not maintained last commit was more than 4 years ago
- This is SQLite syntax which is sometimes limiting