I was exploring data from a CSV, I used Jupyter notebook as I’m a bit familiar with it and it’s a great fit for this kind of use case.
It’s very easy to load data in a dataframe, however I don’t use pandas often enough and cannot remember all the functions.
I just wanted to write SQL queries from the dataframe itself.
import matplotlib.pyplot import pandas as pd from pandasql import sqldf df = pd.read_csv('data.csv') pysqldf = lambda q: sqldf(q, globals()) q = """SELECT SUBSTR(deployment_date, 1, 9) as date, SUBSTR(deployment_date, 1, 9) - SUBSTR(created_date, 1, 9) as ct FROM df where deploy_type='k8s' and SUBSTR(created_date, 1, 9) > '12/27/2021' """ df = pysqldf(q)
It’s a cool project, however I would not using it much for 2 reasons:
- Not maintained last commit was more than 4 years ago
- This is SQLite syntax which is sometimes limiting