This tutorial will show you how to manipulate and query data in an Azure SQL database using Python and Jupyter Notebooks.
In this tutorial, you will learn how to use Python and Jupyter Notebooks to insert a dataframe into an Azure SQL table. To complete the exercise, you will need to install:
- Python 3 (make sure to add Python to your PATH),
- Visual Studio Code,
- Microsoft ODBC Driver for SQL Server and the pyodbc library,
- Jupyter Notebook and Jupyter Extension for Visual Studio Code,
- the Python pandas package using
If you need help with installing these tools, follow the instructions in the tutorial Connect to Azure SQL Database using Python and Jupyter Notebook.
You can download the code from my GitHub repository .
Open Visual Studio Code and create a new Jupyter file named insert-data-sql.ipynb. In the first cell, add the following code to connect to your database:
Create a cursor object
Cursor objects can interact with the SQL server and execute operations such as SQL statements. To create a cursor, use the
cursor() method of the
To execute SQL statements you can use the
execute() method of the cursor object.
Create a table
The following example creates a new table named
movies. The items between the parentheses specify the details of each column. The table has 11 columns.
id column is used to uniquely identify a row (
PRIMARY KEY) and it is an integer (
IDENTITY(1,1) statement allows a unique number to be created automatically when a new record is inserted into the table.
Load dataframe from CSV file
Assume that we have the movies.csv file located in the same folder as the Python program. Use the
pandas.read_csv() function to create a dataframe and load the CSV file.
Transform columns and change data types
Before inserting data into the SQL table, it is important to make sure you are using the correct data types. Use the
df.dtypes command to see all the data types in the
df dataframe. We are going to change the data type of the
convert_runtime() function, which converts the
runtime (string) to integer.
Use the following code to apply the
convert_runtime() function to the
Then, use the following code to specify the genre of a movie.
convert_gross() function, which converts the
gross (string) to integer
and use the following code to apply the
convert_gross() function to the
To load data, use the
INSERT SQL statement.
INSERT statement, you specify a table and columns in an
INTO clause, and a list of values to be stored in these columns in a
Commit the changes
It is important to use the `commit()`` method after every transaction that modifies data.
You can read the data by using a
SELECT statement. The following code selects the
movie_title columns of the table and reads the data using the
Then you can use a simple
for loop to iterate over the rows and print the data.
In the next tutorial, you will use SQL statements in Jupyter Notebooks to manipulate and query data and run aggregate functions.