1.1 What is SQL?

Structured Query Language (SQL) is a standard computer language for relational database management and data manipulation. SQL is used often to query, insert, update and modify data. At a basic level, SQL is a way to communicate with database. Many SQL commands are descriptive words and easy to interpret compared to many other computer languages. This makes SQL an easy to understand and learn language.

Another important thing to know about SQL is that it is a non-procedural language. That means you won’t be able to write complete applications with it. This makes SQL relatively simple but also very powerful language to interact with data. SQL is all about data and it is used for three things:

  • read and retrieve data from database
  • write data into database
  • update and insert new data

Different SQL users

There are a lot of jobs requre SQL and it is not just for data science. It is important to understand how different roles might use SQL in their jobs. The users can be data scientist, programmers, backend developer, QA engineers, data architects, system engineers and database administrators (DBA). I want to talk a little more about how DBA use SQL comparing to data scientist.

  • A DBA is responsible for managing the entire database and guarding it.
  • A data scientist, on the other hand, is typically a user of that database.

The DBA will be responsible for giving permissions to people and determining who has access to what data. They are often responsible for managing the tables and creating them. Data scientist need to get the rights from DBA to create his/her own table and insert data into them. The two roles are similar in that they both use SQL to interact with the data. But the main difference is that the data scientist is really the end user. Whereas the DBA is the one who administers it, governs it and manages the database, as a whole.

Database Admin Data Scientist
Manages/governs entire database End user of a database
Gives permissions to users Retrieve data (mainly)
Determines access to data May create their own table or test environment
Manages and creates tables Combine multiple sources together
Uses SQL to query and retrive data Writes complex queries for analysis (maybe but usually not)

SQL and Database Management System(DBMS)

Despite SQL being standardized since 1986, a lot of different implementations exist. They deviate more or less from each other. You can think of SQL as the interpreter between you and the database. How you write some of the syntax for SQL depends on the relational database management system you are using. Here are some of the popular ones:

  • SQL Server
  • IBM DB2 Oracle
  • Sybase ASE
  • PostgreSQL
  • MySQL
  • Microsoft SQL Server
  • Apache Open Office Base
  • SQLite

In this text, we’ll use SQLite. It’s important to understand that if you copy code from this text into another application at work, it may not work correctly. You should check the type of DBMS you’re using and see if that makes a difference. We will talk about this more when we get to the syntax later including some of the ways to figure out what those differences might be.

Summary points:

  • How you write syntax will depend on what DBMS you are using
  • Each DBMS has its own “dialect”
  • SQL can translate
  • You will tweak based on the “dialect” your DMBS speaks