Quickstart
==========

Hi, welcome to the twelve-minute quick-start tutorial.

Connecting to a database
------------------------

First you need to import the dataset package :) ::

    import dataset

To connect to a database you need to identify it by its URL, which is
basically a string of the form ``"dialect://user:password@host/dbname"``.
Here are a few examples for different database backends::

    # connecting to a SQLite database
    db = dataset.connect('sqlite:///mydatabase.db')

    # connecting to a MySQL database with user and password
    db = dataset.connect('mysql://user:password@localhost/mydatabase')

    # connecting to a PostgreSQL database
    db = dataset.connect('postgresql://scott:tiger@localhost:5432/mydatabase')

It is also possible to define the URL in an environment variable called
`DATABASE_URL`, so you can initialize the database connection without
explicitly passing a URL::

    db = dataset.connect()

Depending on which database you're using, you may also have to install the
database bindings to support that database. SQLite is included in the Python
core, but PostgreSQL requires ``psycopg2`` to be installed. MySQL can be
enabled by installing the ``MySQLdb`` driver.

Storing data
------------

To store some data you need to get a reference to a table. You don't need to
worry about whether the table already exists or not, since dataset will
create it automatically::

    # get a reference to the table 'user'
    table = db['user']

Now storing data in a table is a matter of a single function call. Just pass
a `dict`_ to *insert*. Note that you don't need to create the columns *name*
and *age* – dataset will do this automatically::

    # Insert a new record.
    table.insert(dict(name='John Doe', age=46, country='China'))

    # dataset will create "missing" columns any time you insert a dict with
    # an unknown key
    table.insert(dict(name='Jane Doe', age=37, country='France', gender='female'))

.. _dict: http://docs.python.org/2/library/stdtypes.html#dict

Updating existing entries is easy, too::

    table.update(dict(name='John Doe', age=47), ['name'])

The list of filter columns given as the second argument tells *update* which
rows to match: here, rows are looked up by the value of ``name`` in the dict,
and the remaining keys (``age``) are updated. If you don't want to match on a
particular value, just use the auto-generated ``id`` column.

Using Transactions
------------------

You can group a set of database updates in a transaction. In that case, all
updates are committed at once or, in the case of an exception, all of them
are reverted. Transactions are supported through a context manager, so they
can be used in a ``with`` statement::

    with dataset.connect() as tx:
        tx['user'].insert(dict(name='John Doe', age=46, country='China'))

You can get the same functionality by invoking the methods
:py:meth:`begin() <dataset.Database.begin>`,
:py:meth:`commit() <dataset.Database.commit>` and
:py:meth:`rollback() <dataset.Database.rollback>` explicitly::

    db = dataset.connect()
    db.begin()
    try:
        db['user'].insert(dict(name='John Doe', age=46, country='China'))
        db.commit()
    except:
        db.rollback()

Nested transactions are supported too::

    db = dataset.connect()
    with db as tx1:
        tx1['user'].insert(dict(name='John Doe', age=46, country='China'))
        with db as tx2:
            tx2['user'].insert(dict(name='Jane Doe', age=37, country='France', gender='female'))
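The rollback behaviour is easy to verify for yourself. Below is a minimal
sketch, assuming a throwaway in-memory SQLite database; the ``RuntimeError``
is just a stand-in for whatever failure might abort your transaction::

    import dataset

    db = dataset.connect('sqlite:///:memory:')

    # a block that exits cleanly is committed as a whole
    with db as tx:
        tx['user'].insert(dict(name='John Doe', age=46, country='China'))

    # a block that raises an exception is rolled back as a whole
    try:
        with db as tx:
            tx['user'].insert(dict(name='Jane Doe', age=37, country='France'))
            raise RuntimeError('something went wrong')
    except RuntimeError:
        pass  # the insert of Jane Doe has been reverted

    print(len(db['user']))  # 1 -- only John Doe was stored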
Inspecting databases and tables
-------------------------------

When dealing with unknown databases we might want to check their structure
first. To start exploring, let's find out what tables are stored in the
database:

    >>> print(db.tables)
    [u'user']

Now, let's list all columns available in the table ``user``:

    >>> print(db['user'].columns)
    [u'id', u'country', u'age', u'name', u'gender']

Using ``len()`` we can get the total number of rows in a table:

    >>> print(len(db['user']))
    2

Reading data from tables
------------------------

Now let's get some real data out of the table::

    users = db['user'].all()

If we simply want to iterate over all rows in a table, we can omit
:py:meth:`all() <dataset.Table.all>`::

    for user in db['user']:
        print(user['age'])

We can search for specific entries using
:py:meth:`find() <dataset.Table.find>` and
:py:meth:`find_one() <dataset.Table.find_one>`::

    # All users from China
    chinese_users = table.find(country='China')

    # Get a specific user
    john = table.find_one(name='John Doe')

Using :py:meth:`distinct() <dataset.Table.distinct>` we can grab a set of
rows with unique values in one or more columns::

    # Get one user per country
    db['user'].distinct('country')

Running custom SQL queries
--------------------------

Of course the main reason you're using a database is that you want to use
the full power of SQL queries. Here's how you run them with ``dataset``::

    result = db.query('SELECT country, COUNT(*) c FROM user GROUP BY country')
    for row in result:
        print(row['country'], row['c'])

The :py:meth:`query() <dataset.Database.query>` method can also be used to
access the underlying SQLAlchemy Core API, which allows for the programmatic
construction of more complex queries::

    table = db['user'].table
    statement = table.select(table.c.name.like('%John%'))
    result = db.query(statement)

Exporting data
--------------

While playing around with our database in Python is a nice thing, the data
is often just a processing stage until we go on to use it in another place,
say in an interactive web application. To make this seamless, ``dataset``
supports serializing rows of data into static JSON and CSV files using the
:py:meth:`freeze() <dataset.freeze>` function::

    # export all users into a single JSON
    result = db['user'].all()
    dataset.freeze(result, format='json', filename='users.json')

You can create one file per row by setting ``mode`` to "item"::

    # export one JSON file per user
    dataset.freeze(result, format='json', filename='users/{{ id }}.json', mode='item')

Since this is a common operation, we made it available via the command-line
utility ``datafreeze``. Read more about the
:doc:`freezefile markup <freezefile>`.

.. code-block:: bash

    $ datafreeze freezefile.yaml
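From Python, :py:meth:`freeze() <dataset.freeze>` works just as well with
CSV output and with filtered result sets. As a closing sketch (the filter
and filename are illustrative)::

    # export only the users from China, this time as CSV
    chinese_users = db['user'].find(country='China')
    dataset.freeze(chinese_users, format='csv', filename='chinese_users.csv')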