.. dataset documentation master file, created by
   sphinx-quickstart on Mon Apr 1 18:41:21 2013.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

dataset: databases for lazy people
==================================

.. toctree::
   :hidden:

Although managing data in a relational database has plenty of benefits, such databases are rarely used in day-to-day work with small to medium-sized datasets. But why is that? Why do we see an awful lot of data stored in static files in CSV or JSON format, even though they are hard
to query and update incrementally?

The answer is that **programmers are lazy**, and thus they tend to prefer the easiest solution they find. And in **Python**, a database isn't the simplest solution for storing a bunch of structured data. This is what **dataset** is going to change!

In short, dataset combines the straightforwardness of JSON files or a NoSQL store with the full power and flexibility of relational databases.

::

   import dataset

   # Connect to an in-memory SQLite database
   db = dataset.connect('sqlite:///:memory:')
   # Get a table; it is created automatically if it does not exist
   table = db['sometable']
   table.insert(dict(name='John Doe', age=37))
   # The new 'gender' column is added automatically
   table.insert(dict(name='Jane Doe', age=34, gender='female'))
   john = table.find_one(name='John Doe')

Here is `similar code, without dataset <https://gist.github.com/gka/5296492>`_.

Features
--------

* **Automatic schema**: If a table or column is written that does not
  exist in the database, it will be created automatically.
* **Upserts**: Records are either created or updated, depending on
  whether an existing version can be found.
* **Query helpers** for simple queries such as :py:meth:`all <dataset.Table.all>` rows in a table or
  all :py:meth:`distinct <dataset.Table.distinct>` values across a set of columns.
* **Compatibility**: Being built on top of `SQLAlchemy <http://www.sqlalchemy.org/>`_, ``dataset`` works with all major databases, such as SQLite, PostgreSQL and MySQL.
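
The features above can be sketched in a few lines. This is only an illustration: the table name ``people`` and the sample rows are invented here, and the API documentation remains the reference for exact signatures::

   import dataset

   # In-memory SQLite keeps the sketch self-contained.
   db = dataset.connect('sqlite:///:memory:')
   table = db['people']  # hypothetical table name, chosen for this example

   # Automatic schema: the table and its columns are created on insert,
   # and the second insert adds a 'gender' column.
   table.insert(dict(name='John Doe', age=37))
   table.insert(dict(name='Jane Doe', age=34, gender='female'))

   # Upsert: 'name' is the key, so John's row is updated, not duplicated.
   table.upsert(dict(name='John Doe', age=38), ['name'])

   # Query helpers: iterate over all rows, or over distinct column values.
   names = sorted(row['name'] for row in table.all())
   genders = [row['gender'] for row in table.distinct('gender')]

Because the upsert matches on ``name``, the table should still hold two rows afterwards, with John's ``age`` updated in place.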

Contents
--------

.. toctree::
   :maxdepth: 2

   quickstart
   api

Contributors
------------

``dataset`` is written and maintained by `Friedrich Lindenberg <https://github.com/pudo>`_ and `Gregor Aisch <https://github.com/gka>`_. Its code is largely based on the preceding libraries `sqlaload <https://github.com/okfn/sqlaload>`_ and `datafreeze <https://github.com/spiegelonline/datafreeze>`_. And of course, we're standing on the `shoulders of giants <http://www.sqlalchemy.org/>`_.

Our cute little `naked mole rat <http://www.youtube.com/watch?feature=player_detailpage&v=A5DcOEzW1wA#t=14s>`_ was drawn by `Johannes Koch <http://chechuchape.com/>`_.