py_dataset is the Python version of dataset. dataset is a way of storing and organizing JSON documents on disc. To use py_dataset we usually import it using Python's "from" syntax.
from py_dataset import dataset
This provides a dataset object for working with dataset collections.
This is an example of creating a dataset collection called friends.ds, saving a record under the key "littlefreda", and reading it back.
import sys
import json
from py_dataset import dataset
# Creating our friends.ds dataset collection for the first time.
c_name = 'friends.ds'
if not dataset.init(c_name, dsn = ""):
    print(dataset.error_message())
    sys.exit(1)
# Now let's add something to our collection.
key = 'littlefreda'
record = {"name":"Freda","email":"little.freda@inverness.example.org"}
if not dataset.create(c_name, key, record):
    print(dataset.error_message())
    sys.exit(1)
# We should have at least one record in our collection.
# This is the idiom for iterating and working with our collection
# objects.
keys = dataset.keys(c_name)
for key in keys:
    p = dataset.path(c_name, key)
    print(p)
    # NOTE: the "read" method returns a tuple!
    record, err = dataset.read(c_name, key)
    if err != '':
        print(f"read error, {err}")
        sys.exit(1)
    print(f"Doc: {record}")
The functions dataset.init(c_name, dsn = ""), dataset.create(c_name, key, record), dataset.keys(c_name), and dataset.read(c_name, key) are the main actors here. Most dataset methods require the collection name as the first parameter. Likewise, many return some sort of value. If it is a boolean value, then True means success and False means failure. If the method returns data, it is often returned as a tuple, as with read().
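The tuple convention can be wrapped so that callers get either a record or an exception. The following is a minimal sketch and not part of py_dataset: `read_or_raise` and `DatasetReadError` are hypothetical names, and the `ds` parameter is assumed to be the dataset object imported earlier, with read() returning a (record, err) pair where err is an empty string on success, as in the example above.

```python
class DatasetReadError(Exception):
    """Raised when read() reports a non-empty error string (hypothetical type)."""


def read_or_raise(ds, c_name, key):
    """Unpack the (record, err) tuple from ds.read() and raise on error.

    Assumes ds.read(c_name, key) returns (record, err) with err == ''
    on success, matching the example above.
    """
    record, err = ds.read(c_name, key)
    if err != '':
        raise DatasetReadError(f"read error for {key!r} in {c_name!r}: {err}")
    return record
```

With the earlier import in place, a call would look like `record = read_or_raise(dataset, c_name, 'littlefreda')`.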
If an error has occurred (e.g. a permissions problem on disc), you can retrieve the dataset error message with the error_message() function. Once you are done with the error, you can use error_clear() to reset the error message queue.
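One way to combine the two calls is to fetch the message on failure and clear it straight away, so a stale message is not reported for a later call. This is a sketch under assumptions, not the library's own pattern: `run_checked` is a hypothetical helper, `ds` is assumed to be the dataset object, and `action` is assumed to be one of its functions that returns True or False.

```python
def run_checked(ds, action, *args):
    """Call a boolean-returning dataset function and collect its error.

    Returns (ok, message). On failure the message is taken from
    ds.error_message(), then ds.error_clear() resets it.
    """
    ok = action(*args)
    if ok:
        return True, ''
    message = ds.error_message()
    ds.error_clear()
    return False, message
```

For example, `ok, msg = run_checked(dataset, dataset.create, c_name, key, record)` leaves the caller to decide whether to print `msg` or exit.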
Now check to see if the key, littlefreda, is in the collection:
dataset.has_key(c_name, 'littlefreda')
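A common use of this check is to guard a create so that an existing record is left alone. The sketch below is illustrative only: `create_if_missing` is a hypothetical helper, `ds` is assumed to be the dataset object, and has_key() is assumed to return a boolean.

```python
def create_if_missing(ds, c_name, key, record):
    """Create a record only when the key is not already in the collection.

    Returns 'exists' if the key was already present, 'created' if
    create() succeeded, and 'failed' if create() returned False.
    """
    if ds.has_key(c_name, key):
        return 'exists'
    if ds.create(c_name, key, record):
        return 'created'
    return 'failed'
```

Returning three distinct strings keeps "already there" separate from "create failed", which a single boolean would blur.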
You can also read your JSON-formatted data from a file, but you need to convert it to a Python dict first. In these examples we create records for Mojo Sam and Capt. Jack, then read back all the keys and display their paths and the JSON documents created.
# Open the files as UTF-8 text; json.loads() turns the text into a dict.
with open("mojosam.json", encoding="utf-8") as f:
    src = f.read()
    dataset.create(c_name, "mojosam", json.loads(src))
with open("capt-jack.json", encoding="utf-8") as f:
    src = f.read()
    dataset.create(c_name, "capt-jack", json.loads(src))
for key in dataset.keys(c_name):
    print(f"Path: {dataset.path(c_name, key)}")
    # read() returns a (record, err) tuple, so unpack it before printing.
    record, err = dataset.read(c_name, key)
    print(f"Doc: {record}")
    print("")
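The file-to-dict conversion can be factored into a small function that uses only Python's standard json module. This is a sketch: `load_json_document` is a hypothetical name, and the check for a JSON object is an assumption based on the examples above, which all store dicts as records.

```python
import json


def load_json_document(path):
    """Read a UTF-8 JSON file and return its content as a Python dict.

    Raises ValueError if the file holds valid JSON that is not an
    object (for example a list), since a dict is expected here.
    """
    with open(path, encoding="utf-8") as f:
        doc = json.load(f)
    if not isinstance(doc, dict):
        raise ValueError(f"{path} does not contain a JSON object")
    return doc
```

The result can then be passed as the record argument, e.g. `dataset.create(c_name, "mojosam", load_json_document("mojosam.json"))`.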
NOTE: In v2 of dataset there is no internal mechanism for filtering and sorting keys. If you need that, you should create a data frame, read the data frame out, and manipulate it. Internal sorting and filtering just proved too slow.
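For small collections, filtering and sorting can also be done in plain Python once the records have been read. Note that this is not the data frame approach recommended above; it is a swapped-in, standard-library sketch. `select_records` is a hypothetical helper, and its input is assumed to be (key, record) pairs that you have already collected, for example by looping over dataset.keys(c_name) and calling dataset.read().

```python
def select_records(records, field, contains, sort_by):
    """Filter (key, record) pairs by substring match, then sort them.

    records  -- iterable of (key, dict) pairs already read from a collection
    field    -- name of the field to test
    contains -- substring that the field's value must include
    sort_by  -- name of the field to sort on (missing values sort first)
    """
    matched = [
        (key, rec) for key, rec in records
        if contains in str(rec.get(field, ""))
    ]
    return sorted(matched, key=lambda pair: str(pair[1].get(sort_by, "")))
```

Because every record is held in memory, this only suits collections small enough to read in full.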